Structural insight into African swine fever virus I73R protein reveals it as a Z-DNA binding protein
Lifang Sun, Yurun Miao and Zhenzhong Wang contributed equally to this work.
Abstract
African swine fever (ASF) is a highly contagious viral haemorrhagic disease of swine that causes enormous economic losses in the swine industry. However, vaccines and drugs to treat ASF have yet to be developed. African swine fever virus (ASFV) encodes more than 150 proteins, but 50% of them have unknown functions. Here, we present the crystal structure of the ASFV I73R protein at a resolution of 2.0 Å. Similarity searches based solely on amino acid sequence show that it has no relationship to any protein of known function. Interestingly, the I73R protein adopts a winged helix-turn-helix fold and is structurally similar to the Z-DNA binding domain (Zα). In accordance with this result, I73R bound a CpG-repeat DNA duplex, which has a high propensity for forming Z-DNA, in DNA binding assays. In addition, the I73R protein was shown to be expressed at both early and late stages of ASFV infection in PAM cells as an 8.9 kDa protein. Immunofluorescence studies revealed that the I73R protein is located in the nucleus at early times post-infection and is gradually translocated from the nucleus to the cytoplasm. Taken together, these data indicate that I73R could be a member of the Zα family that is important in host–pathogen interaction, which paves the way for the design of inhibitors to target this severe pathogen. Further exploration of the biological role of I73R during ASFV infection in vitro and in vivo will provide new clues for the development of new antiviral strategies.
Abbreviations
- ASF: African swine fever
- ASFV: African swine fever virus
- BSA: bovine serum albumin
- DAPI: 4′,6-diamidino-2-phenylindole
- DMEM: Dulbecco's modified Eagle medium
- DSS: disuccinimidyl suberate
- EMSA: electrophoretic mobility shift assay
- FBS: foetal bovine serum
- IPTG: isopropyl-β-D-thiogalactopyranoside
- PAM: pulmonary alveolar macrophage
- PDB: Protein Data Bank
- Se-Met: selenomethionine
- SSRF: Shanghai Synchrotron Radiation Facility
- Zα: Z-DNA binding domain
- β-ME: β-mercaptoethanol
1 INTRODUCTION
African swine fever (ASF), a highly contagious and fatal viral disease affecting both domestic pigs and wild boars of all ages, is the most severe re-emerging disease threatening the swine industry (Brookes et al., 2021; Galindo & Alonso, 2017; Sanchez-Cordon et al., 2019). ASF was first discovered in Kenya in eastern Africa in 1921 and has spread to Europe, America and Asia in recent decades (Costard et al., 2009; Sanchez-Cordon et al., 2018; Sun et al., 2021). To date, there are no effective vaccines or drugs for the control of the disease (Coelho & Leitao, 2020).
ASFV belongs to the nucleocytoplasmic large DNA viruses (NCLDVs) and is the sole member of the family Asfarviridae (Alonso et al., 2018; Hernaez et al., 2016; Yutin & Koonin, 2012). ASFV particles have an average diameter of 250 nm and consist of five layers: a central nucleoid containing the double-stranded DNA genome, a core shell, an inner lipid envelope, an icosahedral capsid and an outer lipid envelope (Andres et al., 2020; Liu et al., 2019; Wang et al., 2019). The genome varies from 170 to 190 kbp depending on the virus strain and contains more than 151 open reading frames (ORFs), which encode structural proteins for viral particle assembly and non-structural proteins for genome replication and evasion of host defences, among others (Dixon et al., 2013; Olesen et al., 2018; Reis et al., 2017; Rodriguez & Salas, 2013).
ASFV encodes a 72-amino acid protein, I73R, with an unknown function (Cackett et al., 2020). NCBI database searches for the ASFV I73R revealed no homology with any known or other functionally identified protein sequence. Considering its high levels of expression, it is likely important throughout infection, which makes it an interesting candidate as a potential drug or vaccine target.
In the present study, we determined the crystal structure of ASFV I73R. We observed that I73R adopts a classical α/β architecture in which three α-helices form the core of the domain and helices 2 and 3 form the helix-turn-helix unit. A PDB search using the Dali server indicated that I73R shares structural similarity with Z-DNA binding domains (Zα), which are specific for the left-handed conformation of nucleic acid duplexes including DNA, DNA/RNA hybrid and RNA (Ha et al., 2008; Lee et al., 2016; Schade et al., 1999a, 1999b; Schwartz et al., 1999). We demonstrated that I73R has Z-DNA binding activity using a gel mobility shift assay and gel filtration chromatography. Together, the structural comparison and the Z-DNA binding assays suggest that I73R could be, structurally and biochemically, a member of the Zα family. In addition, we found that I73R could form a stable homodimeric/oligomeric state and that it could be translocated from the nucleus to the cytoplasm during ASFV infection. These data suggest a functional behaviour similar to that of Zα domain-containing proteins (Ng et al., 2013).
2 METHODS
2.1 Protein expression and purification
The gene coding for I73R from ASFV was cloned into the EcoR I and Hind III restriction sites of a modified pET32a vector (Novagen) carrying an N-terminal 6×His tag followed by a TEV protease cleavage site for removal of the thioredoxin tag. Protein expression was induced in E. coli strain BL21 (DE3) cells by the addition of 0.3 mM isopropyl-β-D-thiogalactopyranoside (IPTG) for 12–16 h at 16°C. Bacterial cells were harvested by centrifugation and lysed by ultrasonication in a lysis buffer containing 25 mM Tris-HCl (pH 7.0), 500 mM NaCl, 5% glycerol, 20 mM imidazole, 0.1% Tween 20 and 2 mM β-mercaptoethanol (β-ME). Overexpressed I73R-Trx-His6 was purified on Ni-NTA affinity resin (GE Healthcare), followed by overnight TEV protease cleavage at 4°C to remove the fused thioredoxin tag and N-terminal histidine tag. Protease and uncleaved protein were removed by a second binding to Ni-NTA affinity resin. The eluted protein was further purified by size-exclusion chromatography (Superdex™ 75 Increase 10/300, GE Healthcare) pre-equilibrated with 25 mM Hepes (pH 7.5), 150 mM NaCl and 5% glycerol.
To produce the selenomethionine (Se-Met, Sigma Aldrich) derivative of I73R, Ile46 was mutated to Met to increase the methionine content. The Se-Met labelling mutant was grown in M9 minimal medium supplemented with 0.4% (w/v) glucose, 0.1 M MgSO4 and 0.01 M CaCl2. Approximately 30 min before induction with 0.3 mM IPTG, Se-Met (50 mg/L) together with Leu, Ile, Val, Phe, Lys and Thr was added to the medium to inhibit the E. coli methionine pathway and force the incorporation of Se-Met. The Se-Met derivative of I73R was purified in the same way as the native protein, with a stronger reducing agent in the buffer (5 mM DTT) to prevent oxidation. Purified protein fractions were collected and concentrated to a final concentration of 10 mg/ml for crystallization.
2.2 Crystallization
Crystals of I73R and its Se-Met derivative were obtained using the sitting drop method at 16°C. The initial screens were carried out using commercially available crystallization screen kits, including the JCSG Core I-IV and Classics suites from Qiagen and HR2-110, HR2-112 and HR2-144 from Hampton Research. The initial hits were further optimized to obtain diffraction-quality crystals. The best crystals of I73R were obtained with a reservoir solution of 0.2 M Li2SO4, 0.1 M Bis-Tris (pH 5.5) and 25% (w/v) PEG3350. The Se-Met derivative crystals were also grown in this condition. Well-shaped crystals were picked up in a CryoLoop and flash-frozen in liquid nitrogen in their reservoir solution supplemented with 20% PEG400 as a cryoprotectant.
2.3 Data collection and structure determination
The Se-Met SAD data and the native data were collected at beamline BL17U of the Shanghai Synchrotron Radiation Facility (SSRF, China) (Wang et al., 2016). A total of 360° of data were collected in 1° increments and processed with XDS (Diederichs, 2016) in space group R3 (hexagonal setting, H3) with unit cell parameters a = 89.36 Å, b = 89.36 Å, c = 46.16 Å, α = β = 90.0°, γ = 120.0°. Further processing was carried out using programs from the HKL2000 package and the CCP4 suite (Winn et al., 2011). Phasing was achieved by the SAD method using the Se-Met derivative data. SHELXD was used to locate the Se sites, and an initial model was built using the AutoSol program of the Phenix suite (Adams et al., 2010). The model was manually rebuilt with Coot (Emsley & Cowtan, 2004) and further refined in Phenix. The final refined model has an Rwork/Rfree of 18.52%/22.28% (see Table 1 for statistics). The atomic coordinates and structure factors for I73R have been deposited in the Protein Data Bank (PDB) under the accession code 7VIV. Structure superimpositions were performed with CCP4 LSQ superposition (Winn et al., 2011). Figures were produced with the PyMOL program (DeLano Scientific LLC). Multiple sequence alignments were generated with ClustalW and ESPript. Interface surface area was calculated with the PISA server.
Data collection | Native | Se-Met derivative (I46M) |
---|---|---|
Space group | H3 | H3 |
Cell dimensions | | |
a, b, c (Å) | 89.36, 89.36, 46.16 | 91.90, 91.90, 47.62 |
α, β, γ (°) | 90.0, 90.0, 120.0 | 90.0, 90.0, 120.0 |
Resolution (Å) | 44.68–1.96 (2.01–1.96) | 45.68–2.70 (2.77–2.70) |
Rmerge (%) | 4.6 (87.5) | 9.4 (67.2) |
I/σI | 23.0 (2.5) | 15.2 (2.6) |
CC1/2 | 1.000 (0.811) | 0.998 (0.931) |
Completeness (%) | 99.9 (99.9) | 99.5 (99.7) |
Multiplicity | 10.1 (9.5) | 10.0 (10.1) |
Refinement | | |
Resolution (Å) | 44.68–2.0 | |
No. reflections | 9266 | |
Rwork/Rfree (%) | 18.52/22.28 | |
No. atoms | 1280 | |
Water | 64 | |
R.m.s.d bonds (Å) | 0.010 | |
R.m.s.d angles (°) | 1.266 | |
Ramachandran plot | | |
Favoured (%) | 96.58 | |
Allowed (%) | 3.42 | |
Outliers (%) | 0.00 | |
Rotamer outliers (%) | 0.00 | |
Wilson B-factor (Å²) | 47.1 | |
- Note: Numbers in parentheses refer to the highest-resolution shell.
2.4 DNA binding assays
The ability of the purified protein to bind DNA was evaluated by electrophoretic mobility shift assay (EMSA), according to the protocol of the LightShift Chemiluminescent EMSA kit (Thermo Scientific), and by size-exclusion chromatography. The d(CGCGCG)2 and d(CACGTG)2 duplex oligonucleotides used for EMSA each had 6 nt plus a 5′ T overhang and were biotin-labelled at the 5′ end. Controls were included in the assay to ensure that the system was working properly. Mixtures of protein with 14 μM DNA at different molar ratios (0/1, 2/2, 4/2, 4/1 and 1/1) were incubated at 37°C for 20 min. The reactions were then subjected to electrophoresis on 6% native polyacrylamide gels with 0.5× TBE buffer and transferred to a nylon membrane. The biotin end-labelled DNA was detected using the Streptavidin-Horseradish Peroxidase Conjugate and the Chemiluminescent Substrate (Thermo Scientific). For the competition experiments, a 50-fold molar excess of unlabelled probe was added to the reactions and incubated for 20 min before the labelled probes were added.
To confirm the interaction, the purified I73R protein was incubated with either the dT(CGCGCG)2 duplex DNA (DNA1) or the dT(CACGTG)2 duplex DNA (DNA2) at a molar ratio of 1. After incubation for 1 h, the mixtures were analysed on an S200 pre-packed column (GE Healthcare) pre-equilibrated with 25 mM Hepes (pH 7.5) and 150 mM NaCl.
Furthermore, the residues of I73R corresponding to the conserved triplet of Zα domains (N44, Y48 and W68) were individually mutated to Ala, and the ability of the variants to bind DNA was evaluated by EMSA according to the above-mentioned protocol. In the EMSA, 56 μM wild-type I73R protein or the N44A, Y48A or W68A variant was incubated with 28 μM biotin end-labelled d(CGCGCG)2 DNA at 37°C for 20 min.
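The two probe sequences can be compared with a short script. The sketch below is illustrative only and is not part of the assay protocol: it checks that each 6-nt core (the 5′ T overhang is left out) is self-complementary, as the d(...)2 notation implies, and counts CpG steps as a crude proxy for the CG-repeat content that distinguishes the two probes. All function names are hypothetical.

```python
# Illustrative comparison of the two 6-nt probe cores (5' T overhang omitted).
# CpG counting is only a rough proxy for CG-repeat content; it is not a
# quantitative predictor of Z-DNA formation.

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
PURINES = {"A", "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_self_complementary(seq):
    """True if two copies of the strand can pair into a duplex."""
    return seq == reverse_complement(seq)


def count_cpg_steps(seq):
    """Count 5'-CG-3' dinucleotide steps along the strand."""
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")


def alternates_pyrimidine_purine(seq):
    """True if every neighbouring pair switches between pyrimidine and purine."""
    is_purine = [base in PURINES for base in seq]
    return all(a != b for a, b in zip(is_purine, is_purine[1:]))


for name, core in (("DNA1", "CGCGCG"), ("DNA2", "CACGTG")):
    print(
        name,
        core,
        "self-complementary:", is_self_complementary(core),
        "CpG steps:", count_cpg_steps(core),
        "alternating Py/Pu:", alternates_pyrimidine_purine(core),
    )
```

Both cores are self-complementary and alternate pyrimidine and purine; they differ in CpG content, with three CpG steps in the DNA1 core and one in the DNA2 core.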
2.5 Size exclusion chromatography characterization
We used size exclusion chromatography to characterize the oligomerization state of the protein in solution. An S75 pre-packed column (GE Healthcare) was equilibrated with 25 mM Hepes (pH 7.5) and 150 mM NaCl and calibrated using protein standards (molecular mass: 6500–67,000 Da; Sigma-Aldrich). The purified protein (500 μl) was loaded on the column, and the elution profile was compared with that of the standards. It showed a single peak corresponding to a molecular mass of 8.9 kDa, in good agreement with the predicted molecular mass of the monomer. A similar procedure using the buffer used for crystallization revealed an additional peak at the expected molecular mass of the dimer.
2.6 Crosslinking assays
Purified protein was concentrated to 2 mg/ml in 25 mM Hepes (pH 7.5), 150 mM NaCl buffer. About 2 mg of crosslinker disuccinimidyl suberate (DSS, Thermo, Shanghai, China) was dissolved in 108 μl DMSO to a final concentration of 50 mM. Between 2- and 20-fold molar excess of the crosslinker was added to the protein sample according to the manufacturer's protocol. The reaction mixture was incubated for 30 min at room temperature and the reaction was quenched by addition of 50 mM Tris-HCl (pH 7.5). After incubation of the mixture for 15 min at room temperature, the oligomeric state of samples was examined by SDS-PAGE electrophoresis.
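The concentrations quoted above can be checked with a few lines of arithmetic. The sketch below is a minimal illustration, assuming a molecular weight of 368.35 g/mol for DSS (a value not stated in the text) and the 8.9 kDa monomer mass of I73R given elsewhere in this paper; the function names are hypothetical.

```python
# Back-of-the-envelope check of the crosslinking concentrations.
# Assumptions (not stated in the protocol itself): DSS molecular weight of
# 368.35 g/mol, and the 8.9 kDa I73R monomer mass quoted elsewhere in the text.

DSS_MW = 368.35    # g/mol, assumed
I73R_MW = 8900.0   # g/mol, from the 8.9 kDa monomer mass


def stock_conc_mM(mass_mg, mw_g_per_mol, volume_ul):
    """Concentration (mM) of mass_mg of solute dissolved in volume_ul of solvent."""
    mmol = mass_mg / mw_g_per_mol   # mg divided by g/mol gives mmol
    litres = volume_ul * 1e-6       # microlitres to litres
    return mmol / litres


def protein_conc_uM(conc_mg_per_ml, mw_g_per_mol):
    """Molar concentration (uM) of a protein given in mg/ml (equal to g/L)."""
    return conc_mg_per_ml / mw_g_per_mol * 1e6


dss_stock = stock_conc_mM(2.0, DSS_MW, 108.0)
protein = protein_conc_uM(2.0, I73R_MW)

print(f"DSS stock: {dss_stock:.1f} mM")
print(f"I73R at 2 mg/ml: {protein:.0f} uM")
for fold in (2, 20):
    print(f"{fold}-fold molar excess of DSS: {protein * fold / 1000:.2f} mM")
```

Under these assumptions, 2 mg of DSS in 108 μl gives a stock of about 50 mM, consistent with the stated value, and the 2- to 20-fold excess corresponds to final DSS concentrations in the sub-millimolar to low-millimolar range.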
2.7 Cell culture and virus infection
Vero-E6 cells were cultured in Dulbecco's modified Eagle medium (DMEM, Gibco, USA) supplemented with 8% foetal bovine serum (FBS, PAN Biotech, Germany), and pulmonary alveolar macrophages (PAMs) were cultured in RPMI 1640 medium (Yuanpei, China) with 10% FBS at 37°C in a 5% CO2 incubator. ASFV Anhui XCGQ strain (GenBank: MK128995.1) was propagated in PAM cells. Briefly, PAM cells were incubated with ASFV in serum-free RPMI 1640 medium for 1 h at 37°C, washed with PBS and then maintained in RPMI 1640 medium with 10% FBS.
2.8 Antibodies and reagents
Mouse anti-Flag and rabbit anti-actin antibodies were purchased from Sigma. Mouse anti-I73R and mouse anti-p30 monoclonal antibodies were generated in our lab. HRP-conjugated goat anti-mouse IgG (H+L), HRP-conjugated goat anti-rabbit IgG (H+L), Alexa 555-conjugated goat anti-mouse IgG and Alexa 488-conjugated goat anti-mouse IgG antibodies were purchased from Millipore. Lipofectamine 2000 was purchased from Thermo Fisher Scientific.
2.9 Western blot assay
Cells were lysed in 2× SDS sample buffer [2% SDS, 10% glycerol, 60 mM Tris (pH 6.8) with 5% β-ME and 0.01% bromophenol blue]. The samples were denatured at 98°C for 10 min, separated by SDS-PAGE and transferred to a nitrocellulose (NC) filter membrane (Pall Corporation, USA). The NC membranes were blocked with 3% skim milk in PBST (PBS with 0.5% Tween-20) for 30 min at room temperature, incubated with primary antibodies overnight at 4°C, washed thrice with PBST, incubated with HRP-conjugated secondary antibodies for 6 h at 4°C and washed thrice with PBST. Finally, the membranes were reacted with ECL reagent and imaged with a UVITEC Alliance Q9 Advanced Imager (Uvitec, UK).
2.10 Immunofluorescence microscopy
PAM cells were grown on coverslips and infected with ASFV for the indicated time points. The cells were fixed and permeabilized with 4% formaldehyde and 0.1% Triton X-100 at 37°C for 30 min. After washing with glycine-PBS (0.02 M glycine in PBS), the cells were blocked with 3% bovine serum albumin (BSA) in PBS for 30 min at 37°C. The coverslips were incubated with primary antibody (1:500) for 1 h, washed twice with PBST, incubated with secondary antibody (1:500) for 30 min and washed thrice with PBS. The slides were stained with 4′,6-diamidino-2-phenylindole (DAPI) containing the anti-fade Dabco solution. Images were captured with a Leica DMi8 microscope or a Leica STELLARIS 5 confocal microscope.
3 RESULTS
3.1 The overall structure of the ASFV I73R
Sequence homology searches for ASFV I73R found no similar sequences. The sequence encoding I73R (residues 1 to 72) was cloned from ASFV, and the resulting protein was purified to homogeneity by nickel affinity chromatography and gel filtration after heterologous expression in Escherichia coli. The protein was crystallized by vapour diffusion, and the structure was solved by single wavelength anomalous dispersion using the Se-Met-labelled I46M variant. Using the structure of the variant as a search model, the crystal structure of I73R was determined at a resolution of 2.0 Å in the H3 space group with an Rwork of 18.52% and an Rfree of 22.28% (Table 1). Each asymmetric unit contains two I73R molecules (Figure 1). Cell content analysis gave a solvent content of 41% and a Matthews coefficient of 2.11 Å3/Da. The two I73R molecules form a stable dimer. The dimeric interface has a buried surface area of ∼771 Å2, accounting for 14.8% of the monomer surface area, with an interface score of 1.0 (PDBe PISA v1.52). The dimeric interaction is mainly contributed by 20 residues of each molecule. Each monomer adopts an α/β architecture with three α-helices (α1 to α3) packed against three antiparallel β-strands (β1 to β3) (Figure 1). The N-terminal helix α1 (Met1 to Lys16) is followed by the short strand β1 (Leu21 to Thr22), which positions the C-terminal β hairpin (β2: Phe54 to Lys56; β3: Leu67 to Arg70) by means of two hydrogen bonds. Helices α2 (Ala23 to His34) and α3 (Thr40 to Ser49) reside between β1 and β2 and form the HTH motif. Aliphatic residues from the three helices, together with Trp68 in strand β3, interdigitate and form a hydrophobic core.
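The cell content figures follow from standard relations, which can be sketched as below. This is an illustration rather than the program used for the analysis: it computes the unit cell volume from the hexagonal cell parameters in Table 1, the Matthews coefficient V_M = V / (Z × M) with nine symmetry operators for the H3 setting and two molecules per asymmetric unit, and the solvent fraction from the common approximation 1 − 1.23/V_M. The function names are hypothetical, and the molecular mass passed to `matthews_coefficient` is an assumption, since the text does not state which mass was used for the reported analysis.

```python
import math

# Sketch of the standard cell-content relations. The molecular mass used in
# the reported analysis is not stated, so it is left as an input parameter.


def hexagonal_cell_volume(a, c):
    """Unit cell volume (A^3) for a = b, alpha = beta = 90, gamma = 120 degrees."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews_coefficient(cell_volume, n_sym_ops, mols_per_asu, mw_da):
    """Matthews coefficient V_M (A^3/Da): volume per dalton of protein."""
    return cell_volume / (n_sym_ops * mols_per_asu * mw_da)


def solvent_fraction(vm):
    """Approximate solvent fraction from V_M (Matthews, 1 - 1.23 / V_M)."""
    return 1.0 - 1.23 / vm


volume = hexagonal_cell_volume(89.36, 46.16)   # native cell, Table 1
print(f"Unit cell volume: {volume:.0f} A^3")

# Solvent fraction implied by the reported Matthews coefficient of 2.11 A^3/Da.
print(f"Solvent fraction at V_M = 2.11: {solvent_fraction(2.11):.3f}")

# Assumption: 8.9 kDa monomer mass quoted elsewhere in the text.
vm = matthews_coefficient(volume, n_sym_ops=9, mols_per_asu=2, mw_da=8900.0)
print(f"V_M with an assumed 8.9 kDa monomer: {vm:.2f} A^3/Da")
```

The result is sensitive to the mass assumed for the crystallized construct, so small differences from the reported values are expected when a different mass is used.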

3.2 Structural comparison demonstrating I73R as a Zα domain
Despite the low sequence similarity of I73R to the Z-DNA-binding domains (Zα) from human ADAR1 (hZαADAR1), goldfish PKZ (caZαPKZ), mouse DLM1 (mZαDLM1), Yaba poxvirus E3L (yabZαE3L) and the herpesvirus Zα-domain-containing protein (ORF112 Zα) (Figure 2a), the overall structure shows a high degree of structural similarity to the prototypic ADAR1 Zα domain (Figure 2b). The RMSDs between I73R and Zα from human ADAR1 (PDB: 1QBJ), goldfish PKZ (PDB: 4KMF), mouse DAI (PDB: 1J75), Yaba poxvirus E3L (PDB: 1SFU) and Cyprinid herpesvirus 3 ORF112Zα (PDB: 4HOB) are 1.3, 1.3, 1.4, 2.0 and 1.4 Å, respectively (Figure 2b). Accordingly, these structural comparisons indicate that I73R could belong to the family of Zα domains, which have the unique property of binding specifically to purine/pyrimidine repeats in the left-handed helical conformation known as Z-DNA (de Rosa et al., 2013; Ha et al., 2004, 2008, 2009; Kahmann et al., 2004; Schwartz et al., 1999, 2001). In addition, the structural analyses of hZαADAR1, caZαPKZ, mZαDLM1, yabZαE3L and ORF112Zα bound to the Z conformation of a CG-repeat DNA duplex revealed well-conserved interactions with the Zα domains (de Rosa et al., 2013; Kus et al., 2015; Lee et al., 2016; Tome et al., 2013). The core recognition interactions between Zα domains and Z-DNA involve a triplet of conserved residues (Asn173, Tyr177 and Trp195 in hZαADAR1) that maintain almost identical conformations and interactions. Consistent with a previous report (Schwartz et al., 1999), Tyr177, known as the most critical residue for Z-DNA binding through its CH-π interaction with guanine 4 (G4), is conserved in both sequence and structure. In agreement with the Zα domains, the corresponding triplet of residues in I73R (Asn44, Tyr48 and Trp68) is conserved at the equivalent positions (Figure 2a). Structural differences between the Zα domains and I73R are largely restricted to the α1–β1 loop, the length of helix α2 and the β2–β3 loop.
In comparison with other Zα domain structures, the structural differences in helix α2 and the α1–β1 loop have no influence on Z-DNA binding (Figure 2a and b). These results, together with the limited structural alterations found among Zα domains, strongly indicate that I73R is well conserved relative to the Zα domains and is a Zα domain protein.
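For readers less familiar with the RMSD values quoted above, the quantity itself is simple to state in code. The sketch below is a minimal illustration and not the superposition procedure used in this study: it assumes two lists of matched atom coordinates that have already been superimposed, and the coordinates shown are toy values, not taken from any PDB entry. The function name is hypothetical.

```python
import math

# Minimal RMSD between two equally sized, already superimposed coordinate sets.
# This does not perform the superposition itself; it only evaluates the
# root-mean-square deviation over matched atom pairs.


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (same units as the input) over matched atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates (in angstroms) purely for illustration.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
shifted = [(x + 1.0, y, z) for x, y, z in reference]

print(rmsd(reference, reference))
print(rmsd(reference, shifted))
```

An RMSD of zero means the matched atoms coincide; shifting every atom by 1 Å along one axis gives an RMSD of 1 Å.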

3.3 Protein–DNA interactions
The structural analyses of hZαADAR1, caZαPKZ, mZαDLM1, yabZαE3L and ORF112Zα bound to the Z conformation of d(CGCGCG)2 revealed well-conserved interactions with the Zα domains. Zα domains, which have a winged helix-turn-helix motif, bind to Z-DNA in a conformation-specific manner. To determine whether the I73R protein could bind directly to CG-repeat and non-CG-repeat DNA duplexes, purified I73R and biotin-labelled DNA were prepared for EMSAs. EMSAs with the purified protein and a dT(CGCGCG)2 duplex oligonucleotide showed a robust band shift at the corresponding position (Figure 3a), confirming the ability of I73R to interact with CG repeats. In contrast, no specific complex was observed when I73R was incubated with the non-CG-repeat sequence dT(CACGTG)2, which has a low propensity for forming Z-DNA (Figure 3a). The interaction between I73R and the DNA was further confirmed by size-exclusion chromatography. A new peak appeared when the protein was incubated with the dT(CGCGCG)2 duplex DNA (DNA1), suggesting that a complex was formed, whereas no obvious peak shift appeared when the protein was incubated with the dT(CACGTG)2 duplex DNA (DNA2) (Figure 3b). Taken together, these results indicate that I73R prefers the CG-repeat sequence and could be a member of the Zα family, sharing its characteristic Z-DNA binding ability. We also screened for co-crystals of I73R with DNA1, but no co-crystals were obtained.

The three residues central to the interaction with Z-DNA, Asn44, Tyr48 and Trp68 in I73R, are completely conserved within the Zα family. To confirm that this conserved triplet (N44, Y48 and W68) of I73R is involved in binding the target DNA, we introduced point mutations at each position and examined their effects on I73R binding to Z-DNA by EMSA. Compared with the wild-type protein, the complex band of variant N44A with d(CGCGCG)2 was markedly weakened, the complex band of variant Y48A was almost absent and that of W68A was completely absent (Figure 4). Thus, N44A caused a drastic reduction in DNA binding affinity, Y48A almost abolished binding and W68A totally abolished binding of I73R to Z-DNA. This is in good agreement with the structural analysis and with previous reports (Ha et al., 2004; Schwartz et al., 1999, 2001). The site-directed mutagenesis experiments therefore support the conclusion that residues N44, Y48 and W68 contribute to DNA recognition and binding, as revealed by other structures of Zα in complex with Z-DNA, such as the hZαADAR1/d(CGCGCG)2 complex (Ha et al., 2009).

3.4 Dimeric state of I73R
Two I73R molecules form a stable dimer in the asymmetric unit. Therefore, the dimeric state of recombinant I73R was also investigated by gel filtration chromatography using a Superdex 75 Increase 10/300 GL column (GE Healthcare). Under these conditions [25 mM Hepes (pH 7.5), 150 mM NaCl, 5% glycerol], the protein eluted as two peaks, corresponding to the expected molecular masses of the dimer and the monomer (Figure 5a). Both peaks contained a protein of identical molecular weight on SDS-PAGE, suggesting that they indeed represent dimeric and monomeric states of the same protein. Furthermore, the ratio of the two peaks shows that the purified protein is more abundant in the dimeric state than in the monomeric state in solution. To further validate dimer formation, a chemical crosslinking assay was performed. Titration of purified I73R at 2 mg/ml with the crosslinker disuccinimidyl suberate (DSS), followed by analysis on an 18% SDS-PAGE gel, indicated that the dimeric state of I73R was present in solution (Figure 5b).

3.5 I73R is an early viral protein located in the nucleus
Although the I73R mRNA has been detected in previous reports (Alcami et al., 1990; Rodriguez & Salas, 2013), details of the protein expression levels and its subcellular localization are unknown. To determine the expression profile of the I73R protein, PAM cells, the natural host cells of ASFV, were infected with ASFV for the indicated time points. Western blot analysis showed that I73R expression was detected as early as 2 h post-infection (hpi) and gradually increased until 32 hpi, an expression profile similar to that of the early marker protein p30 (Figure 6) (Munoz-Moreno et al., 2015; Simoes et al., 2013). These data indicate that I73R is expressed at the early stage of ASFV infection.

Since the above findings indicated that I73R is a Z-DNA-binding protein, its subcellular localization is likely to be critical for its biological functions. Vero-E6 cells transfected with I73R or Flag-tagged I73R were immunostained with anti-I73R fluorescein isothiocyanate (FITC) or anti-Flag tetramethyl rhodamine isocyanate (TRITC) antibodies, respectively. The results showed that both exogenous I73R proteins are mainly localized in the nucleus (Figure 7a and b).

To confirm the subcellular localization of endogenous I73R during ASFV infection, an immunofluorescence assay was performed with PAM cells infected with ASFV-Anhui XCGQ. We found that the I73R protein expressed by ASFV during infection could be localized in both the nucleus and the cytoplasm (Figure 8). I73R was mainly detected in the nucleus at the early stage post-infection (∼ 8 hpi) (Figure 8a). However, I73R was translocated from the nucleus to the cytoplasm at the late stage post-infection (8–48 hpi) (Figure 8a). The percentage of cytoplasmic I73R increased markedly in a time-dependent manner during ASFV infection (Figure 8b). Taken together, these data demonstrate that the subcellular localization of I73R is dynamic during ASFV infection.

4 DISCUSSION
ASFV is highly contagious and can cause lethal diseases in both domestic pigs and wild boars. As ASFV continues to spread, the unavailability of effective prevention and treatment regimens highlights the importance of studying the structures and functions of critical viral proteins that may be used as targets for vaccine and drug design. To date, the crystal structures of some critical ASFV proteins have been reported, including pE165R and pA104R involved in maintaining genome fidelity during viral replication (Li et al., 2019; Liu et al., 2020), but the structures and functions of most ASFV-encoded proteins remain elusive.
Here we describe a winged helix-turn-helix (HTH) domain in the I73R protein of ASFV that represents a new member of the Zα domain family. The Zα domain is considered to specifically recognize left-handed nucleic acid duplexes such as DNA, DNA/RNA hybrid and RNA. Despite a relatively low sequence identity, I73R shares the fold of the Zα domains from hZαADAR1, caZαPKZ, mZαDLM1, yabZαE3L and ORF112Zα. Moreover, the three key DNA-interacting residues are conserved (Asn44, Tyr48 and Trp68 in I73R) and, consistent with this, I73R firmly binds oligonucleotides bearing CpG repeats in vitro (Figures 3 and 4). Furthermore, I73R forms a dimer in the crystal structure, an arrangement not previously observed for Zα domains (Figure 9a). We also found that the DNA-interacting surface is occupied by another I73R molecule in the crystal structure (Figure 9b). In contrast, ORF112 forms a stable dimer through domain swapping of the C-terminal strand, with sulphate ions occupying positions that accurately reflect the positions of DNA phosphates in the complex as well as key hinge positions of the swapped loops. Intriguingly, I73R was found in both monomeric and dimeric forms in solution. We postulate that the balance between the monomeric and dimeric forms of I73R might modulate switching between active and inactive forms. The possible functional significance of this observation is currently unknown.

The left-handed structure of DNA (Z-DNA) is extremely unstable under physiological conditions and radically different from the canonical right-handed double-helical form (B-DNA), which follows the Watson–Crick model and exists in normal physiological settings (Wang et al., 1979). Although Z-DNA is thermodynamically less stable than B-DNA, it can be stabilized by interactions with the Zα domain. The Zα domain is found in a group of cellular proteins, such as DLM1, PKZ and ADAR1; these host-encoded proteins are involved in the antiviral interferon response pathways (Poulsen et al., 2001; Rothenburg et al., 2002; Taghavi & Samuel, 2013). For example, a PKR-like kinase from zebrafish named PKZ uses this domain to promote immune responses (Rothenburg et al., 2005), and the nucleotide sensor DLM1 can drive kinase RIPK signalling and inhibit the replication of Zika virus (Daniels et al., 2019; Rothan et al., 2019). Interestingly, two viral proteins containing a Zα domain have been identified as crucial for viral innate immune evasion and pathogenesis. The E3L protein encoded by vaccinia virus (VACV) includes a Zα domain that is necessary for the inhibition of the interferon (IFN) response; the lack of this domain causes a significant decrease in VACV virulence (White & Jacobs, 2012). In addition, ORF112 from cyprinid herpesvirus 3 (CyHV-3), which is similar to E3L, localizes in stress granules and is involved in antiviral responses (Kus et al., 2015; Tome et al., 2013). In this study, we showed I73R to be the third viral protein containing a Zα domain. The structural similarities between I73R and Zα-domain-containing proteins suggest that I73R might be involved in immune evasion processes. Recently, a screen of ASFV-encoded proteins that regulate host gene translation showed that I73R was able to inhibit host gene expression (Shen et al., 2020).
Therefore, whether I73R functions similarly to E3L and ORF112 in host immune evasion and viral pathogenesis, and the mechanism by which I73R modulates host gene expression, are worth further investigation.
This study demonstrated that I73R is expressed at the early stage of ASFV infection. Early expressed viral proteins have been considered to regulate the translation of other virus-encoded proteins or to modulate the intracellular environment to benefit virus replication, for example through the regulation of interferon responses, cell proliferation or cell cycle arrest (Dixon et al., 2013; Salas & Andres, 2013). We also demonstrated that the subcellular localization of I73R is dynamic. As shown in Figure 8, I73R is mainly localized in the nucleus at the early stages of ASFV infection. However, I73R could be translocated from the nucleus to the cytoplasm at the late stage of ASFV infection. It is well known that ASFV replicates mainly in the cytoplasm, but ASFV also has an early intranuclear replication stage (Simoes et al., 2015). It has been reported that ASFV DNA cannot replicate in enucleated cells (Ortin & Vinuela, 1977) and that the intranuclear viral DNA replication phase is associated with alterations of nuclear architecture and epigenetic signatures, including disassembly of the lamina network, redistribution of subnuclear compartments and transcriptionally related nuclear factors, and heterochromatinization of the host genome, probably controlling transcription, repressing host gene expression and favouring viral replication (Ballester et al., 2011; Knipe, 2015; Simoes et al., 2015). It has been shown that herpes simplex virus type 1 ICP27 actively shuttles between the nucleus and the cytoplasm and can regulate viral gene expression at both the early and late stages of infection (Mears & Rice, 1998). Therefore, it will be interesting to explore why ASFV encodes a protein that is located in the nucleus during the early ASFV replication cycle. Through protein mass spectrometry analysis, we found that I73R interacts with histone H2B, Interferon Regulatory Factor 4 (IRF4) and Kinesin family member 7 (KIF7).
As one of the four nucleosomal core histones, H2B is important for maintaining the structure of chromatin (Reche et al., 2006) and is also present in ASFV particles (Salas & Andres, 2013). IRF4 is a transcription factor that regulates a group of genes critical for cell development and immune response (Gualco et al., 2010). KIF7, a ciliary motor protein belonging to the kinesin family, is involved in directing the transport of various cargos, such as membranous organelles, protein complexes and mRNAs (Dafinger et al., 2011). Therefore, it is possible that, through its associated factors, I73R plays a role in nuclear organization during the initial phase of ASFV replication, in gene expression and in the transport of viral components. Future studies to characterize potential I73R functions would provide insight into the mechanism by which I73R regulates ASFV replication, immune evasion or pathogenesis.
5 CONCLUSIONS
In this study, we have identified a novel Zα-domain-containing protein, I73R, and demonstrated that key residues of I73R are highly conserved among Zα domain proteins. We have also demonstrated that I73R behaves similarly to other Zα domain-containing proteins, such as ADAR1, PKZ and DAI. Further studies will therefore be important to fully understand how the I73R protein contributes to the immune responses in ASFV-infected cells.
ACKNOWLEDGEMENTS
The authors would like to thank Dr. Lorne Babiuk from University of Alberta for manuscript writing and staff of beamline BL17U1 of Shanghai Synchrotron Radiation Facility (SSRF) for assistance with diffraction data collection.
COMPETING INTERESTS
The authors declare no competing financial interest.
ETHICS STATEMENT
All animal experiments were performed according to the rules of National Guidelines for Housing and Care of Laboratory Animals (China) and Institutional Animal Care and Ethics Committee of Nanjing Agricultural University (permit no. IACECNAU20160102). All mice were housed in the animal facility of Nanjing Agricultural University (Nanjing, Jiangsu, China).
AUTHOR CONTRIBUTIONS
L.S., Y.W. and Y.Q. conceived and designed the experiments. L.S. determined the crystal structure and performed the DNA binding and crosslinking assays. Y.Q., X.W., Y.M. and Z.W. designed and performed the experiments on the subcellular localization of I73R during ASFV infection. L.S., Y.Q. and Y.M. analysed the data and drafted the manuscript. Y.W., X.M. and Y.Q. revised the manuscript. All authors have read and approved the final manuscript.
Open Research
DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.