Antigenic variation in Babesia bovis occurs through segmental gene conversion of the ves multigene family, within a bidirectional locus of active transcription
Summary
Antigenic variation in Babesia bovis is one aspect of a multifunctional virulence/survival mechanism mediated by the heterodimeric variant erythrocyte surface antigen 1 (VESA1) protein that also involves endothelial cytoadhesion with sequestration of mature parasitized erythrocytes. The ves1α gene encoding the VESA1a subunit was previously identified. In this study, we present the unique organization of the genomic locus from which ves1α is transcribed, and identify a novel branch of the ves multigene family, ves1β. These genes are found together, closely juxtaposed and divergently oriented, at the locus of active transcription. We provide compelling evidence that variation of both transcriptionally active genes occurs through a mechanism of segmental gene conversion involving sequence donor genes of similar organization. These results also suggest the possibility of epigenetic regulation through in situ switching among gene loci, further expanding the potential repertoire of variant proteins.
Introduction
Babesia bovis is a tick-borne, intraerythrocytic, protozoal parasite of cattle that shares many lifestyle parallels with the most virulent of the human malarial parasites, Plasmodium falciparum (Allred, 1995). The acute infection is severe, with numerous pathologic sequelae that may include a cerebral form of the disease associated with endothelial cytoadhesion and parasite sequestration (Wright, 1972; Aikawa et al., 1992). Animals that survive establish strong immunity to further disease and appear phenotypically normal (Wright et al., 1988). Despite host immunity to disease, B. bovis consistently establishes persistent infections of extremely long duration (Curnow, 1973; Mahoney et al., 1973; Allred et al., 1994; Calder et al., 1996). At least two mechanisms are involved in this: endothelial cytoadhesion and antigenic variation (Allred et al., 1994; O’Connor et al., 1999).
Antigenic variation sensu stricto, a rapid alteration in the structure and antigenicity of specific pathogen-synthesized components occurring in the course of a single infection, has been observed in several parasitic protozoa, as well as numerous bacterial species (Deitsch et al., 1997; Borst, 2003). In B. bovis, variation at the surface of infected erythrocytes was previously established and shown to involve the heterodimeric variant erythrocyte surface antigen 1 (VESA1) protein (Allred et al., 1994; O’Connor et al., 1997). Studies on isogenic clonal parasite lines indicated that the two VESA1 subunits, 1a and 1b, differ antigenically, as well as in size among the clones (Allred et al., 1994; O’Connor and Allred, 2000). Although available evidence suggests that these polypeptides are encoded by two distinct genes, their similar amino acid (aa) compositions and even their sharing of some proteolytic digestion fragments suggest genetic relatedness but not identity (O’Connor et al., 1997). Furthermore, as the two subunits appear to vary in parallel (Allred et al., 1994; O’Connor et al., 1997), it is also conceivable that they may be genetically linked and/or their expression co-ordinately controlled.
It has been proposed that proteins which undergo antigenic variation must also serve some additional function (Turner, 2002). In the case of B. bovis, one additional known function of VESA1 is to cytoadhere to the capillary and post-capillary venous endothelium, a behaviour shared with P. falciparum (Allred, 1995). In P. falciparum, cytoadhesion occurs at least in part via the erythrocyte membrane protein 1, PfEMP1 (Baruch et al., 1996; 1997), a protein encoded by the var multigene family (Baruch et al., 1995; Smith et al., 1995; Su et al., 1995). Variation in PfEMP1 during a single infection appears to occur primarily through epigenetic in situ switching of the transcriptionally active var gene, apparently without alteration of the involved genes. Furthermore, most, if not all, the var genes have the potential to be transcribed in situ (Smith et al., 1995; Scherf et al., 1998). This stands in contrast to many other parasitic protozoa, such as the African trypanosomes, where antigenic variation of the variant surface glycoprotein (VSG) may occur through several mechanisms. Among these, the most common involve gene conversion-like mechanisms resulting in the replacement of a complete, expression site-linked VSG gene, or of gene segments, with formation of mosaic genes (Roth et al., 1989; Kamper and Barbet, 1992; reviewed in Borst et al., 1993; Borst and Ulbert, 2001).
Because of the involvement of VESA1 in both the virulence and persistence of B. bovis, it is imperative to understand the mechanisms behind variation of this protein and how that affects its functions. Mouse monoclonal antibodies to the VESA1a subunit expressed by the B. bovis C9.1 clonal line (clonal line derivation is described in Fig. S1) previously were used to recover the ves1α gene encoding this polypeptide from a C9.1 line cDNA library (Allred et al., 2000). The ves1α gene was found to be part of a large multigene family, ves, most members of which appear to be invariant over time among several isogenic parasite lines. In contrast, the locus from which this gene is transcribed was found to be rapidly modified in a fashion suggestive of progressive, segmental gene conversion. However, alternative possible explanations existed, and nothing was known regarding mechanisms controlling VESA1b variation or even the identity of the gene encoding this subunit.
The VESA1a polypeptide is somewhat unusual in its structure. VESA1a (and presumably VESA1b) must be transported to the infected erythrocyte plasma membrane where it is inserted and exposed on the exoplasmic face (Allred et al., 1993). However, no N-terminal leader peptide is present in this protein, and the mechanism mediating trafficking to the erythrocyte membrane is unclear. Despite variability in primary structure, each variant of VESA1a observed to date possesses a common overall structure, including a large extracellular segment with a ‘cysteine- and lysine-rich domain’ (CKRD), a ‘variant domain conserved sequences’ (VDCS) domain, and a pair of predicted coiled-coil domains. This is followed by a single predicted transmembrane (TM) domain, and a short putative cytoplasmic domain (Allred et al., 2000; additional unpublished data). No significant homologies with existing proteins or with known binding motifs have yet been identified.
To better understand the mechanisms underlying variation of the VESA1 protein, we have here identified and validated the genomic locus of active ves1α transcription (LAT). Additionally, inter- and intrachromosomal duplication of several sequences from apparently transcriptionally inactive gene copies into the actively transcribed ves1α gene is demonstrated. Our results provide compelling evidence for the involvement of a segmental gene conversion mechanism in B. bovis antigenic variation. Further, the actively transcribed ves1α gene was found divergently paired with an actively transcribed, related but clearly distinct gene. We have called this gene ves1β and hypothesize that it encodes the VESA1b subunit. As transcriptionally inactive sequence donor gene copies were found to be similarly organized, with highly conserved intergenic regions (IRs), we further hypothesize that low-frequency epigenetic in situ switching of transcriptional activity from the current LAT to another gene pair is possible.
Results
Genomic organization of the LAT
To determine the overall structure of the LAT, we first defined the complete structure of the expressed ves1α gene. Previously, sequences containing the entire ves1α open reading frame (ORF), and complete 3′- and partial 5′UTRs were obtained from a C9.1 clonal line-derived cDNA, p9.6.2. A putative transcriptional start site was identified by 5′RACE, and comparisons with genomic sequences obtained by polymerase chain reaction (PCR) (clone pgC3c) identified two putative introns near the 5′ end (Allred et al., 2000). We have now confirmed the presence and locations of the introns by Southern blot analyses (data not shown) and sequencing. As a framework for further characterization of the LAT, we have compiled a composite ves1α gene from these sequences, and designate this the C9.1ves1α gene (Fig. S2).
To obtain the entire LAT, high complexity cosmid libraries of B. bovis C9.1 genomic DNA (gDNA) were initially screened using p9.6.2-derived oligonucleotide (oNt) probes. As this strategy was not successful, these probes instead were used to recover 3′- and 5′-extending LAT sequences from λZapII libraries of C9.1 line gDNA EcoRI fragments (for details see Fig. S2). Contiguity and continuity between EcoRI fragments were established by homology with already known, overlapping cDNA sequences, and confirmed by Southern blot analyses using oNt probes specific to this locus (Fig. 1A and B). One fragment containing the C9.1ves1α 5′-most and additional upstream sequences (phagemid 6-1, Fig. S2) unexpectedly contained what appeared initially to be the 5′ end of another ves1α gene, in a divergently apposed orientation. Additional 5′ LAT sequences containing the remainder of this gene were obtained from a λZapII gDNA library, by the same strategy (phagemid 3-2-1, Fig. S2). These, too, were validated by recovery of the corresponding cDNA sequences (phagemid 5-1.B73, Fig. S2) and Southern blot analyses with locus-specific oNt probes. This strategy yielded the complete LAT and immediately flanking regions, contained within 17.2 kb of continuous genomic sequence (Fig. 1C).

Structure of the LAT in the B. bovis C9.1 line. A and B. Southern blot analyses validating the LAT structure. gDNAs from the C9.1, B9, C8 and MO7 lines, and DNAs from the p9.6.2 ves1α cDNA and the 3-2-1, 6-1, 5-2-2 and 22-1 gDNA clones from which the structure was derived (Fig. S2), were digested with HincII or EcoRI and probed sequentially with the indicated oNts. In the experiment shown the C9.1 line sample did not cut to completion with EcoRI. The positions of the oNts and restriction sites are shown in (C). Arrows: C9.1 line transcription-associated copy of each oNt sequence. C. Schematic representation of the LAT. The upstream segment containing the C9.1ves1β gene is shown below the C9.1ves1α gene, connected by a stippled line. The orientations of ves coding sequences are indicated by horizontal arrows, and ATG start and TAG stop codons are noted, where present. Wide boxes represent exons; numbered pink boxes, introns. In ves1α genes, a blue box represents exon 1; black box, exon 2; and hatched green box, exon 3. The C9.1ves1β exons are graduated-green boxes. Downstream of the C9.1ves1β gene, a fragment resembling 5′ves1α sequences is present (beginning with a hatched blue box), but may represent a pseudogene. Right angle, double-ended red arrows indicate starts of the α1/α2 and β1/β2 blocks. Thin, solid lines indicate the IR, 3′UTRs, or other non-coding, non-repetitive sequences. Tandem repeats downstream of ves 3′UTRs are depicted as red (α repeats) or graduated-red boxes (β repeats). The tandem repeat regions Rep23, Rep43 and Rep45 are indicated. The oNt sequences used in (A) and (B) are noted, as are EcoRI (E) and HincII (H) restriction sites. D. Consensus sequence of the tandemly repeated units in Rep-23, -43 and -45. The number of repeat units in each is indicated.
When viewed as a whole, the structure of the LAT is quasipalindromic, containing both the known C9.1ves1α gene and a second, related but structurally distinct gene, which we call C9.1ves1β. The two genes are in a tightly juxtaposed, divergent orientation, with a narrow IR of only 433 bp separating their ATG start codons (Figs 1 and 6). Comparison of the gDNA sequences with corresponding cDNAs both demonstrated active transcription of C9.1ves1β and identified its intron/exon structure. Although clearly related to C9.1ves1α in many intermittent segments along the length of the gene (Fig. 2), the LAT-resident C9.1ves1β gene contains 12 introns distributed along its length and ranging in size from 37 to 162 bp. Each gene is flanked at its 3′ end by a stretch of short, imperfect tandem repeats, which differ in sequence between the two genes (Fig. 1C). Intriguingly, these are followed by tandemly repeated sequence blocks strongly resembling the 3′ end of the respective gene. Blocks α1 and α2 begin with ≈ 250 bp of the 3′ end of ves1α coding sequence, followed by 3′UTR sequences. Whereas block α1 contains the above mentioned tandem repeats, block α2 does not (Fig. 1C). Similarly, the β1 and β2 blocks start with variable sequences from within the 3′-most intron. In the β1 block, this is followed by a terminal ves1β exon, 3′UTR, and a short stretch of the tandem repeats. Block β2, on the other hand, lacks the 3′-most exon sequences (Fig. 1C). Its intron sequences are followed by ves1β 3′UTR-like sequences.

Alignment of the IRs located between several ves genes. Shown are cosmids S621 and 1E10, the C9.1 LAT (phagemid 6-1), and phagemids 15-1, 1-1 and 10-1 (see Fig. 5A). Right angle, double-ended arrows denote putative start codons and orientation of each gene in a pair. The extent of sequence conservation is indicated by different colour shading: brown, 100% identity among all sequences; red, 70–90%; orange, 50–70%; and yellow, 30–50% identity. Non-conserved residues or those present in < 3 sequences are in lower case. Lines indicate the two quasipalindromic segments located ≈ 65 bp from the respective ATG start codons, with thickened segments highlighting the conserved inverted repeats (A/B, B′/A′) within each segment; asterisk, the putative C9.1ves1α transcription initiation site; BAK6, oNt BAK6.

The C9.1ves1β-encoded polypeptide shares several signature sequences with the VESA1a polypeptide. A. Maps of the C9.1ves1α and C9.1ves1β genes, highlighting similarities of the encoded polypeptides. The ves ORFs are indicated by horizontal arrows; ATG start and TAG stop codons are noted. Wide boxes represent sequences encoding: open boxes, extracellular domain; black box, the TM; and hatched box, putative cytoplasmic domain. Narrow grey boxes denote introns, and thin, solid lines the 5′-and 3′UTRs. Shown are the locations and boundaries of the CKRD and CTC domains; the VESA1a VDCS-1 and -2, and putative coiled-coil (blue boxes) domains; and ves1β LCV domain. Orange highlights signature sequences (roman numerals). B. Signature sequences shared between ves1α and ves1β branches of the ves family. Highly conserved signature sequences dispersed among otherwise distinct sequences are shown, numbered I–XV. Their aa positions are noted, using C9.1ves1α- (above) and C9.1ves1β-encoded (below) sequences for reference. Identical residues are highlighted in black. Solid grey line, predicted TM domains; dashes, non-conserved flanking sequences (not to scale with regard to length).
In addition to the tandem ves 3′ sequence-related blocks, the LAT-resident C9.1ves1α gene is followed by three independent series of short, tandem repeats. Based on the predominant lengths of individual repeat units within each series, we have designated these as ‘Rep 23’, ‘Rep 43’ and ‘Rep 45’ (Fig. 1C and D). Because of the low sequence complexity in this region of the LAT, there are only rare Sau3A/MboI recognition sequences. This fact, along with potential instability of the tandemly repeated sequences, undoubtedly contributed to underrepresentation of the LAT in the cosmid libraries. Simple tandem repeats such as these are absent from the region following the blocks at the 3′ end of the C9.1ves1β gene. Instead, a gene fragment containing ves1α exon 2, intron 2 and partial exon 3 sequences, in the same orientation as the C9.1ves1β gene (‘ves1α’ ORF, Fig. 1C), is present. Although a potential start codon and several possible splice sites for an intron exist, the sequences 5′ to exon 2, including the potential 5′UTR, bear no resemblance to the highly conserved ves1α, or even to ves1β 5′ sequences. This gene is likely to be a ves1α pseudogene.
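The scarcity of Sau3A/MboI sites in a low-complexity region can be illustrated in silico. The following is a minimal sketch and not the procedure used in this study: the function names `find_sites` and `sites_per_kb` and both toy sequences are hypothetical, and the only outside fact assumed is that Sau3A and MboI both recognize GATC.

```python
def find_sites(seq: str, site: str = "GATC") -> list[int]:
    """Return 0-based start positions of every occurrence of `site` in `seq`.

    GATC is its own reverse complement, so scanning one strand is enough
    for this particular recognition sequence.
    """
    seq = seq.upper()
    site = site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        # Step by one so that overlapping occurrences are not missed.
        start = seq.find(site, start + 1)
    return positions


def sites_per_kb(seq: str, site: str = "GATC") -> float:
    """Site density, expressed as recognition sites per 1000 bp."""
    return 1000 * len(find_sites(seq, site)) / len(seq)


# Hypothetical toy sequences, not B. bovis data.
repeat_rich = "ACCTTGGAGT" * 20            # a tandem repeat that lacks GATC
mixed = "ACGATCGGATCCTTAGATCA" * 10        # three GATC sites per 20 bp unit

print(sites_per_kb(repeat_rich))
print(sites_per_kb(mixed))
```

A region built from a short repeat unit that happens to lack the recognition sequence yields no sites at all, which is one way a partial-digest library can underrepresent such a locus.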
The ves1β gene and gene product
Based on similarities and differences with ves1α, we have defined the ves1β gene as a new branch of the ves multigene family. The multi-intronic structure of the C9.1ves1β gene is shared by other ves1β genes, and contrasts starkly with the two consistent 5′-proximal introns in ves1α genes (below). However, the location and number of introns in this branch appear to be variable, suggesting their rapid, ongoing expansion and evolution. Only intron 1 corresponds well in location in both gene types. Further, translation of the C9.1ves1β ORF yields a polypeptide of 1122 aa residues, with a predicted mass of 115.5 kDa. Alignment of this product with that of the C9.1ves1α gene reveals 36.6% identity and 44.0% similarity. Embedded within these disparate sequences are numerous short signature sequences (I-XV in Fig. 2 and Fig. S3) that share high similarity or identity with consensus VESA1a sequences. One of these segments, designated signature motif ‘V’, contains a core sequence, TIREILYWLSALPYS, that has been invariant at the aa level in every ves-related gene of either type identified to date (Fig. S3) and may provide a reliable signature for the entire family.
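Percent identity figures such as the one above depend on how alignment columns are counted. The sketch below shows one common convention, under stated assumptions: the function name `percent_identity` is hypothetical, the inputs are two already-aligned sequences with `-` as the gap character, and the denominator is every column in which at least one sequence has a residue. The alignment program and convention actually used in this study are not stated here, so the sketch is illustrative only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two rows of a pairwise alignment.

    Assumption: columns where both rows are gaps are ignored, and a
    residue opposite a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * identical / compared


# The invariant core of signature motif V compared with itself.
print(percent_identity("TIREILYWLSALPYS", "TIREILYWLSALPYS"))

# A hypothetical toy alignment: 5 identical columns out of 8.
print(percent_identity("MKT-AILV", "MRTGA-LV"))
```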
The C9.1ves1β-encoded polypeptide possesses a structure that shares many similarities with that described for VESA1a (Allred et al., 2000), despite the dramatic difference in their exon/intron structure. Like VESA1a, the ves1β-encoded polypeptide lacks an identifiable N-terminal leader segment. Similarly, this polypeptide has a CKRD in the same relative position, although it is considerably reduced in overall length and C content relative to VESA1a (Fig. 2A and Fig. S3). Of potential significance is the presence within this minimal CKRD of a putative epidermal growth factor (EGF)-like domain, matching the EGF domain consensus motif ‘CXCX(5)GX(2)C’ identified from the ProSite database (Hofmann et al., 1999). A single TM with high homology to that of VESA1a is predicted (Sonnhammer et al., 1998) near the C-terminus of the protein (Fig. 2), preceding a short putative cytoplasmic domain (Fig. 2A). Comparison of the C9.1ves1β-encoded polypeptide with VESA1a also revealed a previously unappreciated ‘C-terminal cysteine-rich’ (CTC) domain (Fig. 2A and Fig. S3). Although present in each, the CTC of the C9.1ves1β product is highly enriched in G residues (25.0% in phagemid 5-1.B73), whereas few are present in the VESA1a CTC. Each contains well-conserved sequence motifs, but these are branch-specific (Fig. S3). Despite these similarities, there are also key differences. The VDCS-1 and -2 motifs and predicted paired coiled-coil domains of VESA1a are absent from ves1β-encoded polypeptides (Fig. 2A). On the other hand, comparisons of the translated products of several ves1β genes in this study revealed a ‘low complexity variant domain’ (LCV) in the centre of the polypeptide that is absent from VESA1a (Fig. 2A and Fig. S3). Although highly variant in sequence among the products of different ves1β genes, the LCV is composed largely of A (16.5%) and G (14.9%) residues, which are often present as dialanine and diglycine repeats. Embedded within the LCV is the conserved motif ‘KCX(2)HX(0−1)KX(4−5)RC’ (Fig. S3). These structural differences are consistent among all VESA1a and ves1β-encoded polypeptides observed to date (Fig. S3) and, in concert with differences in gene structure, warrant both the inclusion of this gene in the ves multigene family and its designation as a branch distinct from ves1α.
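Motifs written in the shorthand used above, where X is any residue and a parenthesized number or range gives the repeat count, can be translated mechanically into regular expressions for scanning translated sequences. The following is a minimal sketch under stated assumptions: the helper names `motif_to_regex` and `find_motif` and the toy peptides are hypothetical, and the notation handled is only the subset that appears in this text, not the full ProSite pattern syntax.

```python
import re


def motif_to_regex(motif: str) -> str:
    """Translate shorthand such as 'CXCX(5)GX(2)C' into a regex string.

    X -> '.', X(n) -> '.{n}', X(a-b) -> '.{a,b}'. A typographic minus
    sign (U+2212) in a range is accepted as a hyphen.
    """
    motif = motif.replace("\u2212", "-")

    def expand(match: re.Match) -> str:
        low, high = match.group(1), match.group(2)
        if low is None:
            return "."                      # bare X
        if high is None:
            return ".{%s}" % low            # X(n)
        return ".{%s,%s}" % (low, high)     # X(a-b)

    return re.sub(r"X(?:\((\d+)(?:-(\d+))?\))?", expand, motif)


def find_motif(motif: str, protein: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping hit."""
    pattern = re.compile(motif_to_regex(motif))
    return [(m.start(), m.group()) for m in pattern.finditer(protein.upper())]


egf_like = "CXCX(5)GX(2)C"
lcv_motif = "KCX(2)HX(0\u22121)KX(4\u22125)RC"

print(motif_to_regex(egf_like))    # C.C.{5}G.{2}C
print(motif_to_regex(lcv_motif))   # KC.{2}H.{0,1}K.{4,5}RC

# Hypothetical toy peptides, not VESA1 sequences.
print(find_motif(egf_like, "AACACDEFGHGKLCAA"))
print(find_motif(lcv_motif, "GGKCAAHKAAAARCGG"))
```

A match to so short a pattern is only suggestive; on its own it does not establish that a domain is present or functional.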
Transcribed ves genes are modified through inter- and intrachromosomal gene conversion
Babesia bovis has a relatively small genome, possessing only four chromosomes approximately 1.4, 2.0, 2.8 and 3.2 Mbp in size (Ray et al., 1992; Jones et al., 1997). In this study, a size polymorphism of chromosome 4 was observed among the clonal lines, probably due to instability during in vitro culture (Fig. 3C). To determine which chromosome contained the C9.1 LAT, oNt probes were identified that annealed to the actively transcribed C9.1ves1α and -1β genes, or to flanking sequences (Fig. 3A). Oligonucleotide BAK72, from a region 3′ to the C9.1ves1α gene, anneals to a single band on Southern blots of B. bovis gDNAs (Fig. 3B), and was used to unambiguously detect the LAT. On a pulsed-field gel blot of B. bovis chromosomes, BAK72 clearly hybridizes only to chromosome 1 in each of several clonal lines examined (Fig. 3C). In contrast, oNt probes BAK52, BAK18 and BAK80, which are found within the LAT-resident genes, each hybridizes on gDNA Southern blots to two bands in the C9.1 line. In each case, one band matches the structure of the LAT, whereas the other corresponds to a unique donor gene (Fig. 3A and B). When used to sequentially probe the same chromosomal blot, each of these oNts also hybridized with chromosome 1, confirming this location of the LAT. The second band hybridizing with BAK52 was also present on chromosome 1, whereas those hybridizing with BAK18 and BAK80 were on chromosomes 3 and 2 respectively (Fig. 3C). Further, BAK18 hybridized with two EcoRI bands only in the C9.1 line where the LAT sequences were identified, but not in several related isogenic lines (Fig. 3B). Instead, all lines hybridized with a single size-constant band > 10 kb in size. BAK52 hybridized with two bands in C9.1 and its B9, C8 and H10 progeny, but not its MO7 progenitor. However, the band associated with the LAT in C9.1 was size-polymorphic in each of the other parasite lines, indicating variability. BAK80 similarly hybridized with the B9 (Fig. 3B), C8 and H10 (data not shown) progeny, but not the MO7 progenitor or CD7 sibling lines.

Inter- and intrachromosomal segmental gene conversion events contribute to the creation of novel ves genes. A. Schematic representation of the LAT in the C9.1 clonal parasite line. All available sequences are not shown. Noted are the positions of the oNt probes used in (B) and (C). Other labelling is as in Fig. 1. B. Southern blots demonstrating copy number and locus specificity of oNt probes. DNAs include: gDNAs from the B. bovis clonal lines C9.1, B9, C8, H10, CD7 and MO7; C9.1 gDNA phagemid clones 6-1 and 5-2-2; and C9.1 ves1α cDNA, p9.6.2 (Fig. S2). DNAs were digested with HincII or EcoRI, and probed sequentially with BAK72 and BAK52 (left panels) or BAK18 and BAK80 (right panels). Arrows highlight C9.1-unique fragments containing duplicated sequences corresponding to either the expressed C9.1ves1α gene (BAK18 and BAK52) or the actively transcribed C9.1ves1β gene (BAK80). C. Pulsed-field gel analysis of whole B. bovis chromosomes. The lines C9.1, B9, C8, H10, MO7 and CD7 were analysed. The four B. bovis chromosomes are indicated, as are the H. wingei (H.w) and S. cerevisiae (S.c) chromosome markers. The gel was ethidium bromide stained (EtBr), then blotted and sequentially probed with the indicated oNt sequences under the same stringency as in (B).
In addition to providing validation of the structure and location of the LAT, the Southern blot hybridization experiments suggested that the locus from which ves sequences are transcribed is quite unstable and is composed of sequences duplicated from more stable loci, common to the related clonal lines (Figs 1 and 3). This result, which is consistent with observations made upon the initial identification of the ves1α gene (Allred et al., 2000), was suggestive of a gene conversion-like mechanism. To determine whether this interpretation was correct, loci which had donated some of the duplicated sequences were characterized. To do this, LAT-associated oNt probes were used to screen cosmid libraries of B. bovis C9.1 line gDNA. Probes which hybridized with only two bands, including an apparently stable locus, were chosen from sequences within the actively transcribed C9.1ves1α gene (Fig. 4). In contrast to the unsuccessful attempts to recover the LAT, this approach enabled the ready recovery of donor genomic loci from the cosmid libraries. The structures of the donor loci were validated relative to the genome by Southern blotting (Fig. 4B), and sequenced. Cosmid 1E10 contained the locus from which BAK18 originated. This locus resembles the LAT, containing the full-length BAK18 donor ves1α gene, ves1αC, divergently apposed to what may be a ves1β pseudogene missing extreme 3′ sequences (ψ-ves1βD in Fig. 4A). Alignment of its sequence to the C9.1ves1β gene and corresponding cDNA, clone 5-1.B73 (Fig. S2), identified 11 potential introns in ψ-ves1βD, with remarkably conserved exon/intron boundaries, although only intron 1 is highly conserved in its location, sequence and length. However, despite obtaining 9.4 kb of sequence downstream of the putative ψ-ves1βD ATG start codon, sequences encoding the ves1β CTC, TM and cytoplasmic domains could not be found in this gene. Surprisingly, the DA32 and DA51 sequences were found to originate from two divergent, closely juxtaposed, full-length ves1α genes (ves1αA and ves1αB, respectively) within the same donor locus. This entire locus was recovered from each of two libraries, in cosmids S621 and M5.1 (Fig. 4A). Gene-type specific tandem repeats and blocks such as those present downstream of the LAT-associated genes (Fig. 1C) were not found associated with either of these donor loci.

The organization of ves sequence donor genes resembles that of the LAT. A. Maps of the C9.1ves1α gene from the LAT and two donor loci. The DA32/DA51 donor locus in cosmids S621 and M5.1, and the BAK18 donor locus in cosmid 1E10 are shown. Subclones from which the completed cosmid sequences were derived are depicted below each locus; the names signify cloning site (K, KpnI; N, NsiI; X, XbaI) and size in kb. The orientation of the ves ORF in each gene is noted by horizontal arrows. In ves1α genes, open boxes represent exon 1; black boxes, exon 2; and hatched boxes, exon 3. The predicted 12 exons of the ψ-ves1βD gene are shown as graduated-grey boxes. The sizes of IRs are shown in bp. Vertical arrows, oNts DA32, DA51, BAK18 and BAK6 (used in Fig. 5B); K, KpnI. Sequences outside the line breaks in the cosmid maps have not been determined. B. Southern blots demonstrating duplication of sequences into the LAT. gDNAs from the C9.1, B9, H10, MO7 and CD7 clonal lines, and DNAs from the p9.6.2 cDNA and cosmid M5.1 were digested with KpnI and probed with DA32 and DA51 sequentially, or with BAK18 (different blot). Arrows: C9.1-unique fragments containing the duplicated oNt sequences and corresponding to the expressed C9.1ves1α gene.
The genomic organization of the ves multigene family frequently resembles the LAT
The divergent organization of ves genes within the LAT was mirrored in the two cloned donor loci. To ask how consistent this phenomenon is, two types of experiments were performed. First, the oNt probe BAK6, derived from the LAT IR (Fig. 4A), was used to screen B. bovis C9.1 gDNA EcoRI fragment libraries. Based on signal intensities, BAK6 hybridizes in all lines tested to one band of ≈ 3.1 kb, and at least two each of ≈ 1.8 kb and ≈ 4.0 kb (Fig. 5B). In the C9.1 line, an additional band of ≈ 1.8 kb is present, corresponding to the LAT (Fig. 5A). This screen yielded three sequences of ≈ 1.8 kb, two of which were essentially identical (represented by phagemid 6-1, Fig. S2) and corresponded to the LAT, and the distinct phagemid 15-1 clone. One sequence each of the ≈ 3.1 kb (phagemid 1-1) and ≈ 4.0 kb (phagemid 10-1) segments was also recovered. In each case, the sequences obtained represented two ves gene fragments oriented as a divergent pair. Based on similarities to known ves1α and ves1β genes, each locus appears to contain a ves1α (ves1αF, -H and -J) and a ves1β (ves1βG, -I and -K) gene fragment (Fig. 5A). In the second approach all available ves sequences were aligned, and used to design ves1α-specific (BAK74 and BAK75) and ves1β-specific (BAK73) oNt probes (Fig. 5A). When used to sequentially probe the same Southern blot of EcoRI-digested C9.1 gDNA, each of these sequences hybridized to multiple fragments (Fig. 5C). However, the ves1β-specific BAK73 sequence appears to be present in considerably fewer copies. It should be noted that our assumptions about gene-type specificity and conservation for all three oNts are based on the sequences of relatively few genes, and may be somewhat inaccurate. Nonetheless, as these analyses were performed at high stringencies, it is clear that the ves multigene family contains a large number of ves1β genes. On quantitative dot blots of gDNA, these probes provided an estimate of ≥ 350 ves1α genes and ≥ 80 ves1β genes (data not shown). Further, a large proportion of fragments of all sizes appear to hybridize with both probe types, suggesting a close association of their sequences. Notably, many of these shared fragments also hybridize to BAK76 (Fig. 5A), a degenerate oNt representing the intergenic inverted repeats discussed below.

The ves multigene family consists of a large number of ves1α and ves1β genes, many of which are divergently organized. A. Maps of cosmid and phagemid clones containing closely juxtaposed ves sequences. All available cosmid 1E10 and C9.1 LAT sequences are not shown, as indicated by arrowheads at each end. Thick grey lines, ves1α exons; thick black lines, ves1β exons; thin lines, introns; dotted lines, IRs; and dashed lines, other non-coding sequences. The orientations of ves ORFs are indicated by horizontal arrows; vertical lines represent start and stop codons. All loci are oriented the same with regard to IR asymmetry. IR lengths are indicated in bp, and palindromic inverted repeat regions within are denoted by solid black triangles. Open triangles, the BAK6 oNt used in (B). Also noted are the BAK73–76 oNts used in (C), and EcoRI (E) restriction sites. B. Southern blot of EcoRI-digested MO7, CD7, B9 and C9.1 line gDNAs, p9.6.2 cDNA, and cosmid S621, probed with BAK6. The > 20 kb fragment representing the DA32/DA51 donor locus in cosmid S621 is missing in the MO7 and CD7 lines, likely due to a deletional event during in vitro culturing. No hybridization is detected in p9.6.2, as this construct lacks the 5′-most 224 bp of the UTR. C. Southern blot of EcoRI-digested C9.1 line gDNA probed sequentially with oNts BAK-73, -74, -75 and -76 under stringent conditions. A significant proportion of the reactive fragments appears to be recognized by all the probes.
A high degree of organizational and sequence similarity exists among the IRs from the various cloned loci above and that of the LAT (Fig. 6). Two highly conserved, quasipalindromic segments, separated by an average of 110 nucleotides, are located ≈ 65 bp 5′ to the start codons of the genes in each pair (Figs 5 and 6). Embedded within each of these segments is an inverted repeat region (Fig. 6, thick and thin lines). A significant feature of ves IRs is that, although quasipalindromic, they possess a highly conserved asymmetry (Fig. 6). That is, sequences located close to one side of the IR are quite distinct from those found on the opposite end. Importantly, this asymmetry is not dependent upon the ves family branch to which the flanking genes belong, as it is maintained in IRs separating α/β (as in the LAT), α/α (as in cosmid S621), or β/β (data not shown) ves gene pairs. The importance of this curious ‘sidedness’ is not yet clear, but its conservation implies significance.
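An inverted repeat of the kind described here is a stretch of sequence followed, after a spacer, by its own reverse complement. The sketch below shows one simple way to locate exact inverted repeats in a DNA string. It rests on stated assumptions: the names `reverse_complement` and `find_inverted_repeats`, the arm length, the spacer limit and the toy sequence are all hypothetical, and only perfect matches are reported, whereas the IR elements described above are quasipalindromic and would need a mismatch-tolerant search.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq: str, arm: int = 6, max_spacer: int = 150):
    """Return (left_start, right_start, left_arm) for exact inverted repeats.

    For each window of length `arm`, search downstream (arms may not
    overlap) for its reverse complement, allowing at most `max_spacer`
    bases between the two arms.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - arm + 1):
        left = seq[i:i + arm]
        target = reverse_complement(left)
        first = i + arm                                  # earliest right-arm start
        last = min(len(seq) - arm, first + max_spacer)   # latest right-arm start
        j = seq.find(target, first, last + arm)
        while j != -1:
            hits.append((i, j, left))
            j = seq.find(target, j + 1, last + arm)
    return hits


# Hypothetical toy sequence: GACCGT, a 5-base spacer, then ACGGTC.
toy = "TT" + "GACCGT" + "AAAAA" + "ACGGTC" + "TT"
print(find_inverted_repeats(toy))
```

Because the two arms are reverse complements, the same element reads identically from either strand, which is why such a segment can be recognized regardless of the orientation of the flanking genes.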
Discussion
The actively transcribed C9.1ves1α gene was previously identified and confirmed to encode the 128 kDa 1a subunit of the VESA1 antigen expressed on the surface of erythrocytes infected with the C9.1 clonal line (Allred et al., 2000). In the present study, the locus from which C9.1ves1α is transcribed was investigated in detail. Further, with the aim of testing our previous hypothesis regarding the involvement of segmental gene conversion in B. bovis antigenic variation (Allred et al., 2000), and to begin investigating the mechanism(s) controlling ves gene expression, we also identified and characterized genomic loci from which unique segments of sequences were duplicated into the active C9.1ves1α gene. Three distinct, full-length ves1α genes were identified, and each confirmed to have donated at least one small, unique segment of sequences to the LAT-associated C9.1ves1α gene. As one of these loci localized to chromosome 3, and additional sequences were shown to be duplicated from chromosomes 1 and 2, we conclude that ves-associated antigenic variation in the asexual erythrocytic stages occurs, at least in part, through segmental gene conversion. As the LAT is found on chromosome 1, we also conclude that these gene conversion events can occur both inter- and intrachromosomally.
Our results revealed that the LAT-resident C9.1ves1α gene is very closely associated with another novel, yet related gene, which we have named C9.1ves1β. C9.1ves1β clearly represents a subfamily of ves genes, whose multi-intronic gene structure drastically contrasts with the 5′-proximal, two-intron structure characteristic of ves1α genes. Importantly, despite sharing at the polypeptide level several structural domains characteristic of ves gene products, as well as many shorter ves signature sequences, members within each subfamily may share additional distinct structural domains and multiple, short, characteristic sequence segments that are not shared with members of the other subfamily (Fig. 2 and Fig. S3). These gross structural and fine sequence similarities between ves1α and ves1β, but stark difference in exon/intron structure, raise the intriguing possibility that the ves1α branch may have been derived from an ancestral ves1β gene transcript through its reverse transcription and integration into the genome as a retrogene (Bannert and Kurth, 2004; Buzdin, 2004). The feasibility of this scenario in the evolution of the ves multigene family may be clarified somewhat upon acquisition of the full B. bovis genome sequence.
Significantly, our data show that the LAT-resident C9.1ves1β gene, like C9.1ves1α, is actively transcribed by the majority population of the C9.1 parasite line, and at comparable levels. This conclusion is supported by at least two observations: (i) the nearly full-length ves1β cDNA, clone 5-1.B73, along with several other C9.1ves1β cDNAs varying only slightly from one another and clone 5-1.B73 (corresponding polypeptides are shown in Fig. S3) were retrieved from the same C9.1 cDNA library from which the ves1α p9.6.2 cDNA had been derived (Allred et al., 2000). These ves1β cDNAs were recovered at a frequency similar to that of p9.6.2-like ves1α cDNAs, i.e. about one in 1000–2000 pfu; (ii) on gDNA Southern blots, all oNt sequences derived from clone 5-1.B73 hybridize with equivalent signal intensities to both the duplicated, transcription-associated gene fragments and size-stable, common donor genes (Fig. 1A, BAK80 and BAK111 panels; data not shown). These analyses also indicate that, like the ves1α gene, the LAT-associated C9.1ves1β gene is undergoing rapid, progressive sequence modification in situ, most likely also through segmental gene conversion.
A notable feature of the C9.1 LAT is the close apposition of two actively transcribed genes separated by a narrow IR of only 433 bp. When the ves1α 5′UTR was analysed by 5′RACE (Allred et al., 2000), the apparent transcription initiation site mapped to what would be the first nucleotide 3′ of the upstream-most inverted repeat segment in the C9.1ves1β gene (6-1 sequence in Fig. 6, asterisk). If this situation were to also hold for C9.1ves1β, the 5′UTRs of the apposing genes would overlap by ≈ 110 bp, suggesting a potential mechanism for co-ordinate control of their transcription. However, no obvious promoter sequences for the individual genes are detected in this region, suggesting that the respective regulatory elements may be located further upstream. Alternatively, a non-conventional, bidirectional promoter may exist in the IR. Intriguingly, preliminary Western blot analyses utilizing an antiserum against a highly conserved ves1β peptide sequence indicate that the ves1β product is also actively expressed (Y.-P. Xiao, B. Al-Khedery, and D.R. Allred, unpubl. data). Further, the overall deduced aa composition of the C9.1ves1β-encoded polypeptide mimics that of biochemically established values for the gel-purified, 113 kDa, C9.1 VESA1b subunit (O’Connor et al., 1997). Together with all existing evidence for the parallel expression and antigenic variation of the VESA1a and -1b subunits at the surface of B. bovis-infected erythrocytes (Allred et al., 1994), these observations lead us to speculate that the ves1β-encoded polypeptide corresponds to the VESA1b subunit. As the VESA1 holoprotein is a heterodimer of both 1a and 1b subunits, this tightly divergent association of the respective genes could help to ensure regulated co-transcription of these genes, and perhaps subunit stoichiometry and developmental timing. 
Alternatively, this organization may have evolved to facilitate extensive segmental gene conversion, perhaps through the localized adoption of higher order structure, resulting in enhanced strand invasion susceptibility. Considering the large number of potential sequence donors and the variable lengths of donated sequences, the number of possible novel forms of ves genes that could be created would be enormous. These possibilities are currently being further investigated.
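The scale of this combinatorial potential can be illustrated with a toy model. The following Python sketch is a deliberately simplified assumption rather than a description of the actual mechanism: genes are treated as equal-length, pre-aligned strings, conversion events copy a segment from a randomly chosen donor into the recipient, and the donors themselves are never modified. All function names and parameters are hypothetical.

```python
import random


def gene_convert(recipient: str, donor: str, start: int, length: int) -> str:
    """Copy donor[start:start+length] into the same position of recipient.

    Assumes the two sequences are already aligned and of equal length.
    """
    if len(recipient) != len(donor):
        raise ValueError("toy model requires equal-length, aligned sequences")
    end = min(start + length, len(recipient))
    return recipient[:start] + donor[start:end] + recipient[end:]


def simulate(recipient: str, donors: list, events: int, max_len: int, seed: int = 0) -> str:
    """Apply successive conversion events with random donor, start and length."""
    rng = random.Random(seed)
    gene = recipient
    for _ in range(events):
        donor = rng.choice(donors)
        length = rng.randint(1, max_len)      # variable length of donated segment
        start = rng.randrange(len(gene))      # random position in the recipient
        gene = gene_convert(gene, donor, start, length)
    return gene


def mosaic_combinations(n_segments: int, n_donors: int) -> int:
    """Number of mosaics if a gene is divided into n_segments fixed blocks and
    each block is either retained or taken from any one of n_donors donors."""
    return (n_donors + 1) ** n_segments
```

Even under these restrictive assumptions, ten exchangeable blocks and only three donors give `mosaic_combinations(10, 3)`, i.e. 4^10 = 1 048 576 combinations; allowing variable segment boundaries, as observed here, would increase the number further.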
With one exception, all ves genes identified so far, including the experimentally verified donor genes, have been found paired tightly head-to-head, similar to the LAT. Additionally, our analyses with gene type- and IR-specific oNt probes suggest that the number of ves1β sequences in the genome could be approximately four times that of ves1α, with a significant proportion of these genes organized as divergent pairs. Although genes of the same type may be paired, there appears to be a higher prevalence of α/β gene pairs. This finding may be coincidental, however, arising from a potential bias of the low-abundance BAK6 oNt used here to recover the majority of these loci. Regardless, from the data presented here, it can be concluded that at least some donor ves genes are tightly adjoined, similarly to the transcriptionally active, sequence recipient variant genes, and that such gene pairs do not always possess the same gene-type composition as the LAT. This being said, it is not yet clear if genes can be duplicated in toto; if gene conversion events are limited to head-to-head arranged ves genes; or whether all ves genes, including pseudogenes, can participate in sequence donations to an active LAT, irrespective of their chromosome location. Segmental gene conversion from pseudogenes is known to contribute to the generation of novel, variant genes in several microbial systems, including Trypanosoma equiperdum and Trypanosoma brucei (Roth et al., 1989; Kamper and Barbet, 1992). The prevalence of ves pseudogenes in the B. bovis genome and the extent of their contribution to antigenic variation are not yet evident. Although no sequences unique to the ψ-ves1βD pseudogene (Fig. 4A) were detectable in the LAT-associated C9.1ves1β gene, this does not necessarily rule out its participation in the creation of the active gene. Within members of both gene families, several highly conserved sequence segments are present, whose origins cannot be traced.
Furthermore, due to the excessively mosaic nature of ves genes, even minimally conserved sequences can be shared among several genes.
Our investigations at the individual chromosome level establish that small segments of sequences are copied into both of the LAT-associated genes from structurally stable ves loci located on the same or different chromosomes. The exact location of the C9.1 LAT on chromosome 1 has not yet been established. However, the presence of three different arrays of imperfect tandem repeats downstream of the C9.1ves1α gene (Fig. 1C) is suggestive of a subtelomeric location, as such elements are often found near the telomeres of many protozoal parasites, including P. falciparum (reviewed in Scherf et al., 2001; Wickstead et al., 2004). Should these similarly represent subtelomeric tandem repeats in B. bovis, the position of the LAT would be consistent with that of the predominantly transcribed var genes of P. falciparum (Hernandez-Rivas et al., 1997) and the VSG expression sites of Trypanosoma spp. (Rudenko et al., 1998). Because we currently lack information regarding transcriptionally active ves loci in the other isogenic clonal lines, it cannot be established at this time whether the LAT described here represents a specialized site to which ves gene transcription is limited. If so, it may be only one of several such sites, as is the case with the ≈ 20 potential, telomere-associated VSG expression sites in T. brucei (reviewed in Borst and Ulbert, 2001). The presence of the gene type-specific blocks α1/α2 and β1/β2 at the respective ends of the LAT is a curious observation, likely reflective of unusual outcomes of prior recombination events. The detection of such blocks downstream of two other distinct ves1β gene fragments (data not shown), but a failure to find such sequences associated with any of the well-characterized donor loci, may indicate the existence of additional specialized ves transcription sites in the B. bovis genome.
On the other hand, conservation of the divergent gene organization and IR structure between the LAT and other ves loci may imply that many other loci possess equivalent potential to serve in situ as sites of ves transcription. We therefore hypothesize that, with some low frequency, epigenetic in situ switches to other ves gene pairs are possible. Should such switches occur, the newly activated genes would become the LAT and the currently active locus presumably would be silenced (Fig. 7). To date there is no direct evidence for or against this possibility.

Fig. 7. Proposed mechanisms for the creation and expression of novel ves genes. We hypothesize that several mechanisms may operate in B. bovis antigenic variation. Segmental gene conversion (A) is demonstrated in this study; in situ switching (B) is inferred from the organization of the ves multigene family. A. The sequence of an actively expressed α/β ves gene pair may be progressively altered by the reiterative replacement of sequence segments with novel counterparts from divergent ves donor genes during several gene conversion events. This likely occurs during successive rounds of DNA replication. The donor loci, which themselves remain unaltered, may reside on the same or different (indicated by double slashes) chromosomes as the active LAT. For clarity, only interactions of the ves1α gene in the active LAT are shown. B. Occasionally, in situ switching to the expression of sequences at a different locus might also occur, under epigenetic control. In this scenario, the currently active LAT would be silenced, and a locus containing another gene pair activated, becoming a new LAT. The new locus could correspond to any α/β ves gene pair, or it might be an alternative specialized ‘expression site’.
Clearly, there is much to be learned about the various interacting mechanisms that lead to the rearrangements and transcriptional control of the ves multigene family in B. bovis. From the studies presented here we now know that inter- and intrachromosomal segmental gene conversion is, at least in part, responsible for the expression of novel VESA1a and perhaps VESA1b polypeptides. Even though structural and functional constraints may practically limit the isoforms expressed by viable parasites, the potential VESA1 repertoire is likely to be very large, facilitating the persistence of B. bovis for many years in the immune host.
Experimental procedures
Parasites
A description of the derivations of the isogenic, antigenically variant, clonal B. bovis lines used in this paper is provided in Fig. S1. All lines were propagated in vitro as described (Allred et al., 1994). For gDNA isolation and chromosome plug preparation, culture parasitemias were increased several fold by ‘dilution-enrichment’, as described (O’Connor et al., 1997). Bovine leucocytes were removed from the cultures essentially as described (Ambrosio et al., 1986), except that the blood was not lysed prior to passage through Whatman CF-11 cellulose (Whatman) columns.
Oligonucleotides
The sequences of oNts used in this study are provided in Table S1.
Southern blot analysis
Genomic DNA was isolated essentially as described (Tripp et al., 1989); whenever possible, the final DNA was recovered by spooling. Restriction digested gDNAs (1–2 µg) and plasmid DNAs (0.5–1.0 ng) were fractionated on 0.7% or 0.8% agarose gels, denatured, and neutralized following standard procedures (Sambrook and Russell, 2001) prior to transfer to Hybond N+ membranes (Amersham Pharmacia). For transfer of intact chromosomes, PFGE gels were stained with 0.5 µg ml⁻¹ ethidium bromide, destained for 1.5 h, and the DNA was nicked by UV-exposure for 3 min. Transfer of denatured and neutralized DNA was carried out in 20× SSC for 36–42 h. ONts were end-labelled with γ-[32P]ATP as described (Reddy et al., 1991). Prehybridization, hybridization and washes were conducted as described previously (Al-Khedery et al., 1999).
Analysis of B. bovis chromosomes
Chromosome blocks were prepared essentially as described (Ray et al., 1992), at a final concentration of 2.5 × 10⁸ infected erythrocytes per millilitre. Prior to their use in PFGE, blocks were equilibrated in 1× TAE. Intact chromosomes were separated by PFGE using a CHEF-DRII gel apparatus (Bio-Rad). Agarose plugs from B. bovis and the yeast chromosome size markers, Hansenula wingei (1.05–3.13 Mb) and Saccharomyces cerevisiae (0.225–2.2 Mb), were embedded into a 14.5 cm × 19.0 cm, 0.7% chromosomal grade agarose (Bio-Rad) gel. Electrophoresis was carried out in 1× TAE at 14°C in three steps, under the following conditions: 1.5 V cm⁻¹ with a pulse ramp of 20–50 min, for 72 h; 2.4 V cm⁻¹ with a pulse ramp of 10–20 min, for 48 h; and 4.2 V cm⁻¹ with 5 min pulses, for 24 h.
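For convenience, the three-step electrophoresis programme can be written down as a small data structure from which the total run time follows directly. The sketch below simply restates the conditions given above; the class and function names are hypothetical, and the volt-hour figure is a derived convenience quantity rather than a parameter reported in this study.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PfgeStep:
    field_v_per_cm: float               # field strength, V/cm
    pulse_ramp_min: Tuple[float, float]  # initial and final pulse time, min
    duration_h: float                   # step duration, h


# The three steps described in the text (1x TAE, 14 degrees C).
PROGRAM = [
    PfgeStep(1.5, (20, 50), 72),
    PfgeStep(2.4, (10, 20), 48),
    PfgeStep(4.2, (5, 5), 24),   # constant 5 min pulses
]


def total_run_hours(program) -> float:
    """Total electrophoresis time across all steps, in hours."""
    return sum(step.duration_h for step in program)


def volt_hours_per_cm(program) -> float:
    """Sum of field strength x duration over all steps, in V h per cm."""
    return sum(step.field_v_per_cm * step.duration_h for step in program)
```

With these values, `total_run_hours(PROGRAM)` gives 144 h, i.e. a 6-day run.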
Construction and screening of B. bovis C9.1 libraries
The C9.1 λTriplEx cDNA library used in this study has previously been described (Allred et al., 2000). Additionally, two cosmid libraries were constructed of B. bovis C9.1 line gDNA in the SuperCos1 vector, and four libraries were constructed of size-selected, EcoRI-digested B. bovis C9.1 line gDNA, in λZapII. The details of library construction and screening with oNt probes are provided in Appendix S1.
DNA sequencing and analysis
Most constructs were sequenced by primer walking. Reactions were analysed by the University of Florida Interdisciplinary Center for Biotechnology Research DNA Sequencing Core Laboratory. To obtain the sequence within the repetitive elements downstream of the C9.1ves1β gene, a transposable element was inserted at random into phagemid 3-2-1, using the EZ::TN <KAN-2> Insertion Kit (Epicentre Biotechnologies), and transformed into E. coli Stbl2 (Invitrogen). Clones were screened for single insertions and transposon location by restriction mapping, then sequenced with forward and reverse transposon-specific primers. Sequence analysis was carried out using Genetics Computer Group (GCG) Wisconsin Package 10.3 software (Devereux et al., 1984), SeqWeb v2.1 (Accelrys) and TMHMM (Sonnhammer et al., 1998). Multiple sequence alignments were generated and manually optimized using MACAW v2.0.5 (Greg Schuler, National Center for Biotechnology Information).
Nucleotide sequence accession numbers
Sequences reported in this manuscript have been deposited in the GenBank database with the following accession numbers: AY279553, cosmid 1E10; AY279554, cosmid S621; AY279555-AY279559, DQ267448-DQ267450, and DQ267457-DQ267459, gDNA phagemids; DQ267447 and DQ267451-DQ267456, cDNA clones; DQ267460, C9.1ves1β gene; DQ267461, C9.1 LAT.
Acknowledgements
The authors thank Julie (Crabtree) Bokkor for construction of the cosmid libraries, and Jennifer Long, Laura Stockman, Lilian Waiboci and Christina East for animal and parasite handling. This work was supported by Grants #R01 AI055864 (N.I.A.I.D), #97-35204-4768 and 2001-35204-10144 (USDA).