A Population Genetic Study of the Cryptosporidium parvum Human Genotype Parasites
Two major genotypes of Cryptosporidium parvum have been found to infect humans: (i) the C. parvum bovine genotype (also known as genotype 2), which infects humans, ruminants, and some other mammalian hosts; and (ii) the C. parvum human genotype (also known as genotype 1), which primarily infects humans. Previous studies have suggested the existence of a clonal population structure of C. parvum parasites, based on the observation of a wide distribution of human and bovine genotypes in many regions of the world, and a lack of recombinant genotypes [2,4]. This genetic segregation at the genotype level, however, could represent reproductive segregation at the species level, because the species structure of Cryptosporidium parasites is still unclear [12].
Earlier, we characterized the C. parvum thrombospondin-related protein 2 (TRAP-C2) and small subunit ribosomal RNA (SSU rRNA) genes and one microsatellite sequence, and recognized some genetic variation within the C. parvum human genotype [5,7,10,11]. Recently, a study has shown extensive intra-genotype heterogeneity in a 60-kDa glycoprotein (GP60, also known as gp15/45/60) gene [6]. In this study, we systematically sequence-characterized 62 C. parvum human genotype isolates at the TRAP-C2, polythreonine protein (Poly-T), SSU rRNA, 70-kDa heat shock protein (HSP70) and GP60 genes, and one microsatellite locus (SSR-1) to validate our previous observations on the existence of multiple subgenotypes and to establish an evolutionary linkage among these genes. The data should be useful for understanding the population structure of C. parvum.
MATERIALS AND METHODS
Sixty-two samples containing oocysts of the C. parvum human genotype parasites were obtained from infected individuals from different geographic regions (Table 1) and stored at 4°C in 2.5% potassium dichromate solution. Oocysts and DNA were purified as described before [7]. Using published primers, a multi-locus PCR was carried out to amplify the C. parvum human genotype TRAP-C2 [7], Poly-T [3], SSU rRNA [10], and HSP70 [8] genes, and one microsatellite sequence [1]. To amplify the GP60 gene, a two-step nested PCR protocol was developed, using primers complementary to the conserved region of published nucleotide sequences [6]. For primary PCR, a fragment of ∼800 bp was amplified using primers 5′-ATAGTCTCCGCTGTATTC-3′ and 5′-GATGCTGGTTCCTCTGC-3′. For secondary PCR, a fragment of ∼500 bp was amplified using 2.5 μl of the primary PCR reaction and primers 5′-TCCGCTGTATTCTCAGCC-3′ and 5′-GCACCAAGATATAT-3′. PCR products were sequenced in both directions on an ABI 3100 automated sequencer using the BigDye™ Terminator (Perkin Elmer, Foster City, California). Sequence accuracy was confirmed by sequencing two separate PCR products of each isolate. Nucleotide sequences obtained from various isolates were aligned using the GCG program (Genetics Computer Group, Madison, Wisconsin). Phylogenetic trees were constructed using the TREECON program as described before [10]. The nucleotide sequences of the HSP70 and GP60 genes of various C. parvum human genotype isolates have been deposited in the GenBank database under accession nos. AF401503 to AF401507, AF402289 to AF402293, and AF403161 to AF403164.
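The nesting of the two GP60 forward primers can be checked directly from the sequences given above. The following is a minimal sketch, assuming that the shared stretch between the two forward primers reflects overlapping annealing sites on the same template strand; the function name `suffix_prefix_overlap` is hypothetical and used only for illustration.

```python
def suffix_prefix_overlap(outer: str, inner: str) -> int:
    """Length of the longest suffix of `outer` that is also a prefix of `inner`."""
    for k in range(min(len(outer), len(inner)), 0, -1):
        if outer[-k:] == inner[:k]:
            return k
    return 0


# Forward primers copied from the Materials and Methods text.
OUTER_FWD = "ATAGTCTCCGCTGTATTC"  # primary PCR
INNER_FWD = "TCCGCTGTATTCTCAGCC"  # secondary (nested) PCR

overlap = suffix_prefix_overlap(OUTER_FWD, INNER_FWD)
# Assumption: if the primers anneal to overlapping sites, the inner primer
# starts this many nucleotides downstream of the outer primer's 5' end.
offset = len(OUTER_FWD) - overlap

print(f"shared bases: {overlap}, inner primer offset: {offset} nt")
```

Under that assumption, the secondary forward primer sits only a few nucleotides inside the primary one, so most of the size difference between the ∼800 bp and ∼500 bp products would come from the reverse primers.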
Location | Year | Source | Sample | ID | SSU rRNA | TRAP-C2 | Poly-T | SSR-1 | HSP70 | GP60 |
---|---|---|---|---|---|---|---|---|---|---|
Milwaukee | 1993 | Waterborne outbreak | 83 | HM5 | A | A | A | A | A | A |
Milwaukee | 1993 | Waterborne outbreak | 95 | HM3 | A | A | A | A | A | A |
Milwaukee | 1993 | Waterborne outbreak | 121 | HM2 | A | A | A | A | A | A |
Milwaukee | 1993 | Waterborne outbreak | 238/344 | HM6 | A | A | A | A | A | |
Gainesville | 1995 | Waterborne outbreak | 118 | HFL6 | D | B | B | B | C | C |
Gainesville | 1995 | Waterborne outbreak | 94 | HFL5 | D | B | B | B | C | C |
Gainesville | 1995 | Waterborne outbreak | 345 | HFL2 | B | B | C | C | ||
Gainesville | 1995 | Waterborne outbreak | 347 | HFL4 | B | C | ||||
Nevada | 1994 | Waterborne outbreak | 119 | HCNV2 | A | A | A | A | A | A |
Nevada | 1994 | Waterborne outbreak | 120 | HCNV4 | A | A | A | A | A | |
Nevada | 1994 | Waterborne outbreak | 348/584 | HCNV3 | A | A | A | |||
GA Water Park | 1995 | Waterborne outbreak | 84 | HGA4 | A | A | A | A | A | |
GA Water Park | 1995 | Waterborne outbreak | 115 | HGA1 | A | A | A | A | A | A |
GA Water Park | 1995 | Waterborne outbreak | 123 | HGA2 | A | A | A | A | ||
Spokane, WA | 1997 | Foodborne outbreak | 179 | HWA2 | C | A | C | A | C | G |
Spokane, WA | 1997 | Foodborne outbreak | 180 | HWA3 | C | A | C | A | C | G |
Spokane, WA | 1997 | Foodborne outbreak | 181 | HWA4 | C | A | C | A | C | G |
Spokane, WA | 1997 | Foodborne outbreak | 182 | HWA5 | C | A | C | A | C | G |
Spokane, WA | 1997 | Foodborne outbreak | 183 | HWA6 | C | A | C | A | C | G |
Spokane, WA | 1997 | Foodborne outbreak | 184 | HWA7 | C | A | C | C | ||
Washington, DC | 1998 | Foodborne outbreak | 500 | HDC1 | A | A | A | A | A | A |
Washington, DC | 1998 | Foodborne outbreak | 501 | HDC2 | A | A | A | A | A | A |
Washington, DC | 1998 | Foodborne outbreak | 502 | HDC4 | A | A | A | A | A | A |
Washington, DC | 1998 | Foodborne outbreak | 503 | HDC5 | A | A | A | A | A | A |
Washington, DC | 1998 | Foodborne outbreak | 504 | HDC6 | A | A | A | A | ||
Washington, DC | 1998 | Foodborne outbreak | 505 | HDC7 | A | A | A | A | A | A |
Washington, DC | 1998 | Foodborne outbreak | 508 | HDC18 | A | A | A | A | A | |
Washington, DC | 1998 | Foodborne outbreak | 509 | HDC20 | A | A | A | A | A | A |
Washington, DC | 1998 | Foodborne outbreak | 506 | HDC12 | A | A | A | A | A | |
Washington, DC | 1998 | Foodborne outbreak | 507 | HDC16 | A | A | A | A | A | |
Washington, DC | 1998 | Foodborne outbreak | 510 | HDC25 | A | A | A | A | A | |
Texas | 1996 | Outbreak | 122 | HCTX8 | B | B | B | E | I | |
New Orleans | 1997 | AIDS | 164 | HNO6 | A | A | A | A | A | A |
New Orleans | 1997 | AIDS | 175 | HNO17 | A | A | A | A | A | A |
New Orleans | 1997 | AIDS | 177 | HNO19 | A | A | A | A | A | A |
New Orleans | 1997 | AIDS | 176 | HNO18 | A | B | B | B | E | H |
New Orleans | 1998 | AIDS | 298 | HNO22 | A | B | B | B | E | H |
New Orleans | 1998 | AIDS | 307 | HNO31 | A | B | B | B | E | H |
New Orleans | 1998 | AIDS | 308 | HNO32 | A | B | B | B | E | H |
New Orleans | 1998 | AIDS | 299 | HNO23 | A | B | B | F | F | |
New Orleans | 1997 | AIDS | 171 | HNO13 | A | B | B | B | E | H |
New Orleans | 1997 | AIDS | 161 | HNO3 | A | A | A | A | A | |
New Orleans | 1998 | AIDS | 296 | HNO20 | A | B | B | F | F | |
New Orleans | 1998 | AIDS | 301 | HNO25 | A | B | B | B | E | H |
New Orleans | 1998 | AIDS | 306 | HNO30 | A | A | A | A | ||
New Orleans | 1998 | AIDS | 666 | HNO35 | A | A | A | A | ||
New Orleans | 1998 | AIDS | 669 | HNO37 | B | B | E | H | ||
Atlanta | 1995 | Child | 116 | HGA5 | A | A | A | A | A | A |
Guatemala | 1997 | Child | 129 | HGM04 | A | A | A | |||
Guatemala | 1997 | Child | 132 | HGM07 | A | A | A | A | A | |
Guatemala | 1997 | Child | 127 | HGM02 | A | B | B | B | B | B |
Guatemala | 1997 | Child | 133 | HGM08 | A | B | B | B | B | |
Guatemala | 1997 | Child | 135 | HGM10 | A | B | B | B | B | |
Guatemala | 1997 | Child | 128 | HGM03 | A | B | B | B | B | |
Guatemala | 1997 | Sporadic case | 134 | HGM09 | A | B | B | B | E | B |
Guatemala | 1997 | Sporadic case | 136 | HGM11 | B | E | ||||
Kenya | 1998 | Sporadic case | 497 | KenF1 | B | D | D | |||
Kenya | 1998 | Sporadic case | 498 | KenF4 | B | D | D | |||
Texas | 1997 | Sporadic case | 97 | HCTX | B | B | B | B | E | I |
Australia | 1998 | Sporadic case | 419 | H15 | B | B | B | E | E | |
Australia | 1998 | Sporadic case | 420 | H17 | B | B | B | E | E | |
Australia | 1998 | Sporadic case | 421 | H21 | B | B | B | E | E |
- Blanks indicate that subgenotype information was not obtained.
RESULTS AND DISCUSSION
The extent of genetic variation within the human C. parvum isolates was assessed by multiple alignment of the nucleotide sequences obtained, followed by phylogenetic analyses. Multiple subgenotypes within the 62 isolates of C. parvum human genotype were observed at all six loci (TRAP-C2, Poly-T, SSU rRNA, HSP70, GP60, and the microsatellite locus) (Table 1).
Two distinct subgenotypes of the C. parvum human genotype were noticed in the 369 bp region of the TRAP-C2 gene. As described previously, the two subgenotypes differed (“T” or “C”) only at one position (nt 280) [6,7]. Similarly, two subgenotypes of C. parvum human genotype were also present at the microsatellite locus [1]. However, within the 318 bp region of Poly-T gene the C. parvum human genotype parasites displayed 3 distinct subgenotypes. A point mutation (“A” or “T”) was evident at position 172; subgenotypes B and C had “T” at this position, whereas subgenotype A had “A”. In addition, a deletion of three bp (“CAC”) from nt 206 to nt 208 was also noticed in the nucleotide sequences of subgenotype C isolates.
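The Poly-T calling rule described above can be written as a short decision function. This is a minimal sketch, assuming the input has already been aligned to the 318 bp region so that the 1-based positions in the text (172 and 206 to 208) apply directly and deletions appear as `-`; the function names and the toy sequences are hypothetical and are not real Poly-T sequences.

```python
def call_poly_t_subgenotype(aligned: str) -> str:
    """Call Poly-T subgenotype A, B, or C from an aligned 318-bp sequence.

    Assumption: `aligned` uses the same coordinates as the text, with '-'
    marking deleted bases.
    """
    base_172 = aligned[171].upper()           # position 172 (1-based)
    has_deletion = aligned[205:208] == "---"  # positions 206-208 (1-based)
    if base_172 == "A" and not has_deletion:
        return "A"
    if base_172 == "T":
        return "C" if has_deletion else "B"
    return "unknown"


def make_toy_sequence(base_172: str, deleted: bool) -> str:
    """Build a synthetic placeholder sequence carrying only the two diagnostic sites."""
    seq = ["G"] * 318
    seq[171] = base_172
    seq[205:208] = list("---" if deleted else "CAC")
    return "".join(seq)


for base, deleted in [("A", False), ("T", False), ("T", True)]:
    print(base, deleted, call_poly_t_subgenotype(make_toy_sequence(base, deleted)))
```

Any combination not described in the text (for example, "A" at position 172 together with the deletion) is reported as "unknown" rather than forced into one of the three subgenotypes.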
The C. parvum human genotype parasites exhibited 4 distinct subgenotypes within the ∼825 bp region of the SSU rRNA gene, as judged by direct sequencing of PCR products. These subgenotypes differed from each other in the number of T-repeats in the predominant sequence. Subgenotypes A, B, C and D had 11, 10, 8 and 6 Ts in the predominant sequence, respectively. In addition, these subgenotypes also differed from each other by point mutations in the predominant sequence. The accuracy of subgenotyping using the SSU rRNA sequences, however, was compromised by the presence of heterogeneous copies of the gene within a single isolate [11]. This heterogeneity became more apparent when sequencing was done on cloned PCR products from a single isolate.
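The T-repeat rule can be illustrated with a few lines of code. This is a minimal sketch, assuming the diagnostic T-repeat is the longest uninterrupted run of T in the sequenced region; it ignores the additional point mutations and the within-isolate copy heterogeneity noted above, and the function names and toy sequences are hypothetical.

```python
import re

# Number of Ts in the predominant sequence for each subgenotype (from the text).
T_RUN_TO_SUBGENOTYPE = {11: "A", 10: "B", 8: "C", 6: "D"}


def longest_t_run(seq: str) -> int:
    """Length of the longest uninterrupted run of T in `seq`."""
    runs = re.findall(r"T+", seq.upper())
    return max((len(run) for run in runs), default=0)


def call_ssu_subgenotype(seq: str) -> str:
    """Map the longest T run to a subgenotype, or 'unknown' if it matches none."""
    return T_RUN_TO_SUBGENOTYPE.get(longest_t_run(seq), "unknown")


# Synthetic placeholder sequences, not real SSU rRNA sequences.
for n_t in (11, 10, 9, 8, 6):
    toy = "ACG" + "T" * n_t + "GCA"
    print(n_t, call_ssu_subgenotype(toy))
```

A run length that matches none of the four reported values is left as "unknown", which is the conservative choice given the heterogeneous gene copies described in the text.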
Higher genetic polymorphism within the C. parvum human genotype was evident in the HSP70 gene. The human C. parvum genotype isolates at this locus showed 6 distinct subgenotypes. The genetic variations were mainly confined to the 3′ region of the gene analyzed (Fig. 1).
The GP60 gene exhibited the highest degree of genetic variation. At this locus, the 62 isolates analyzed produced 9 distinct types of nucleotide sequences. To understand the population genetic structure at this locus, we constructed a neighbor-joining tree from the aligned human C. parvum genotype GP60 sequences generated in this study and published bovine and human genotype C. parvum sequences [6] downloaded from GenBank (Fig. 2). The 9 subgenotypes were placed in four major distinct clusters (or allele families) of the human C. parvum genotype parasites. The first allele (Ib) consisted of two subgenotypes (A and C), one of which was the predominant subgenotype, comprising 28 isolates in this study. The second (Ie) and third (Id) alleles also had two subgenotypes each, while the fourth allele had three distinct subgenotypes (D, H, and I). One of the allele families previously reported by Strong et al. [2000] was not found in this study. However, a new allele family (Ie) of two subgenotypes was seen. Altogether, five subgenotypes (F, B, E, G, and D in Fig. 2) generated in this study were new subgenotypes. Subgenotypes within alleles largely differed from each other in the number of TCA/TCG repeats. Alleles, however, differed extensively from each other in sequences in the non-repeat region.
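Because subgenotypes within an allele family differ mainly in the number of TCA/TCG repeats, counting those repeats is a natural first step when comparing GP60 sequences. The following is a minimal sketch, assuming the repeat region is the longest uninterrupted tract of TCA or TCG trinucleotides; the function name and the toy sequence are hypothetical.

```python
import re


def longest_tca_tcg_tract(seq: str) -> int:
    """Number of trinucleotide units in the longest uninterrupted TCA/TCG tract."""
    tracts = re.findall(r"(?:TC[AG])+", seq.upper())
    return max((len(tract) // 3 for tract in tracts), default=0)


# Synthetic placeholder, not a real GP60 sequence: 3 x TCA followed by 2 x TCG.
toy = "ATG" + "TCA" * 3 + "TCG" * 2 + "ACA"
print(longest_tca_tcg_tract(toy))
```

Two sequences from the same allele family would then be compared by their repeat counts, while sequences from different families would still need a full alignment of the non-repeat region.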

Phylogenetic relationships among subgenotypes of the C. parvum human genotype inferred by neighbor-joining analysis of GP60 nucleotide sequences.
Strong linkage disequilibrium was present among the subgenotyping results of the six genetic loci. Complete congruence was seen between the separation of two subgenotypes in the TRAP-C2 gene and two subgenotypes at the microsatellite locus, i.e., all isolates of subgenotype A at the TRAP-C2 locus had the subgenotype A profile at the microsatellite locus (Table 1). Likewise, among the 3 subgenotypes at the Poly-T locus, the Poly-T gene merely further divided one of the TRAP-C2 and microsatellite subgenotypes (A) into two Poly-T subgenotypes. The same was also true for HSP70, which subdivided some of the Poly-T subgenotypes. GP60 further subdivided some HSP70 subgenotypes (Fig. 3). This linkage disequilibrium among the genes analyzed in this study supports the theory of a clonal population structure for Cryptosporidium parasites. Previously, linkage disequilibrium and the absence of recombinant genotypes have been used as indicators of clonality of parasitic protozoa [9].
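The congruence between TRAP-C2 and the microsatellite locus, and the way Poly-T subdivides a TRAP-C2 subgenotype, can be checked mechanically. This is a minimal sketch using five fully typed samples from Table 1 (identified by sample number); it is an illustration of the nesting pattern, not the analysis used in this study, and the function names `refines` and `congruent` are hypothetical.

```python
from collections import defaultdict

# (sample number, TRAP-C2, Poly-T, SSR-1) for a few fully typed rows of Table 1.
ROWS = [
    (83,  "A", "A", "A"),
    (118, "B", "B", "B"),
    (179, "A", "C", "A"),
    (176, "B", "B", "B"),
    (97,  "B", "B", "B"),
]


def refines(fine, coarse) -> bool:
    """True if each subgenotype at `fine` co-occurs with exactly one at `coarse`."""
    partners = defaultdict(set)
    for f, c in zip(fine, coarse):
        partners[f].add(c)
    return all(len(p) == 1 for p in partners.values())


def congruent(a, b) -> bool:
    """True if the two loci split the isolates into identical groups."""
    return refines(a, b) and refines(b, a)


trap_c2 = [row[1] for row in ROWS]
poly_t = [row[2] for row in ROWS]
ssr_1 = [row[3] for row in ROWS]

print("TRAP-C2 and SSR-1 congruent:", congruent(trap_c2, ssr_1))
print("Poly-T subdivides TRAP-C2:", refines(poly_t, trap_c2))
print("TRAP-C2 subdivides Poly-T:", refines(trap_c2, poly_t))
```

The asymmetry between the last two checks is the signature of nesting: the more polymorphic locus splits groups defined by the less polymorphic one, but not the other way around.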

Schematic diagram showing the linkage disequilibrium among the five genetic loci studied. Subgenotyping data from SSU rRNA are not included because of the problem of subgenotyping accuracy associated with this target.
Unlike other genes, which consistently separate C. parvum into two sister clusters of the C. parvum human genotype and C. parvum bovine genotype, the GP60 gene failed to divide C. parvum into such monophyletic clusters. Thus, some GP60 human genotype alleles (Ib, Ic and Ie in Fig. 2) were more related to the bovine genotype allele. The reason for this polyphyletic nature is not clear. The spread of allelic sequence differences across the entire GP60 gene and the absence of cross-over points do not support the existence of recombination. Instead, the linkage disequilibrium among genetic loci examined in this study strongly suggests that random mutation rather than recombination plays an important role in the evolution of Cryptosporidium parasites.
In summary, the absence of recombination and the linkage disequilibrium in subgenotypes among the different genes examined in this study support a clonal population structure for C. parvum. This needs to be substantiated by subsequent studies involving the use of more polymorphic genes. Information on Cryptosporidium population structure should be useful for the clarification of Cryptosporidium taxonomy and in the development of subgenotyping tools. Specifically, linkage disequilibrium among different genes supports the existence of reproductive and genetic isolation among some Cryptosporidium genotypes. Such linkage disequilibrium will also reduce the resolving power of subgenotyping using a multi-locus approach.
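The loss of resolving power can be made concrete by comparing the number of multi-locus types that would be possible if loci varied independently with the number actually seen. This is a minimal sketch using the per-locus subgenotype counts reported above and eight fully typed samples from Table 1 (identified by sample number, SSU rRNA excluded); the subset is chosen for illustration only and is not a complete tally of the 62 isolates.

```python
from math import prod

# Subgenotypes per locus, as reported in Results and Discussion.
SUBGENOTYPES_PER_LOCUS = {"TRAP-C2": 2, "Poly-T": 3, "SSR-1": 2, "HSP70": 6, "GP60": 9}

# sample number -> (TRAP-C2, Poly-T, SSR-1, HSP70, GP60), copied from Table 1.
PROFILES = {
    83:  ("A", "A", "A", "A", "A"),
    119: ("A", "A", "A", "A", "A"),
    118: ("B", "B", "B", "C", "C"),
    179: ("A", "C", "A", "C", "G"),
    176: ("B", "B", "B", "E", "H"),
    127: ("B", "B", "B", "B", "B"),
    134: ("B", "B", "B", "E", "B"),
    97:  ("B", "B", "B", "E", "I"),
}

# Upper bound if every combination of subgenotypes could occur.
max_types = prod(SUBGENOTYPES_PER_LOCUS.values())
observed_types = set(PROFILES.values())
gp60_types = {profile[-1] for profile in PROFILES.values()}

print("possible multi-locus types:", max_types)
print("observed in this subset:", len(observed_types))
print("GP60 subgenotypes in this subset:", len(gp60_types))
```

In this subset, the five loci together separate the samples only slightly better than GP60 does alone, which is the practical consequence of the linkage disequilibrium described above.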
ACKNOWLEDGMENTS
This work was supported in part by funds from the Food Safety Initiative of the Centers for Disease Control and Prevention. We thank Caryn Bern, Barbara Herwaldt, Anne Moore, Wangechi Gatei, Michael J. Arrowood, and Una Morgan for providing some isolates used in this study.