High-throughput and parallel SNP discovery in selected candidate genes in Eucalyptus camaldulensis using Illumina NGS platform
Corresponding Author
Prasad S. Hendre
(fax +91 80 2839 4352; email [email protected])
R. Kamalakannan
ITC R&D Centre, Peenya Industrial Area, Bangalore, Karnataka, India
Mohan Varghese
ITC R&D Centre, Peenya Industrial Area, Bangalore, Karnataka, India
Summary
Next-generation sequencing (NGS) technologies have revolutionized the pace and scale of genomics- and transcriptomics-based SNP discovery across plant and animal species. Here, 72-base paired-end Illumina sequencing was used for high-throughput, parallel, large-scale SNP discovery in 41 growth-related candidate genes of Eucalyptus camaldulensis. Approximately 100 kb of genomic sequence from 96 individuals was amplified and sequenced using a hierarchical DNA/PCR pooling strategy, and the reads were assembled against the corresponding E. grandis reference. A total of 1191 SNPs (minimum 5% frequency of the other allele) were identified, at an average frequency of 1 SNP/83.9 bp overall, 1 SNP/108.4 bp in exons and 1 SNP/65.6 bp in introns. A total of 75 insertions and 89 deletions were detected, of which approximately 15% were exonic. Transitions (Tr) outnumbered transversions (Tv) overall (Tr/Tv: 1.89), and the excess was more pronounced in exons (Tr/Tv: 2.73). In exons, synonymous SNPs (Ks) prevailed over non-synonymous SNPs (Ka; average Ka/Ks ratio: 0.72, range: 0–3.00 across genes). Many of the exonic SNPs/indels had the potential to change the amino acid sequence of the respective genes. Transcription factors appeared more conserved, whereas enzyme-coding genes appeared to be under relaxed control. Further, 541 SNPs were grouped into 196 ‘equal frequency’ (EF) blocks with nearly identical minor allele frequencies, to facilitate selection of one tag-SNP per EF block. There were 241 (approximately 20%) ‘zero-SNP’ blocks with no SNPs in the surrounding ±60 bp windows. The data thus indicate extensive extant and unexplored diversity in the studied genes of E. camaldulensis, with potential applications in marker-trait association studies.
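The summary statistics quoted above (SNP frequency in bp per SNP, the Tr/Tv ratio, and SNPs with SNP-free ±60 bp flanks) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the input layout and the toy data are hypothetical, and the reading of a ‘zero-SNP’ block as a SNP with no other SNP within 60 bp on either side is an assumption.

```python
# Illustrative sketch only; names, inputs and the interpretation of
# 'zero-SNP' blocks are assumptions, not the published method.

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def is_transition(ref, alt):
    """Return True for purine<->purine or pyrimidine<->pyrimidine changes."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a valid SNP: {ref}>{alt}")
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def tr_tv_ratio(snps):
    """Transitions divided by transversions for a list of (ref, alt) pairs."""
    snps = list(snps)
    tr = sum(1 for ref, alt in snps if is_transition(ref, alt))
    tv = len(snps) - tr
    return float("inf") if tv == 0 else tr / tv


def bp_per_snp(region_length_bp, n_snps):
    """Average spacing, i.e. the x in '1 SNP / x bp'."""
    if n_snps <= 0:
        raise ValueError("need at least one SNP")
    return region_length_bp / n_snps


def isolated_snps(positions, flank=60):
    """SNP positions with no other SNP within +/- flank bp (assumed definition)."""
    pos = sorted(set(positions))
    isolated = []
    for i, p in enumerate(pos):
        left_clear = i == 0 or p - pos[i - 1] > flank
        right_clear = i == len(pos) - 1 or pos[i + 1] - p > flank
        if left_clear and right_clear:
            isolated.append(p)
    return isolated


if __name__ == "__main__":
    # Toy data, not taken from the study.
    toy_snps = [("A", "G"), ("C", "T"), ("A", "C")]
    toy_positions = [10, 100, 130, 300]
    print(tr_tv_ratio(toy_snps))
    print(bp_per_snp(1000, 8))
    print(isolated_snps(toy_positions, flank=60))
```

SNPs with clear flanks on both sides are generally easier to convert into genotyping assays, since primers or probes can be placed in the flanks without overlapping a neighbouring polymorphism. With real data, the positions would need to be handled per amplicon or per gene rather than as one list.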
Supporting Information
Table S1 Details of genes and reference regions used for SNP discovery.
Table S2 Distribution of discovered SNPs in different genic compartments and their respective frequencies.
Filename | Description |
---|---|
PBI_699_sm_TableS1-S2.doc (271 KB) | Supporting info item |