Targeted re-sequencing of the allohexaploid wheat exome
Corresponding Author
Mark O. Winfield
School of Biological Sciences, University of Bristol, Bristol, UK
(Tel 44 117 331 6770; fax 44 117 925 7374; email [email protected])
Paul A. Wilkinson
School of Biological Sciences, University of Bristol, Bristol, UK
Alexandra M. Allen
School of Biological Sciences, University of Bristol, Bristol, UK
Gary L. A. Barker
School of Biological Sciences, University of Bristol, Bristol, UK
Jane A. Coghill
School of Biological Sciences, University of Bristol, Bristol, UK
Amanda Burridge
School of Biological Sciences, University of Bristol, Bristol, UK
Anthony Hall
School of Biological Sciences, University of Liverpool, Liverpool, UK
Rachael C. Brenchley
School of Biological Sciences, University of Liverpool, Liverpool, UK
Rosalinda D’Amore
School of Biological Sciences, University of Liverpool, Liverpool, UK
Neil Hall
School of Biological Sciences, University of Liverpool, Liverpool, UK
Michael W. Bevan
John Innes Centre, Norwich Research Park, Norwich, UK
Keith J. Edwards
School of Biological Sciences, University of Bristol, Bristol, UK
Summary
Bread wheat, Triticum aestivum, is an allohexaploid composed of three distinct ancestral genomes, A, B and D. The polyploid nature of the wheat genome, together with its large size, has limited our ability to generate the large amount of sequence data required for whole-genome studies. Even with the advent of next-generation sequencing technology, it is still relatively expensive to generate whole-genome sequences for more than a few wheat genomes at any one time. To overcome this problem, we have developed a targeted-capture re-sequencing protocol based upon NimbleGen array technology to capture and characterize 56.5 Mb of genomic DNA with sequence similarity to over 100 000 transcripts from eight different UK allohexaploid wheat varieties. Using this protocol in conjunction with a carefully designed bioinformatic pipeline, we have identified more than 500 000 putative single-nucleotide polymorphisms (SNPs). While 80% of these were variants between the homoeologous genomes, A, B and D, a substantial proportion (20%) were putative varietal SNPs between the eight varieties studied. A small number of these latter polymorphisms were experimentally validated using KASPar technology, and 94% proved to be genuine. The procedures described here to sequence a large proportion of the wheat genome, and the various SNPs identified, should be of considerable use to the wider wheat community.
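The distinction drawn above between homoeologous and varietal SNPs can be illustrated with a minimal sketch. It is not the authors' bioinformatic procedure, and every name in it (`VARIETIES`, `classify_locus`, the per-variety base sets) is hypothetical. It assumes that each variety has already been reduced to the set of bases observed at a locus. A locus is then treated as homoeologous when every variety shows the same mixture of two or more bases, and as varietal when at least two varieties differ.

```python
from typing import Dict, FrozenSet

# Hypothetical illustration only; this is not the pipeline used in the paper.
VARIETIES = ["Alchemy", "Avalon", "Cadenza", "Hereward",
             "Rialto", "Robigus", "Savannah", "Xi19"]


def classify_locus(calls: Dict[str, FrozenSet[str]]) -> str:
    """Classify one locus from the set of bases observed in each variety.

    'homoeologous': all varieties share the same mixture of two or more bases,
                    as expected when the A, B and D copies differ from one
                    another but the varieties do not.
    'varietal':     at least two varieties show different base sets.
    'monomorphic':  all varieties show the same single base.
    """
    if not calls:
        raise ValueError("no base calls supplied for this locus")
    distinct = set(calls.values())
    if len(distinct) > 1:
        return "varietal"
    (only,) = distinct  # exactly one base set is shared by all varieties
    return "homoeologous" if len(only) > 1 else "monomorphic"


# Assumed example data: A/G in every variety, then Avalon changed to G only.
homoeo_locus = {v: frozenset("AG") for v in VARIETIES}
varietal_locus = dict(homoeo_locus, Avalon=frozenset("G"))

print(classify_locus(homoeo_locus))    # homoeologous
print(classify_locus(varietal_locus))  # varietal
```

A real classifier would also need read-depth and base-quality thresholds before a base is counted as observed; those are left out here because the summary does not state them.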
Supporting Information
Figure S1 Fold coverage of the features on the NimbleGen array for the combined sequences of all eight wheat varieties.
Figure S2 The percentage of the four nucleotides A, C, G and T at the 511 439 putative single-nucleotide polymorphism loci for each of the eight wheat varieties studied.
Figure S3 The bars of the graphs represent the average scores for the 400 contigs contained in each of the three bins.
Table S1 All 59 000 contigs and the 511 439 putative single-nucleotide polymorphisms identified in the eight wheat varieties, Alchemy, Avalon, Cadenza, Hereward, Rialto, Robigus, Savannah and Xi19.
Table S2 The 96 varietal single-nucleotide polymorphisms validated against a panel of 23 wheat varieties.
Table S3 The total number of tri-homoeoallelic loci (row 1) and the number of contigs that contained loci with one (row 2) or more (rows 3 and 4) tri-homoeoallelic single-nucleotide polymorphisms.
Table S4 The loci that were tri-homoeoallelic for all eight varieties.
Filename | Size | Description |
---|---|---|
PBI_713_sm_FigS1.jpg | 22.8 KB | Supporting info item |
PBI_713_sm_FigS2.jpg | 23 KB | Supporting info item |
PBI_713_sm_FigS3.jpg | 23 KB | Supporting info item |
PBI_713_sm_TableS1.csv | 44 MB | Supporting info item |
PBI_713_sm_TableS2.xlsx | 43.6 KB | Supporting info item |
PBI_713_sm_TableS3.jpg | 26.5 KB | Supporting info item |
PBI_713_sm_TableS4.xlsx | 204.4 KB | Supporting info item |