Volume 15, Issue 5 pp. 1243-1255

Resource Article

A survey of genome-wide single nucleotide polymorphisms through genome resequencing in the Périgord black truffle (Tuber melanosporum Vittad.)

Thibaut Payen

INRA, Laboratoire d'Excellence ARBRE, UMR1136 Interactions Arbres-Microorganismes, F-54280 Champenoux, France

UMR1136 Interactions Arbres-Microorganismes, Université de Lorraine, Vandoeuvre-lès-Nancy, F-54500 France

These authors contributed equally to this work.

Claude Murat (Corresponding Author)

INRA, Laboratoire d'Excellence ARBRE, UMR1136 Interactions Arbres-Microorganismes, F-54280 Champenoux, France

UMR1136 Interactions Arbres-Microorganismes, Université de Lorraine, Vandoeuvre-lès-Nancy, F-54500 France

These authors contributed equally to this work.

Correspondence: Claude Murat, Fax: +33 (0)3 83 39 40 69; E-mail: [email protected]

Anaïs Gigant

INRA, Laboratoire d'Excellence ARBRE, UMR1136 Interactions Arbres-Microorganismes, F-54280 Champenoux, France

UMR1136 Interactions Arbres-Microorganismes, Université de Lorraine, Vandoeuvre-lès-Nancy, F-54500 France

Emmanuelle Morin

INRA, Laboratoire d'Excellence ARBRE, UMR1136 Interactions Arbres-Microorganismes, F-54280 Champenoux, France

UMR1136 Interactions Arbres-Microorganismes, Université de Lorraine, Vandoeuvre-lès-Nancy, F-54500 France

Stéphane De Mita

INRA, Laboratoire d'Excellence ARBRE, UMR1136 Interactions Arbres-Microorganismes, F-54280 Champenoux, France

UMR1136 Interactions Arbres-Microorganismes, Université de Lorraine, Vandoeuvre-lès-Nancy, F-54500 France

Francis Martin

INRA, Laboratoire d'Excellence ARBRE, UMR1136 Interactions Arbres-Microorganismes, F-54280 Champenoux, France

UMR1136 Interactions Arbres-Microorganismes, Université de Lorraine, Vandoeuvre-lès-Nancy, F-54500 France

First published: 20 February 2015

https://doi.org/10.1111/1755-0998.12391

Abstract

The Périgord black truffle (Tuber melanosporum Vittad.), considered a gastronomic delicacy worldwide, is an ectomycorrhizal filamentous fungus that is ecologically important in Mediterranean French, Italian and Spanish woodlands. In this study, we developed a novel resource of single nucleotide polymorphisms (SNPs) for T. melanosporum using Illumina high-throughput resequencing. The genomes of six T. melanosporum geographical accessions were sequenced to a depth of approximately 20×. These geographical accessions were selected from different populations within the northern and southern regions of the geographical species distribution. Approximately 80% of the reads for each of the six resequenced geographical accessions mapped against the reference T. melanosporum genome assembly, from which the core genome size of this organism was estimated to be approximately 110 Mbp. A total of 442 326 SNPs, corresponding to 3540 SNPs/Mbp, were identified across the seven genomes. The SNPs occurred more frequently in repeated sequences (85%), although 4501 SNPs were also identified in the coding regions of 2587 genes. Using the ratio of nonsynonymous mutations per nonsynonymous site (pN) to synonymous mutations per synonymous site (pS) and Tajima's D index scanning the whole genome, we were able to identify genomic regions and genes potentially subjected to positive or purifying selection. The SNPs identified represent a valuable resource for future population genetics and genomics studies.

Introduction

Single nucleotide polymorphisms (SNPs) have attracted a great deal of interest in the scientific community (Ganal et al. 2009). Already widely used in human and plant genetics, large-scale SNP identification has also become available for filamentous fungi, owing to technological developments and the consequent reduction in the cost of high-throughput sequencing (e.g. Fusarium graminearum, Cuomo et al. 2007; Coccidioides spp., Neafsey et al. 2010; Leptographium longiclavatum, Ojeda et al. 2014; Blumeria graminis, Wicker et al. 2013; Rhizoctonia solani, Hane et al. 2014). The value of SNPs in comparison with microsatellite markers lies in their abundance throughout the genome, their biallelic nature and their high potential for automation (Brumfield et al. 2003). In addition, they are easy to study, follow relatively robust mutation models and are easily genotyped in large panels. SNPs can be used in population analyses for studying demographic and historical patterns based on large numbers of samples. Compared to microsatellites, which are essentially neutral markers that do not permit a high density, an advantage of surveying SNPs throughout the genome is the higher probability of identifying adaptation signatures. For example, a large-scale SNP analysis uncovered an adaptation signature for temperature in Neurospora crassa (Ellison et al. 2011).

In Tuber melanosporum, the genetic diversity and phylogeography have been investigated with different molecular markers such as random amplified polymorphic DNA, microsatellites, SNPs in the internal transcribed spacer (ITS) of ribosomal DNA and inter-simple sequence repeats (ISSR; Bertault et al. 1998; Murat et al. 2004; Riccioni et al. 2008; García-Cunchillos et al. 2014). These studies pointed to an important effect of the last glaciations (from 120 000 to 11 000 years ago; Van Andel & Tzedakis 1996) on the truffle population structure. For example, two putative post-glacial recolonization routes were hypothesized using 10 SNPs in ITSs (Murat et al. 2004). Through the use of microsatellites and ISSR fingerprinting, it has also been suggested that glacial refuges existed in Italy and Spain (Riccioni et al. 2008; García-Cunchillos et al. 2014).

Following the publication of the T. melanosporum genome sequencing project (Martin et al. 2010), this species became one of the model species for studying ectomycorrhizal ascomycetes (Kües & Martin 2011). Ectomycorrhizal fungi are an important group of fungi, because they promote the growth of trees in forests and woodlands by providing the trees with water and nutrients (Smith & Read 2010). Using the T. melanosporum genome, highly polymorphic microsatellite markers were developed and used to characterize small-scale spatial genetic diversity in two truffle orchards, identifying a pronounced spatial genetic structure with numerous small-sized genets (Murat et al. 2013b). These results suggested that T. melanosporum relies heavily on sexual reproduction. Microsatellite-based population genetic analyses have allowed for the investigation of only a small proportion of the T. melanosporum genome, and therefore have a low probability of detecting the genomic regions involved in the species' adaptation. Using the whole T. melanosporum genome in combination with high-throughput sequencing technologies, large-scale SNP surveys now make it possible to perform an exhaustive investigation of the genomic variation.

The aim of this study was to assess the overall genetic diversity of T. melanosporum by identifying and mapping the SNPs in resequenced genomes. The genomes of six geographical accessions of T. melanosporum were sequenced using Illumina technology and compared to strain Mel28 as the reference genome (Martin et al. 2010). To improve the chances of finding genetic polymorphisms, the resequenced geographical accessions were from samples in different populations within the northern and southern geographical limits of the species distribution. This SNP resource will be useful for more in-depth investigations of T. melanosporum population structure, gene flow and putative ecotype identifications, as well as of selected genes and genomic regions.

Materials and Methods

Sampling and DNA extraction

Tuber melanosporum Vittad. (Ascomycota, Pezizomycotina, Pezizomycetes, Pezizales, Tuberaceae) is native to France, Italy and Spain. Our sampling strategy aimed to cover the natural geographical range of the species as well as the different climates (Mediterranean and continental) where this truffle is produced. The Mel28 isolate (referred to as France-Pro in this study) used for sequencing the reference genome was harvested in southern France (Saint Rémy de Provence, Bouches du Rhône, France; Martin et al. 2010). Three T. melanosporum ascocarps were harvested from France (Alps, Burgundy and Alsace), one from Italy (Umbria) and two from Spain (Castilla Leone; Fig. S1; Table 1). Within a few days of harvesting, each ascocarp was shipped to the laboratory and thoroughly washed, and the inner section (i.e. gleba) was stored at −20 °C pending the DNA extraction.

Table 1. List of Tuber melanosporum samples analysed, their geographical origin and the climate of their sampling area

Sample name	Code in manuscript	Locality	Region	Country	Climate	Sequence origin
091215-1	Spain-1	Sierra de Alcaraz	Castilla Leone	Spain	Mediterranean	This study
091215-4	Spain-2	Sierra de Vianos	Castilla Leone	Spain	Mediterranean	This study
100104-1	France-Bur	Courban	Burgundy	France	Continental	This study
100120-1	France-Als	Rouffach	Alsace	France	Continental	This study
100122-1	Italy	Perugia	Umbria	Italy	Mediterranean	This study
100303-4	France-Alp	Chorges	Provence-Alpes-Côte d'Azur	France	Alps (> 1000 m altitude)	This study
mel28	France-Pro	St Rémy de Provence	Provence-Alpes-Côte d'Azur	France	Mediterranean	Martin et al. (2010)

Total DNA was extracted from 500 mg of gleba using a modified CTAB (cetyl trimethyl ammonium bromide) protocol. After grinding in liquid nitrogen, the samples were incubated for 30 min at 65 °C in 2.5 volumes of buffer A (0.35 M sorbitol; 0.1 M Tris-HCl, pH 9; and 5 mM EDTA, pH 8), 2.5 volumes of buffer B (0.2 M Tris-HCl, pH 9; 50 mM EDTA, pH 8; 2 M NaCl; and 2% CTAB) and 1 volume of buffer C (5% of Sarkosyl; N-lauroylsarcosine sodium salt) in 50-mL Falcon tubes. After the incubation, a 0.33 volume of potassium acetate (5 M) was added, and the tubes were incubated for 30 min on ice to precipitate the polysaccharides. After centrifugation at 5000 × g for 20 min, the supernatant was purified with 1/10 volume of ammonium acetate (3 M) and 1 volume of chloroform:isoamyl alcohol (24:1) in a Falcon tube and centrifuged at 4000 × g for 10 min. The aqueous phase was transferred to a Nalgene tube (Fisher Scientific, France) and incubated with 100 μL RNase A (10 mg/mL) for 30 min at 37 °C. The DNA was precipitated with a 1/10 volume of ammonium acetate (3 M) and 1 volume of isopropanol at room temperature for 5 min and centrifuged at 10 000 × g for 10 min. The pellet was suspended in 2 mL of QBT buffer (0.75 M NaCl; 50 mM MOPS, pH 7.0; 15% isopropanol; and 0.15% Triton X-100) and purified using Genomic-tip 100/G columns (Qiagen Cat# 10243) following the manufacturer's instructions with the exception of the QC buffer (1.35 M NaCl; 50 mM MOPS, pH 7.0; 15% isopropanol). The higher NaCl concentration in the QC buffer (1.35 M instead of 1 M) allowed for the exclusion of small DNA fragments from the column. The purified DNA was then concentrated by precipitation with 1/10 volume of ammonium acetate (3 M) and 1 volume of isopropanol at room temperature for 5 min and centrifuged at 10 000 × g for 10 min. After discarding the supernatant, the pellet was resuspended in 100 μL of TE buffer and stored at −20 °C.

Riccioni et al. (2008) showed that the gleba of T. melanosporum is formed by a haploid maternal mycelium. The DNA isolated from each ascocarp by the described protocol is therefore expected to correspond to a haploid mycelium, as no disrupted spores were observed when checked under a microscope (data not shown).

Whole-genome shotgun sequencing and mapping

DNA from each of the six geographical accessions was sequenced in one lane of an Illumina Genome Analyzer (GAII) at the Beckman Genomics facilities (Brea, CA, USA). Sequencing produced approximately 1.1 Gb of 76-bp single-end reads per sample, and the sequencing depth ranged from 21- to 24-fold (Table 2). The raw reads can be accessed in the Sequence Read Archive at the National Center for Biotechnology Information (NCBI) under Accession No. SRP044130.

Table 2. Illumina sequencing and Burrows-Wheeler Aligner (BWA) mapping statistics

Samples	Read number	Total mapped reads after filtering^a (number)	Total mapped reads after filtering^a (%)	Reads mapping to multiple locations^b (number)	Reads mapping to multiple locations^b (%)	Genome reference coverage^c (bp)	Genome reference coverage^c (%)	Number of genes^d
Spain-1	39 275 496	28 744 980	73.19	2 376 408	6.05	113 139 168	91.5	9816
Spain-2	38 003 850	29 780 727	73.62	2 419 174	6.37	113 282 964	91.7	9808
France-Bur	38 921 450	31 145 093	80.02	2 427 651	6.24	114 066 611	92.3	9800
France-Als	34 575 334	26 833 541	77.61	2 073 232	6.00	114 998 792	93.1	9831
Italy	39 184 077	29 780 081	76.00	2 323 320	5.93	114 004 497	92.3	9807
France-Alp	38 597 308	30 651 292	79.41	2 474 195	6.41	113 860 556	92.2	9792

^a Excluding low-quality reads and reads mapping to multiple locations.
^b These reads were eliminated for the SNP identification.
^c Number of base pairs and the percentage of the reference genome mapped by reads. Excluding the Ns, the reference genome is composed of 123 535 220 bp.
^d Number of gene models mapped for each genome. A gene model was considered present if at least 60% of its sequence was covered by reads. The total number of gene models in the reference genome is 9952 (9765 are unique to France-Pro).
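As a rough cross-check of the figures in Table 2, the sequencing depth can be approximated as the number of reads multiplied by the read length and divided by the genome size. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the depth is computed against the reference size given in footnote c (123 535 220 bp excluding Ns), which the text does not state explicitly.

```python
# Illustrative sketch: approximate sequencing depth and reference coverage.
# Function names are hypothetical; the denominator is an assumption (footnote c).

READ_LENGTH_BP = 76                # single-end read length stated in the text
REFERENCE_GENOME_BP = 123_535_220  # reference size excluding Ns (Table 2, footnote c)

# Raw read numbers copied from Table 2.
READ_NUMBERS = {
    "Spain-1": 39_275_496,
    "Spain-2": 38_003_850,
    "France-Bur": 38_921_450,
    "France-Als": 34_575_334,
    "Italy": 39_184_077,
    "France-Alp": 38_597_308,
}


def sequencing_depth(read_count, read_length=READ_LENGTH_BP,
                     genome_size=REFERENCE_GENOME_BP):
    """Return the approximate fold depth: total sequenced bases / genome size."""
    return read_count * read_length / genome_size


def coverage_percent(covered_bp, genome_size=REFERENCE_GENOME_BP):
    """Return the percentage of the reference covered by at least one read."""
    return 100.0 * covered_bp / genome_size


if __name__ == "__main__":
    for sample, reads in READ_NUMBERS.items():
        print(f"{sample}: about {sequencing_depth(reads):.1f}x")
```

Under these assumptions, every sample falls within the 21- to 24-fold range quoted in the text.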

The raw reads for each genome were aligned to the France-Pro reference genome, available at the Institut National de la Recherche Agronomique (INRA) Tuber genome database (http://mycor.nancy.inra.fr/IMGC/TuberGenome/download.php?select=fast), using the Burrows-Wheeler Aligner (BWA) software, version 0.7.3a (Li & Durbin 2009). The aln/samse algorithm was used with default parameters, except for the number of mismatches allowed between a read and the reference genome, which was set to two. As BWA generates a Phred-scaled mapping quality (MAPQ) for each read according to the read quality, the raw reads were not quality filtered before mapping. To avoid low-quality mapping, only reads mapped with an MAPQ above 25 were considered for analysis with SAMtools (v. 0.1.18; Li et al. 2009). This stringent parameter eliminated the reads of low sequencing quality and those mapped at several genomic locations, thereby avoiding problems in the SNP calling due to the high proportion of repeated sequences such as transposable elements (TE) in the T. melanosporum genome (Martin et al. 2010). The genomic regions without mapped reads were assigned to the different genomic compartments (i.e. genes, TE and intergenic regions) defined by Martin et al. (2010). We considered genes with more than 60% of their sequences without mapped reads as missing.
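The mapping-quality filter described above can be illustrated on plain SAM-formatted text. This is a minimal sketch and not the SAMtools command used in the study: the function name `keep_alignment` is hypothetical, and it relies only on the standard SAM column layout, in which the second field is the bitwise flag and the fifth field is the MAPQ.

```python
# Illustrative sketch: keep only alignments with MAPQ above a threshold.
# keep_alignment is a hypothetical helper, not part of SAMtools or BWA.

MIN_MAPQ = 25        # threshold from the text: only reads with MAPQ above 25 are kept
UNMAPPED_FLAG = 0x4  # SAM flag bit that is set when a read is unmapped


def keep_alignment(sam_line, min_mapq=MIN_MAPQ):
    """Return True if a SAM alignment line passes the mapping-quality filter."""
    if sam_line.startswith("@"):
        return False  # header line, not an alignment record
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])  # column 2: bitwise flag
    mapq = int(fields[4])  # column 5: Phred-scaled mapping quality
    if flag & UNMAPPED_FLAG:
        return False
    return mapq > min_mapq


def filter_sam_lines(lines, min_mapq=MIN_MAPQ):
    """Yield the alignment lines that pass the filter, in input order."""
    for line in lines:
        if keep_alignment(line, min_mapq):
            yield line
```

Reads placed at several locations typically receive a low MAPQ, so a single threshold of this kind removes both poorly sequenced and ambiguously mapped reads.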

In this study, we produced an updated gene model repertoire of the T. melanosporum genome. Indeed, transcriptomic analyses have suggested that the 7496 high-confidence protein-coding genes supported by either sequence similarity, the occurrence of Pfam or KOG domains, or oligoarray expression data (Martin et al. 2010; available at http://mycor.nancy.inra.fr/IMGC/TuberGenome/index.html) omitted several expressed genes (Tisserant et al. 2011; A. Kohler, E. Tisserant and F. Martin, unpublished data). Moreover, we could not exclude the possibility that some gene families had been classified as repeated sequences and therefore excluded from this high-confidence protein-coding gene repertoire. To update the gene model repertoire, we began with the initial 12 826 putative gene models identified by GAZE and discarded those gene models that (i) overlapped with known TEs (Martin et al. 2010), (ii) had more than 40% unknown bases (N), (iii) had homology with Repbase (Jurka et al. 2005) or (iv) were <20 amino acids in length. The 1315 genes that were manually curated (Martin et al. 2010) served as a validation set. Twelve per cent of the T. melanosporum genome was covered by uncategorized repeated sequences (so-called no cat) lacking homology with known TE families that could code for either T. melanosporum-specific TE or proteins belonging to orphan multigenic families. Gene models overlapping these no cat sequences were retained in the new set of 9952 gene models (Table S1). The expression levels on NimbleGen microarrays and RNAseq for each of the genes in the repertoire were determined for the ectomycorrhizae, free-living mycelium and ascocarps (Martin et al. 2010; Tisserant et al. 2011). The homology of each gene model to sequences in the NCBI nr database (September 2013) and UniProt (UniProtKB/Swiss-Prot of September 2013) was assessed using blastp (v2.2.28+) with an e-value threshold of 10⁻⁵ (Altschul 1990). Each gene was also analysed for Pfam motifs using the hmmscan command of the HMMER package (Eddy 2011).
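The four exclusion criteria used to update the gene model repertoire can be expressed as a simple predicate. The sketch below is illustrative only: the `GeneModel` record and its field names are hypothetical, and the overlap and homology tests are assumed to have been computed beforehand by other tools.

```python
# Illustrative sketch of the four exclusion criteria for gene models.
# GeneModel and its fields are hypothetical; the real pipeline is not shown in the text.
from dataclasses import dataclass

MAX_UNKNOWN_FRACTION = 0.40  # criterion (ii): more than 40% unknown bases (N)
MIN_PROTEIN_LENGTH = 20      # criterion (iv): fewer than 20 amino acids


@dataclass
class GeneModel:
    name: str
    overlaps_known_te: bool       # criterion (i), precomputed
    unknown_base_fraction: float  # fraction of N in the gene sequence, 0 to 1
    has_repbase_homology: bool    # criterion (iii), precomputed
    protein_length: int           # predicted protein length in amino acids


def is_retained(model):
    """Return True if the gene model survives all four exclusion criteria."""
    if model.overlaps_known_te:
        return False
    if model.unknown_base_fraction > MAX_UNKNOWN_FRACTION:
        return False
    if model.has_repbase_homology:
        return False
    if model.protein_length < MIN_PROTEIN_LENGTH:
        return False
    return True


def filter_gene_models(models):
    """Return the list of retained gene models, preserving input order."""
    return [m for m in models if is_retained(m)]
```

Gene models overlapping uncategorized (no cat) repeats are not excluded by this predicate, in line with the decision to retain them.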

SNP calling and localization in the genome

SNP calling was performed with two different methods: (i) BWA for the alignment and SAMtools (Li et al. 2009) for the SNP calling (referred to as the BWA/SAMtools method) and (ii) the CLC Genomics Workbench version 6.6 (http://www.clcbio.com) for both the alignment and calling (referred to as the CLC method).

For the BWA/SAMtools method, a pile-up file (i.e. file describing the mapping results information at each chromosomal position) was created with the SAMtools mpileup command using the bam alignment output generated by BWA (see above). The SNP calling was filtered with the vcfutils script (available with SAMtools); to be validated, each SNP was required to be supported by at least ten reads, and the root mean square (RMS) of the mapping quality of the SNP position had to be ≥25.
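The two validation thresholds (at least ten supporting reads and an RMS mapping quality of at least 25) can be sketched as a filter over VCF records. This is not the vcfutils script itself. The sketch assumes, as a labelled assumption, that the read depth and RMS mapping quality are carried by the `DP` and `MQ` keys of the INFO column, and the function names are hypothetical.

```python
# Illustrative sketch: filter VCF records on depth (DP) and RMS mapping quality (MQ).
# Not the vcfutils implementation; the use of the DP and MQ INFO keys is an assumption.

MIN_DEPTH = 10     # each SNP must be supported by at least ten reads
MIN_RMS_MAPQ = 25  # RMS mapping quality at the SNP position must be >= 25


def parse_info(info_field):
    """Parse a VCF INFO string such as 'DP=14;MQ=37' into a dictionary."""
    info = {}
    for item in info_field.split(";"):
        key, _, value = item.partition("=")
        info[key] = value  # flag-type keys without '=' map to an empty string
    return info


def passes_filters(vcf_line, min_depth=MIN_DEPTH, min_mapq=MIN_RMS_MAPQ):
    """Return True if a VCF data line meets both thresholds."""
    if vcf_line.startswith("#"):
        return False  # header or column-name line
    fields = vcf_line.rstrip("\n").split("\t")
    info = parse_info(fields[7])  # column 8 is INFO
    depth = int(info.get("DP") or 0)
    rms_mapq = float(info.get("MQ") or 0)
    return depth >= min_depth and rms_mapq >= min_mapq
```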

For the CLC method, the reads were mapped using a global alignment with the length fraction set to 1, the similarity fraction set to 0.97 and nonspecific reads ignored. All other parameters were set by default (http://www.clcbio.com/files/usermanuals/CLC_Genomics_Workbench_User_Manual.pdf). The SNPs were called by the quality-based variant detection module ignoring variants in nonspecific regions and using default parameters (i.e. minimum coverage of 10 reads). The maximum expected variation (ploidy) was set to 1, because haploid genomes had previously been sequenced (see above).

For both pipelines (BWA/SAMtools and CLC), we created a file for each sequenced genome with the SNPs localized on the reference genome assembly (France-Pro). The two sets of SNPs identified for each genome were compared, and only the SNPs called by both software methods were considered for further analyses.
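Retaining only the SNPs called by both pipelines amounts to a set intersection once each call is reduced to a comparable key. The sketch below is illustrative: keying a SNP by scaffold, position and alternative base is an assumption, and the function names are hypothetical.

```python
# Illustrative sketch: consensus SNP set from two callers via set intersection.
# The (scaffold, position, alternative base) key is an assumption for illustration.


def snp_key(scaffold, position, alt_base):
    """Build a hashable key identifying one SNP on the reference assembly."""
    return (scaffold, int(position), alt_base.upper())


def consensus_snps(calls_a, calls_b):
    """Return the SNP keys present in both call sets, sorted for stable output."""
    return sorted(set(calls_a) & set(calls_b))


def fraction_confirmed(calls_a, calls_b):
    """Return the fraction of calls in calls_a that are also present in calls_b."""
    unique_a = set(calls_a)
    if not unique_a:
        return 0.0
    return len(unique_a & set(calls_b)) / len(unique_a)
```

For example, `fraction_confirmed(bwa_calls, clc_calls)` would give the share of BWA/SAMtools calls that are confirmed by CLC, where both argument names are placeholders for lists of keys.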

All of the SNPs identified by aligning the six geographical accessions against the reference genome were compiled to generate a gff-formatted file available in DRYAD (doi:10.5061/dryad.9gk52). The SNPs were localized according to the new protein-coding gene catalogue defined in this study (see above) and repeated sequences library defined in Martin et al. (2010) using python scripts available at the INRA Tuber genome portal using the following link (http://mycor.nancy.inra.fr/IMGC/TuberGenome/download.php?select=anno).

Polymorphism indices and detecting selection pressure

The level of polymorphism between genomes was assessed by calculating the π index (Nei & Li 1979) in a sliding window of 10 kb throughout the entire reference genome using egglib version 2.1.6 (De Mita & Siol 2012). The π index corresponds to the average number of nucleotide differences per site between two DNA sequences from the sample population. In the same 10-kb sliding windows, we also computed Tajima's D (Tajima 1989) and Watterson's theta (Watterson 1975) values. The ratios of nonsynonymous mutations per nonsynonymous site (pN) and synonymous mutations per synonymous site (pS) in gene models were calculated to assess the mutations for deviation from neutral evolution. Positive Tajima's D values are typically attributed to diversifying, balancing or positive selection, whereas negative values are generally attributed to positive or purifying selection (Weedall & Conway 2010); under neutrality, the values are expected to fall between −2 and +2 with a 95% confidence interval (Tajima 1989; Carlson et al. 2005). Therefore, in this study, we considered values greater than +2 or less than −2 as significant. Tajima's D was calculated (i) for all of the gene models and (ii) in a sliding window of 10 kb throughout the whole genome (including the coding and noncoding regions). Only gene models with at least five SNPs in their coding regions were considered for the pN/pS ratio calculation. A comparison between the two indices allowed us to identify candidate gene models subject to positive selection if pN/pS was >1 and purifying selection if pN/pS was <1.
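To make the three statistics concrete, the following sketch computes the mean number of pairwise differences, Watterson's theta and Tajima's D for one window of aligned haploid sequences, using the standard formulas of Tajima (1989). It is a didactic re-implementation rather than the egglib code used in the study; the function names are hypothetical, and gaps or missing data are not handled.

```python
# Illustrative sketch: pi, Watterson's theta and Tajima's D for one alignment window.
# Didactic re-implementation of the standard formulas; not the egglib implementation.
from itertools import combinations
from math import sqrt


def segregating_sites(seqs):
    """Count alignment columns that carry more than one allele."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def mean_pairwise_differences(seqs):
    """Average number of differences over all pairs of sequences (per window)."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs)


def pi_per_site(seqs):
    """Nucleotide diversity: mean pairwise differences divided by window length."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


def watterson_theta(seqs):
    """Watterson's estimator per window: S / a1, with a1 = sum of 1/i, i < n."""
    a1 = sum(1.0 / i for i in range(1, len(seqs)))
    return segregating_sites(seqs) / a1


def tajimas_d(seqs):
    """Tajima's D for the window, or None when there is no segregating site."""
    n = len(seqs)
    s = segregating_sites(seqs)
    if s == 0:
        return None
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    k = mean_pairwise_differences(seqs)
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))
```

A window in which every variant is a singleton gives a negative D, whereas a window in which every variant is at intermediate frequency gives a positive D.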

Phylogenetic reconstruction and divergence time among geographical accessions

To construct a phylogeny of the seven T. melanosporum samples, the 60 507 SNPs present in the intergenic regions free of selective pressure (excluding repeated sequences and genomic regions with a Tajima's D index above 2) were selected. A maximum-likelihood phylogenetic tree was built using the default parameters of PhyML (Guindon et al. 2010) with 100 bootstrap replicates. To investigate the minimal number of SNPs suitable for a population genetic analysis, subsets of 10, 100, 1000, 5000, 10 000, 15 000, 20 000, 25 000, 30 000, 35 000, 40 000 and 50 000 SNPs were randomly selected 100 times among the 60 507 SNPs free of selection, and 100 maximum-likelihood phylogenetic trees were built using the PhyML default parameters. The Robinson–Foulds distance (Robinson & Foulds 1981) was used to measure the distance between each generated phylogenetic tree and the reference tree generated for the whole set of 60 507 SNPs using RF.dist in the phangorn R library (Schliep 2011). For each subset of SNPs, the number of trees identical to the reference tree was calculated as the number with a Robinson–Foulds distance equal to 0, meaning that the two trees are identical.
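The comparison of each subsampled tree with the reference tree rests on the Robinson–Foulds distance, which counts the bipartitions found in one tree but not in the other. The sketch below illustrates the idea on unrooted topologies written as nested tuples. It is not the phangorn implementation: the tree encoding and function names are assumptions made for illustration, and branch lengths are ignored.

```python
# Illustrative sketch: Robinson-Foulds distance between two topologies.
# Trees are nested tuples of leaf names (an assumed encoding); not the phangorn code.


def leaf_set(tree):
    """Return the frozenset of leaf names under a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def bipartitions(tree):
    """Return the non-trivial splits of the tree, each stored as the side
    that does not contain the alphabetically first leaf."""
    all_leaves = leaf_set(tree)
    anchor = min(all_leaves)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        side = all_leaves - clade if anchor in clade else clade
        # Keep only splits with at least two leaves on each side.
        if 2 <= len(side) <= len(all_leaves) - 2:
            found.add(side)
        return clade

    visit(tree)
    return found


def robinson_foulds(tree_a, tree_b):
    """Number of splits present in exactly one of the two trees."""
    return len(bipartitions(tree_a) ^ bipartitions(tree_b))


def count_identical(trees, reference):
    """Number of trees whose Robinson-Foulds distance to the reference is 0."""
    return sum(1 for tree in trees if robinson_foulds(tree, reference) == 0)
```

With this encoding, `count_identical(replicate_trees, reference_tree)` would give, for one subset size, the number of replicate trees matching the reference topology (both argument names are placeholders).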

The divergence time estimates were performed with the 60 507 SNPs present in the intergenic regions free of selective pressure as described by Wicker et al. (2013). For the calculation, we assumed that all of the SNPs were present in regions that had accumulated mutations at the same rate, and we used a rate of 1.3 × 10⁻⁸ (± 2.29 × 10⁻⁹) substitutions per site per year as originally proposed by Ma & Bennetzen (2004) for rice. This choice was justified by the common use of this mutation rate for fungi, as was done by Wicker et al. (2013) to estimate the divergence among isolates of B. graminis, another ascomycete fungus. It is also in the range of mutation rates (0.09 × 10⁻⁸ to 1.67 × 10⁻⁸ substitutions per site per year) proposed by Kasuga et al. (2002) for fungi. The number of SNPs in the genomic regions free of selection pressure was used to estimate the time since the most recent common ancestor (MRCA) for all geographical accessions. Using all of the SNPs in the genomic regions free of selective pressure, a Bayesian phylogenetic analysis was conducted with 5 000 000 generations, sampling every 1000 generations, and a burn-in value of 1250 with BEAST version 1.7.5 (Drummond & Rambaut 2007) using the Hasegawa–Kishino–Yano (HKY) DNA substitution model. The estimated age of the MRCA for the tree was used to estimate the different node ages using a relaxed clock model with uncorrelated exponential prior distribution levels. The exponential relaxed clock model was chosen because it had been used previously for Tuberaceae by Bonito et al. (2013).
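The text does not spell out the formula used to convert SNP counts into a divergence time. A common molecular-clock relation, assumed here purely for illustration, is T = K / (2μ), where K is the number of differences per site between two genomes and μ is the substitution rate per site per year. In the sketch below only the rate comes from the text; the function name and the input values in the usage note are hypothetical.

```python
# Illustrative sketch: divergence time under a strict clock, T = K / (2 * mu).
# The formula is an assumption; the source only cites Wicker et al. (2013) for the method.

MUTATION_RATE = 1.3e-8  # substitutions per site per year, the rate quoted in the text


def divergence_time_years(snp_differences, sites_compared, rate=MUTATION_RATE):
    """Estimate the time since two genomes diverged, in years.

    snp_differences: number of SNPs separating the two genomes
    sites_compared:  number of sites over which the SNPs were counted
    rate:            substitutions per site per year
    """
    if sites_compared <= 0 or rate <= 0:
        raise ValueError("sites_compared and rate must be positive")
    divergence_per_site = snp_differences / sites_compared
    # The factor 2 reflects mutations accumulating along both lineages.
    return divergence_per_site / (2.0 * rate)
```

As a made-up example, 260 differences over 1 000 000 compared sites would correspond to roughly 10 000 years under this relation.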

Results and Discussion

Geographical accessions: resequencing and read mappings

Among the approximately 34–39 million reads generated from each genome, 73–80% mapped to a unique position against the reference genome using BWA (Table 2). The raw-read data were deposited in the NCBI sequence reads archive under Accession No SRP044130. The average sequencing quality was good (quality score ≥20), as illustrated in Fig. S2, and the low-quality reads were discarded from the mapping for future analyses. The read coverage throughout the genome was continuous with an average depth of approximately 20× and did not reveal any important differences between the protein-coding and repeated sequences (Fig. S3). This likely results from the stringent parameters used for the read mapping and the post-processing step in which the reads mapped at several genomic locations were eliminated. Indeed, when multiple mapping was possible, it was shown that an increased density of mapped reads correlates with the location of repeated sequences in Pyrenophora tritici-repentis (Manning et al. 2013).

Between 91% (Spain-1) and 93% (France-Als) of the France-Pro (Mel28) reference genome was covered by reads (Table 2), and the T. melanosporum core genome was estimated at approximately 110 Mbp. The small proportion of the reference genome not covered by reads corresponds in large part to repeated sequences (approximately 80%; Fig. S4). TEs, primarily gypsy retrotransposons, were over-represented in the genomic regions not covered (Fig. S4). A total of 187 of the 9952 protein-coding gene models identified in the reference genome were found in the uncovered genomic regions (Table 2), with a maximum of 160 genes found for the France-Alp geographical accession. Among these 187 genes, microarray and RNAseq expression data showed that 73 and 112, respectively, were expressed in at least one tissue (Table S1; Martin et al. 2010; Tisserant et al. 2011). The regions that were not covered by mapped reads may correspond to (i) regions absent in the resequenced genomes; (ii) regions present in the resequenced genomes, but highly polymorphic, thus preventing proper read mapping; or (iii) regions rich in repeated sequences that prevent nonambiguous mapping. Unfortunately, the sequencing strategy utilized (i.e. single-end sequencing) was not suitable to address these genomic regions in more detail, because the de novo assembly of new resequenced genomes was not possible.

The truffle is a heterothallic species harbouring one of two mating type idiomorphs (i.e. MAT1-1 or MAT1-2) in its haploid genome (Martin et al. 2010), and the MAT1-2 idiomorph is present in Scaffold 247 of the France-Pro reference genome (Rubini et al. 2011). The resequenced genomes from the geographical accessions Spain-1, France-Als and France-Alp contained reads matching the MAT1-2 idiomorph in the France-Pro reference genome (Fig. S3B), while the genomes of the geographical accessions Spain-2, France-Bur and Italy lack these sequences, suggesting they harbour the MAT1-1 idiomorph. This was confirmed by mapping the Illumina reads against the known sequence of the MAT1-1 idiomorph (data not shown). These results confirmed that either of the two mating type idiomorphs is present in the T. melanosporum haploid genome (Rubini et al. 2011).

SNP identification

The SNP calling performed with the BWA/SAMtools and CLC Genomics Workbench programs produced similar results (Fig. 1), with 93% (442 326 SNPs corresponding to 3540 SNPs/Mbp) of those called by BWA/SAMtools also being called by CLC. As proposed by Zhan et al. (2011), only SNPs called by both methods were retained. The gff file with this SNP resource can be downloaded from DRYAD (doi:10.5061/dryad.9gk52) and from our institution's website at the following link (http://mycor.nancy.inra.fr/IMGC/TuberGenome/download.php?select=anno).

Figure 1. Venn diagram of the number of single nucleotide polymorphisms called by both the Burrows-Wheeler Aligner/SAMtools and CLC Genomics Workbench methods.

A comparison of each of the T. melanosporum resequenced genomes to the France-Pro reference genome identified between 108 112 and 198 788 SNPs, with the density ranging from 865 to 1591 SNPs/Mbp for the geographical accessions France-Als and Spain-2, respectively (Table 3). According to Fumagalli et al. (2013), SNPs identified using high-throughput sequencing technologies should be considered with caution, particularly when the sequencing depth is low (<10×). Here, the SNPs retained for analyses were identified using two different programs; together with the stringent mapping that limited the multiple mapping of reads and the 10× read depth required for SNP identification, this likely limited spurious SNPs (Li et al. 2009; Fumagalli et al. 2013). However, additional whole-genome sequencing or the targeted sequencing of SNP-rich regions will be conducted to experimentally confirm the existence of these in silico SNPs.

Table 3. The number of single nucleotide polymorphisms (SNPs) and their distribution in the different genomic regions

Samples	Introns (number)	Introns (SNP/Mbp)	Exons (number)	Exons (SNP/Mbp)	Untranslated regions, UTRs (number)	UTRs (SNP/Mbp)	Repeated sequences^a (number)	Repeated sequences^a (SNP/Mbp)	Other genomic regions (number)	Other genomic regions (SNP/Mbp)	Total (number)	Total (SNP/Mbp)
Spain-1	2727	354	1816	155	102	179	139 824	1961	22 550	670	167 019	1337
Spain-2	3144	409	2050	175	119	209	165 981	2327	27 494	817	198 788	1591
France-Bur	2569	334	1529	131	84	148	110 806	1554	18 295	544	133 283	1067
France-Als	1948	253	1282	110	69	121	89 413	1254	15 400	458	108 112	865
Italy	2288	297	1478	126	91	160	106 570	1494	16 570	492	126 997	1016
France-Alp	1952	254	1332	114	77	135	98 933	1387	15 363	456	117 657	942
Total^b	6795	883	4501	385	252	443	374 268	5248	56 510	1679	442 326	3540

^a Repeated sequences comprising known transposable elements and uncategorized elements.
^b The total number of SNPs excluding redundancy.
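The SNP densities in Table 3 are counts divided by the size of the region considered, expressed in Mbp. The sketch below reproduces the genome-wide totals approximately. The 125 Mbp denominator is an assumption inferred from the ratio of the reported counts to the reported densities; it is not stated in the text, and the function name is hypothetical.

```python
# Illustrative sketch: SNP density in SNPs per Mbp.
# ASSUMED_GENOME_MBP is inferred from the table ratios; it is not given in the text.

ASSUMED_GENOME_MBP = 125.0

# Total SNP counts per sample, copied from the last column pair of Table 3.
TOTAL_SNPS = {
    "Spain-1": 167_019,
    "Spain-2": 198_788,
    "France-Bur": 133_283,
    "France-Als": 108_112,
    "Italy": 126_997,
    "France-Alp": 117_657,
    "All genomes": 442_326,
}


def snp_density(snp_count, region_mbp=ASSUMED_GENOME_MBP):
    """Return the number of SNPs per Mbp for a region of the given size."""
    if region_mbp <= 0:
        raise ValueError("region_mbp must be positive")
    return snp_count / region_mbp


if __name__ == "__main__":
    for sample, count in TOTAL_SNPS.items():
        print(f"{sample}: about {snp_density(count):.0f} SNPs/Mbp")
```

The same function applies to individual compartments (introns, exons, repeats) once the size of each compartment is known.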

The SNP density in filamentous fungi varied from 291 to 14 005 SNPs/Mbp for the Fusarium graminearum (Cuomo et al. 2007) and Rhizoctonia solani (Hane et al. 2014) geographical accessions, respectively (Table 4). The differences in the nucleotide polymorphism levels observed could reflect differences in the demographic history of these species (e.g. reduction of polymorphism due to population bottlenecks), evolutionary trends related to the lifestyles of these fungi (e.g. for pathogenic species) as well as their respective ratios between sexual and asexual reproduction. T. melanosporum, with 3540 SNPs/Mbp, has a genetic diversity level in the lower range for filamentous fungi (Table 4), although we are aware that the parameters used to call the SNPs and the number of samples differed between the studies.

Table 4. Polymorphism levels estimated by whole-genome resequencing in filamentous fungi

Species	Phylum	Number of sequenced strains	SNPs/Mbp	Reference
Blumeria graminis	Ascomycota	2	1000	Hacquard et al. (2013)
Coccidioides immitis	Ascomycota	10	5251	Neafsey et al. (2010)
Coccidioides posadasii	Ascomycota	10	9227	Neafsey et al. (2010)
Fusarium graminearum	Ascomycota	2	291	Cuomo et al. (2007)
Leptographium longiclavatum	Ascomycota	71	975	Ojeda et al. (2014)
Neurospora crassa	Ascomycota	48	3375	Ellison et al. (2011)
Tuber melanosporum	Ascomycota	7	3540	This study
Lentinula edodes	Basidiomycota	2	4629	Au et al. (2013)
Melampsora larici-populina	Basidiomycota	15	6051	Persoons et al. (2014)
Puccinia graminis	Basidiomycota	1^a	1843	Duplessis et al. (2011)
Puccinia striiformis	Basidiomycota	1^a	5980	Cantu et al. (2013)
Rhizoctonia solani	Basidiomycota	2	14005	Hane et al. (2014)
Rhizophagus irregularis	Glomeromycota	6	321	Lin et al. (2014)

^a For these species, the SNPs were identified in one dikaryotic strain.

SNPs are not distributed equally in the genome

The polymorphism index (π; Nei & Li 1979) was calculated along the genome in 10-kb sliding windows, which showed that some genomic regions are more polymorphic than others (Fig. 2). Indeed, the SNPs were not distributed equally in the genome, and as expected, they occurred more frequently in repeated sequences than in protein-coding genes (Table 3; Figs. 2 and S5). Most of the SNPs were found in repeated sequences (84.6%) that represented 57.7% of the T. melanosporum genome (Martin et al. 2010). The SNPs were more frequent in gypsy retrotransposons than in DNA transposons (Fig. S5). A bias in the SNP distribution was also observed in the F. graminearum genome, where 50% of the SNPs were present in 13% of the genome (Cuomo et al. 2007). Large blocks of regions rich in SNPs were also identified in the B. graminis genome (Hacquard et al. 2013; Wicker et al. 2013) and in the poplar leaf rust Melampsora larici-populina, in which a large portion of the variants were identified in coding sequences (Persoons et al. 2014).
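
As an illustration of the windowed π calculation, the Python sketch below computes the mean number of pairwise differences per site in successive windows of a set of aligned haploid sequences. It is not the authors' script: the function name, the toy alignment and the default of non-overlapping windows are assumptions, and missing data are not handled.

```python
from itertools import combinations


def windowed_pi(seqs, window, step=None):
    """Nucleotide diversity (pi) per window for aligned haploid sequences.

    seqs   : list of equal-length strings (one per accession)
    window : window size in bases
    step   : offset between window starts; defaults to `window`
             (non-overlapping), an assumption made for this sketch
    Returns a list of (window_start, pi) tuples.
    """
    if len(seqs) < 2:
        raise ValueError("at least two sequences are required")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned (equal length)")
    step = step or window
    pairs = list(combinations(range(len(seqs)), 2))
    results = []
    for start in range(0, length - window + 1, step):
        end = start + window
        # Count differences over all sequence pairs inside the window.
        diffs = sum(
            sum(a != b for a, b in zip(seqs[i][start:end], seqs[j][start:end]))
            for i, j in pairs
        )
        # Average over pairs, then normalise by window length.
        results.append((start, diffs / len(pairs) / window))
    return results


if __name__ == "__main__":
    toy = ["ACGTACGT", "ACGTACGA", "ACGTACGA"]
    for start, pi in windowed_pi(toy, window=4):
        print(start, round(pi, 4))
```

For a genome scan, `window` would be set to 10 000 to match the 10-kb windows described above, with one sequence per accession for each scaffold.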

Several mechanisms are known to inactivate transposons in filamentous fungi (Murat et al. 2013a), and some such as repeat-induced point mutations (RIPs) introduce mutations in these sequences (Selker et al. 1987). While genes involved in RIPs were not identified in the T. melanosporum genome (Martin et al. 2010), a strong preference for transitions in the CpG dinucleotide was observed by Clutterbuck (2011). Recently, Montanini et al. (2014) found that the methylation pattern in T. melanosporum selectively targets TEs rather than genes, and their results strongly favour methylation induced premeiotically (MIP) as the process responsible for TE silencing in T. melanosporum. Interestingly, MIP can increase the mutation rate of the methylated cytosines, as documented for mammalian DNA (Kricker et al. 1992). The SNPs were more frequently found in gypsy retrotransposons. Interestingly, these elements colonized the T. melanosporum genome several million years ago (Martin et al. 2010), and their SNP richness can be explained by their old age, as SNPs in these regions tend to accumulate due to DNA decay (Lisch & Bennetzen 2011). The mapping of reads in multiple locations was low (approximately 6%; Table 2), although almost 60% of the T. melanosporum genome corresponds to repeated sequences, suggesting that the different TE copies are not conserved.

SNPs in gene models

A total of 903 protein-coding genes presented with more than two SNPs in their untranslated regions (UTRs), introns and/or exons. Among these, 742 had SNPs in their coding regions, including 584 nonsynonymous mutations (Tables S1 and S2). The 20 gene models with the highest number of SNPs (>10) in their coding regions are shown in Table S3. Most have sequence similarities in the DNA databases, and four are paralogues coding for the same HET-E-1 protein. Five genes were not expressed in any tissue, although in comparison with the free-living mycelium, four were upregulated in fruiting bodies (coding for an alpha-/beta-glucosidase, an alpha-glucosidase 2, a methylene-tetrahydrofolate reductase 2 and an ankyrin repeat-containing protein), as were three in the ectomycorrhizal root tips (coding for a vegetative incompatibility protein HET-E-1, an alpha/beta-glucosidase and an alpha-glucosidase 2). The same putative alpha-/beta-glucosidase and alpha-glucosidase 2 were upregulated in both tissues (Table S3). For the genes with SNPs, no enrichment of specific metabolic pathways was detected (data not shown).

When compared with the gene contents of the 30 other ectomycorrhizal genomes sequenced under the framework of the Mycorrhizal Genome Initiative (Martin et al. 2011), T. melanosporum has a restricted gene content (9952 protein-coding genes). Indeed, the number of gene models per species ranged from 30 282 for Rhizophagus irregularis down to 9952 for T. melanosporum (http://genome.jgi-psf.org/Mycorrhizal_fungi/Mycorrhizal_fungi.info.html?core=genome&query=%22groups:Mycorrhizal_fungi%22&searchType=Keyword). Thus, T. melanosporum genes may have experienced purifying selection at a higher rate in comparison with the species with larger gene repertoires and a higher number of gene families (i.e. functional redundancy). In T. melanosporum, 2.6% of the SNPs were found in protein-coding genes and 1% in coding sequences, which was less than found in B. graminis and the Coccidioides spp. Wicker et al. (2013) found between 3.7% and 3.9% of the SNPs in the coding regions for B. graminis, and Neafsey et al. (2010) identified 33–36% of the SNPs in genes (UTRs, introns and exons) when they compared the genomes of C. immitis and C. posadasii. This suggests that the limited gene repertoire of T. melanosporum is associated with higher functional constraints and consequently presents a lower rate of genetic variation.

Detecting selection pressure

Two approaches were used to identify selection signatures. First, the rates of nonsynonymous mutations per nonsynonymous site (pN) and synonymous mutations per synonymous site (pS) were calculated for the 119 genes with five or more SNPs. Of those, 18 genes had a pN/pS ratio >1 and 9 had only nonsynonymous mutations (20 expressed in at least one tissue), suggesting they were under positive selection. On the other hand, 78 genes had a pN/pS ratio <1 (68 expressed in at least one tissue), suggesting they were under purifying selection (Tables S4 and S5). The second approach relies on the Tajima's D statistic computed on either the gene models or using a whole-genome sliding window. Here, we considered values >+2 and lower than −2 as significant. When calculated for the gene models, only four had a Tajima's D value >+2, suggesting the existence of a signature for balancing or positive selection (Table S1). These genes, coding for the vegetative incompatibility protein HET-E-1, an NADH-ubiquinone oxidoreductase, a vacuolar protein and a protein kinase, were expressed in the different tissues (Table S1). When the Tajima's D statistic was calculated by scanning the whole genome along a sliding window, 36 genomic regions with a Tajima's D value >+2 were identified (Fig. 2). Thirty-one gene models were present in these 36 genomic regions (Table S6), including the previously identified vegetative incompatibility protein HET-E-1.
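
For readers unfamiliar with the statistic, the Python sketch below implements Tajima's D from the standard formulas of Tajima (1989). It takes the sample size, the number of segregating sites and the mean number of pairwise differences. It is a minimal illustration, not the authors' pipeline; the function names are hypothetical, and the helper that summarises an alignment assumes complete, aligned haploid sequences.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(n, seg_sites, mean_pairwise_diff):
    """Tajima's D for n sequences, S segregating sites and mean pairwise
    difference k (Tajima 1989). Returns NaN when there are no
    segregating sites."""
    if n < 4:
        raise ValueError("the variance term is degenerate for n < 4")
    if seg_sites == 0:
        return float("nan")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n**2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (mean_pairwise_diff - seg_sites / a1) / sqrt(variance)


def summarize(seqs):
    """Return (n, S, k) for a list of aligned haploid sequences."""
    n = len(seqs)
    seg_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    pairs = list(combinations(seqs, 2))
    k = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    return n, seg_sites, k


if __name__ == "__main__":
    # Toy alignment: one site split 2/2 among four sequences.
    print(tajimas_d(*summarize(["A", "A", "T", "T"])))
```

Intermediate-frequency variants, as in the toy alignment, push D above zero, whereas an excess of singletons pushes it below zero. This is why values above +2 are read above as possible signatures of balancing selection or of demographic effects.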

The positive Tajima's D values can result not only from balancing selection, but also from population structure and moderately intense bottlenecks (i.e. a reduction in the size of the population; Biswas & Akey 2006). In addition to significant population structure effects (Murat et al. 2004; Riccioni et al. 2008; García-Cunchillos et al. 2014), a population bottleneck due to the last glaciation has also been proposed for T. melanosporum (Bertault et al. 1998). The six T. melanosporum geographical accessions were harvested from different populations, as demonstrated by the phylogeographical analysis (see below). Therefore, we cannot exclude that the high positive Tajima's D values observed resulted from a population bottleneck and/or population structure rather than from balancing selection. These results are preliminary and need to be confirmed by sequencing a larger number of genomes, but they open the way for future investigations of truffle adaptation to environmental stresses.

Phylogeography and divergence time among geographical accessions

The genomic regions putatively free of selection covered 36.6 Mbp, for a total of 60 507 SNPs. This set of SNPs can be downloaded from DRYAD (doi:10.5061/dryad.9gk52) and from our institution's website (http://mycor.nancy.inra.fr/IMGC/TuberGenome/download.php?select=anno).

The unrooted maximum-likelihood phylogenetic tree clustered together samples according to their geographical origin with a cluster comprised of the northern France samples (France-Als and France-Bur), a cluster grouping samples from south-eastern France (France-Alp and France-Pro) and Italy, and another cluster with the Spanish samples (Fig. S1B). This ability to identify the geographical origin of truffles harvested in natural populations using SNPs is currently being used to design diagnostic SNP arrays for geographical certification. As highlighted by Davey et al. (2011), genotyping SNPs across targeted populations is now facilitated by the advent of high-throughput SNP arrays. Indeed, depending on the sample size and the number of SNPs to be analysed, medium- to high-throughput technologies are available such as the competitive allele-specific PCR (KASPar) assay from KBiosciences (Hertfordshire, UK; http://www.kbioscience.co.uk) or the Affymetrix Axiom SNP microarrays. The KASPar assay is commonly used for genotyping up to 1000–2000 SNPs, while the Axiom SNP microarrays allow genotyping from 1500 to several million SNPs. As genotyping with KASPar can be three times less expensive than with the Axiom SNP microarray (Charles Poncet, INRA Gentyane Platform, personal communication), the minimum number of SNPs required for a population genetic analysis was investigated. We found that a minimum of 30 000 SNPs is required for all of the maximum-likelihood trees to be identical to the reference tree produced with the 60 507 SNPs free of selection (Fig. 3). We are thus developing an array based on the 60 507 SNPs for analysing the population genetic structure throughout the natural regions of T. melanosporum production.

Using the mutation rate of 1.3 × 10⁻⁸ (± 2.29 × 10⁻⁹) substitutions per site per year (Ma & Bennetzen 2004; Wicker et al. 2013), we estimated that the 60 507 mutations had accumulated between 107 703 and 154 763 years ago (131 128 ± 23 098 years). These times were used to set the estimated time of the MRCA for the Bayesian phylogenetic reconstruction generated with the 60 507 SNPs free of selection and a relaxed molecular clock. This Bayesian reconstruction clustered the French and the Italian samples together, while the Spanish samples separated earlier (Fig. 4). The Bayesian and maximum-likelihood phylogenies exhibit a single difference in their topology: the France-Pro, France-Alp and Italian samples form a monophyletic cluster in the maximum-likelihood phylogeny, but are paraphyletic in the Bayesian phylogeny (Fig. S6). This could be explained not only by the different methods used in the analyses (Bayesian versus maximum-likelihood), but also by the fact that one phylogeny is time-dependent (relaxed molecular clock in Bayesian), while the other is time-independent. That time-dependent and time-independent phylogenies are not always in agreement has been previously discussed (Drummond et al. 2006). The phylogenetic signal is also likely to be inconsistent across the genome due to the historical proximity of the samples, which increases the chances of finding different trees with different methods. For outbreeding species, such as T. melanosporum, the phylogenetic signal obtained with SNPs could be weakened by population genetic processes such as recombination and gene flow. However, both topologies are consistent with the geography: in the Bayesian phylogeny, the French south-eastern samples appear as intermediates between the northern French and the Italian and Spanish samples, while they appear as a separate group in the maximum-likelihood topology (Fig. S6). A further characterization of the overall structure of the T. melanosporum populations could be performed using population genetics methods, but they require a larger number of samples to be powerful.
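
The text does not spell out how the accumulation time was derived from the mutation rate. One plausible reading, sketched below in Python, divides the number of SNPs per site in the 36.6 Mbp of regions putatively free of selection by the substitution rate. This is an assumption, not the authors' documented procedure, and the function name is illustrative. With the rounded inputs quoted in the text, the sketch gives values close to, but not identical to, the reported range of 107 703 to 154 763 years.

```python
def accumulation_years(n_snps, n_sites, rate_per_site_per_year):
    """Years needed to accumulate n_snps over n_sites at a constant rate.

    Assumes time = (SNPs per site) / rate; this formula is an
    interpretation of the text, not a method stated by the authors.
    """
    return (n_snps / n_sites) / rate_per_site_per_year


N_SNPS = 60507      # SNPs in regions putatively free of selection
N_SITES = 36.6e6    # 36.6 Mbp, as rounded in the text
RATE = 1.3e-8       # substitutions per site per year
RATE_SD = 2.29e-9   # uncertainty on the rate

central = accumulation_years(N_SNPS, N_SITES, RATE)
lower = accumulation_years(N_SNPS, N_SITES, RATE + RATE_SD)  # faster clock
upper = accumulation_years(N_SNPS, N_SITES, RATE - RATE_SD)  # slower clock

if __name__ == "__main__":
    print(f"central estimate: {central:,.0f} years")
    print(f"range: {lower:,.0f} to {upper:,.0f} years")
```

The small gap between these values and the published ones may simply reflect rounding of the 36.6 Mbp figure, but this cannot be confirmed from the text alone.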

The Bayesian reconstruction suggested that the Spanish samples separated earlier than the French and Italian samples (Fig. 4). Time calibrations and date estimations should be considered with caution, especially for studies without fossil data and with incomplete taxon sampling. Thus, we are aware that we cannot use the absolute dates we obtained, but only relative estimates. However, it is highly probable that the MRCA predates the last glaciation (120 000 to 11 000 years ago; Van Andel & Tzedakis 1996). While our sampling is not sufficient on its own to definitively describe the history of T. melanosporum following the last glaciation, this preliminary analysis paves the way for future analyses using the current SNP resource, as we have proposed.

Conclusions

Today, T. melanosporum is primarily harvested in truffle orchards in France using tree seedlings that have been inoculated with truffles in greenhouses (Olivier et al. 2012). Up to now, the selection of geographically defined sources of truffle inoculum has not been considered for plantations. Interestingly, the Aquitaine regional truffle growers' federation has initiated the production of inoculated plants using truffles sampled from specific natural populations that appear to be better adapted to drought or frost (P. Rejou, personal communication). To date, the selection of these truffle populations has been empirical and based only on field observations; this approach could now be validated by genotyping the truffle inocula. Moreover, it is now recognized that truffle aroma has, at least in part, a genetic origin (Martin et al. 2010). Investigating putative associations between SNP markers and phenotypic traits (such as particular aroma or stress tolerance) can now be contemplated thanks to the current SNP resource, which is the first step towards a marker-assisted selection of the fungal inocula used by truffle growers.

Acknowledgements

The UMR1136 is supported by a grant overseen by the French National Research Agency (ANR) as part of the Investments for the Future Programme (ANR-11-LABX-0002-01, Lab of Excellence ARBRE). This study benefited from ANR SYSTERRA SYSTRUF (ANR-09-STRA-10). Thibaut Payen's PhD scholarship is cofunded by the Lorraine Region and the European Commission through the EcoFINDERS project (FP7-264465). We would like to thank Francesco Paolocci, Bernard Vonfli, Mario Honrubia, Luc Bernard and Henri Frochot for providing the samples analysed in this study. We also would like to thank Sébastien Duplessis and François Le Tacon for their constructive advice and helpful discussions. Finally, the authors would like to thank the team of American Manuscript Editors for the language and style editing of the manuscript.

Supporting Information

men12391-sup-0001-FigS1-S6.docx (Word document, 6.6 MB)

Fig. S1 (A) Geographical localization of the seven T. melanosporum geographic accessions generated with Google Earth (http://www.google.fr/intl/fr/earth/).

Fig. S2 Quality of the raw reads for the six re-sequenced genomes.

Fig. S3 Mapping the reads for the six re-sequenced genomes against France-Pro (reference genome) for (A) Scaffold 1 and (B) Scaffold 247.

Fig. S4 Percentage of each France-Pro genomic region without mapped reads.

Fig. S5 Single nucleotide polymorphisms distribution in the different genomic regions.

Fig. S6 Topology comparison of the Bayesian (left) and maximum-likelihood (right) phylogenies.

men12391-sup-0002-TableS1-S6.xls (MS Excel, 6.6 MB)

Table S1 Summary of the number of polymorphisms (SNPs) and the different diversity and adaptation indices in the 9952 T. melanosporum protein-coding gene models, as well as their expression levels from RNAseq (Tisserant et al. 2011) and NimbleGen microarrays in the ectomycorrhizae (ECM), fruiting bodies (FB), and free-living mycelium (FLM; Martin et al. 2010).

Table S2 The 742 gene models that have more than two SNPs in their untranslated region (UTR), exon, and intron sequences and at least one SNP in their exon sequence.

Table S3 The 20 protein-coding genes with the highest number of SNPs in their coding regions.

Table S4 The gene models putatively under positive selection (i.e. ratio of non-synonymous to synonymous SNPs (pN/pS) above 1).

Table S5 The gene models putatively under purifying selection (i.e. ratio of non-synonymous to synonymous SNPs (pN/pS) below 1).

Table S6 The 31 gene models present in the genomic regions exhibiting a Tajima's D statistic >2.


References

Altschul S (1990) Basic local alignment search tool. Journal of Molecular Biology, 215, 403–410.
10.1016/S0022-2836(05)80360-2
Au CH, Cheung MK, Wong MC et al. (2013) Rapid genotyping by low-coverage resequencing to construct genetic linkage maps of fungi: a case study in Lentinula edodes. BMC Research Notes, 6, 307.
10.1186/1756-0500-6-307
Bertault G, Raymond M, Berthomieu A et al. (1998) Trifling variation in truffles. Nature, 394, 734.
10.1038/29428
Biswas S, Akey JM (2006) Genomics insights into positive selection. TRENDS in Genetics, 22, 437–446.
10.1016/j.tig.2006.06.005
Bonito G, Smith ME, Nowak M et al. (2013) Historical biogeography and diversification of truffles in the tuberaceae and their newly identified southern hemisphere sister lineage. PLoS One, 8, e52765.
10.1371/journal.pone.0052765
Brumfield RT, Beerli P, Nickerson DA, Edwards SV (2003) The utility of single nucleotide polymorphisms in inferences of population history. Trends in Ecology & Evolution, 18, 249–256.
10.1016/S0169-5347(03)00018-1
Cantu D, Segovia V, MacLean D et al. (2013) Genome analyses of the wheat yellow (stripe) rust pathogen Puccinia striiformis f. sp. tritici reveal polymorphic and haustorial expressed secreted proteins as candidate effectors. BMC Genomics, 14, 270.
10.1186/1471-2164-14-270
Carlson CS, Thomas DJ, Eberle MA et al. (2005) Genomic regions exhibiting positive selection identified from dense genotype data. Genome Research, 15, 1553–1565.
10.1101/gr.4326505
Clutterbuck JA (2011) Genomic evidence of repeat-induced point mutation (RIP) in filamentous ascomycetes. Fungal Genetics and Biology, 48, 306–326.
10.1016/j.fgb.2010.09.002
Cuomo CA, Güldener U, Xu J-R et al. (2007) The Fusarium graminearum genome reveals a link between localized polymorphism and pathogen specialization. Science, 317, 1400–1402.
10.1126/science.1143708
Davey JW, Hohenlohe PA, Etter PD et al. (2011) Genome-wide genetic marker discovery and genotyping using next-generation sequencing. Nature Reviews Genetics, 12, 499–510.
10.1038/nrg3012
De Mita S, Siol M (2012) EggLib: processing, analysis and simulation tools for population genetics and genomics. BMC genetics, 13, 27.
10.1186/1471-2156-13-27
Drummond AJ, Rambaut A (2007) BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evolutionary Biology, 7, 214.
10.1186/1471-2148-7-214
Drummond AJ, Ho SY, Phillips MJ et al. (2006) Relaxed phylogenetics and dating with confidence. PLoS biology, 4, e88.
10.1371/journal.pbio.0040088
Duplessis S, Cuomo CA, Lin Y-C et al. (2011) Obligate biotrophy features unraveled by the genomic analysis of rust fungi. Proceedings of the National Academy of Sciences, 108, 9166–9171.
10.1073/pnas.1019315108
Eddy SR (2011) Accelerated profile HMM searches. PLoS Computational Biology, 7, e1002195.
10.1371/journal.pcbi.1002195
Ellison CE, Hall C, Kowbel D et al. (2011) Population genomics and local adaptation in wild isolates of a model microbial eukaryote. Proceedings of the National Academy of Sciences of the United States of America, 108, 2831–2836.
10.1073/pnas.1014971108
Fumagalli M, Vieira FG, Korneliussen TS et al. (2013) Quantifying population genetic differentiation from next-generation sequencing data. Genetics, 195, 979–992.
10.1534/genetics.113.154740
Ganal MW, Altmann T, Röder MS (2009) SNP identification in crop plants. Current opinion in plant biology, 12, 211–217.
10.1016/j.pbi.2008.12.009
García-Cunchillos I, Sánchez S, Barriuso JJ, Pérez-Collazos E (2014) Population genetics of the westernmost distribution of the glaciations-surviving black truffle Tuber melanosporum. Mycorrhiza, 24, 89–100.
10.1007/s00572-013-0540-9
Guindon S, Dufayard JF, Lefort V et al. (2010) New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Systematic Biology, 59, 307–321.
10.1093/sysbio/syq010
Hacquard S, Kracher B, Maekawa T et al. (2013) Mosaic genome structure of the barley powdery mildew pathogen and conservation of transcriptional programs in divergent hosts. Proceedings of the National Academy of Sciences, 110, E2219–E2228.
10.1073/pnas.1306807110
Hane JK, Anderson JP, Williams AH et al. (2014) Genome sequencing and comparative genomics of the broad host-range pathogen Rhizoctonia solani AG8. PLoS Genetics, 10, e1004281.
10.1371/journal.pgen.1004281
Jurka J, Kapitonov VV, Pavlicek A et al. (2005) Repbase Update, a database of eukaryotic repetitive elements. Cytogenetic and Genome Research, 110, 462–467.
10.1159/000084979
Kasuga T, White TJ, Taylor JW (2002) Estimation of nucleotide substitution rates in eurotiomycete fungi. Molecular Biology and Evolution, 19, 2318–2324.
10.1093/oxfordjournals.molbev.a004056
Kricker MC, Drake JW, Radman M (1992) Duplication-targeted DNA methylation and mutagenesis in the evolution of eukaryotic chromosomes. Proceedings of the National Academy of Sciences, 89, 1075–1079.
10.1073/pnas.89.3.1075
Kües U, Martin F (2011) On the road to understanding truffles in the underground. Fungal genetics and biology, 48, 555–560.
10.1016/j.fgb.2011.02.002
Li H, Durbin R (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics (Oxford, England), 25, 1754–1760.
10.1093/bioinformatics/btp324
Li H, Handsaker B, Wysoker A et al. (2009) The sequence alignment/map format and SAMtools. Bioinformatics (Oxford, England), 25, 2078–2079.
10.1093/bioinformatics/btp352
Lin K, Limpens E, Zhang Z et al. (2014) Single nucleus genome sequencing reveals high similarity among nuclei of an endomycorrhizal fungus. PLoS Genetics, 10, e1004078.
10.1371/journal.pgen.1004078
Lisch D, Bennetzen JL (2011) Transposable element origins of epigenetic gene regulation. Current Opinion in Plant Biology, 14, 156–161.
10.1016/j.pbi.2011.01.003
Ma J, Bennetzen JL (2004) Rapid recent growth and divergence of rice nuclear genomes. Proceedings of the National Academy of Sciences of the United States of America, 101, 12404–12410.
10.1073/pnas.0403715101
Manning VA, Pandelova I, Dhillon B et al. (2013) Comparative genomics of a plant-pathogenic fungus, Pyrenophora tritici-repentis, reveals transduplication and the impact of repeat elements on pathogenicity and population divergence. G3: Genes|Genomes|Genetics, 3, 41–63.
10.1534/g3.112.004044
Martin F, Kohler A, Murat C et al. (2010) Périgord black truffle genome uncovers evolutionary origins and mechanisms of symbiosis. Nature, 464, 1033–1038.
10.1038/nature08867
Martin F, Cullen D, Hibbett D et al. (2011) Sequencing the fungal tree of life. New Phytologist, 190, 818–821.
10.1111/j.1469-8137.2011.03688.x
Montanini B, Chen PY, Morselli M et al. (2014) Non-exhaustive DNA methylation-mediated transposon silencing in the black truffle genome, a complex fungal genome with massive repeat element content. Genome biology, 15, 411.
10.1186/s13059-014-0411-5
Murat C, Díez J, Luis P et al. (2004) Polymorphism at the ribosomal DNA ITS and its relation to postglacial re-colonization routes of the Perigord truffle Tuber melanosporum. New Phytologist, 164, 401–411.
10.1111/j.1469-8137.2004.01189.x
Murat C, Payen T, Petitpierre D, Labbé J (2013a) Repeated elements in filamentous fungi with a focus on wood-decay fungi. In: The Ecological Genomics of Fungi (ed. F Martin), pp. 21–40. John Wiley & Sons, Inc, Ames, Iowa.
10.1002/9781118735893.ch2
Murat C, Rubini A, Riccioni C et al. (2013b) Fine-scale spatial genetic structure of the black truffle (Tuber melanosporum) investigated with neutral microsatellites and functional mating type genes. New Phytologist, 199, 176–187.
10.1111/nph.12264
Neafsey DE, Barker BM, Sharpton TJ et al. (2010) Population genomic sequencing of Coccidioides fungi reveals recent hybridization and transposon control. Genome Research, 20, 938–946.
10.1101/gr.103911.109
Nei M, Li WH (1979) Mathematical model for studying genetic variation in terms of restriction endonucleases. Proceedings of the National Academy of Sciences, 76, 5269–5273.
10.1073/pnas.76.10.5269
Ojeda DI, Dhillon B, Tsui CKM, Hamelin RC (2014) Single-nucleotide polymorphism discovery in Leptographium longiclavatum, a mountain pine beetle-associated symbiotic fungus, using whole-genome resequencing. Molecular Ecology Resources, 14, 401–410.
10.1111/1755-0998.12191
Olivier JM, Savignac JC, Sourzat P (2012) Truffe et Trufficulture. Fanlac, Périgueux.
Persoons A, Morin E, Delaruelle C et al. (2014) Patterns of genomic variation in the poplar rust fungus Melampsora larici-populina identify pathogenesis-related factors. Frontiers in Plant Science, 5, 450.
10.3389/fpls.2014.00450
Riccioni C, Belfiori B, Rubini A et al. (2008) Tuber melanosporum outcrosses: analysis of the genetic diversity within and among its natural populations under this new scenario. New Phytologist, 180, 466–478.
10.1111/j.1469-8137.2008.02560.x
Robinson DF, Foulds LR (1981) Comparison of phylogenetic trees. Mathematical Biosciences, 53, 131–147.
10.1016/0025-5564(81)90043-2
Rubini A, Belfiori B, Riccioni C et al. (2011) Isolation and characterization of MAT genes in the symbiotic ascomycete Tuber melanosporum. New phytologist, 189, 710–722.
10.1111/j.1469-8137.2010.03492.x
Schliep KP (2011) Phangorn: phylogenetic analysis in R. Bioinformatics, 27, 592–593.
10.1093/bioinformatics/btq706
Selker EU, Cambareri EB, Jensen BC, Haack KR (1987) Rearrangement of duplicated DNA in specialized cells of Neurospora. Cell, 51, 741–752.
10.1016/0092-8674(87)90097-3
Smith SE, Read DJ (2010) Mycorrhizal Symbiosis. Academic Press, London.
Tajima F (1989) Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics, 123, 585–595.
10.1093/genetics/123.3.585
Tisserant E, Da Silva C, Kohler A et al. (2011) Deep RNA sequencing improved the structural annotation of the Tuber melanosporum transcriptome. New Phytologist, 189, 883–891.
10.1111/j.1469-8137.2010.03597.x
Van Andel TH, Tzedakis PC (1996) Palaeolithic landscapes of Europe and environs, 150,000–25,000 years ago: an overview. Quaternary Science Reviews, 15, 481–500.
Watterson GA (1975) On the number of segregating sites in genetical models without recombination. Theoretical Population Biology, 7, 256–276.
10.1016/0040-5809(75)90020-9
Weedall GD, Conway DJ (2010) Detecting signatures of balancing selection to identify targets of anti-parasite immunity. Trends in Parasitology, 26, 363–369.
10.1016/j.pt.2010.04.002
Wicker T, Oberhaensli S, Parlange F et al. (2013) The wheat powdery mildew genome shows the unique evolution of an obligate biotroph. Nature Genetics, 45, 1092–1096.
10.1038/ng.2704
Zhan B, Fadista J, Thomsen B et al. (2011) Global assessment of genomic variation in cattle by genome resequencing and high-throughput genotyping. BMC Genomics, 12, 557.
10.1186/1471-2164-12-557

F.M. and C.M. designed the project. C.M. extracted the DNA, and C.M., T.P., A.G. and E.M. contributed to the bioinformatics analyses. T.P. and S.D.M. performed the selection analyses. C.M., T.P., S.D.M. and F.M. wrote the manuscript.

Data accessibility

Data sequences: the raw sequence data generated in this study were deposited in the NCBI short reads archive under Accession No SRP044130.

SNP data: the GFF file with all 442 326 SNPs and the 60 507 SNPs free of selection can be downloaded from DRYAD (doi:10.5061/dryad.9gk52) and from our institution's website (http://mycor.nancy.inra.fr/IMGC/TuberGenome/download.php?select=anno).

Phylogenetic trees: the nexus and xml files used as input in BEAST as well as the newick files corresponding to the Bayesian and the maximum-likelihood phylogenetic reconstruction are available in DRYAD (doi:10.5061/dryad.9gk52).

Bioinformatic scripts: the python scripts used in this study are available at the INRA Tuber genome portal using the following link (http://mycor.nancy.inra.fr/IMGC/TuberGenome/download.php?select=anno).

Volume 15, Issue 5, September 2015, pages 1243-1255
