Volume 30, Issue 16 pp. 3930-3947
ORIGINAL ARTICLE

The demographic history of Castanea sativa Mill. in southwest Europe: A natural population structure modified by translocations

Josefa Fernández-López

Corresponding Author

Forest Research Centre Lourizán, AGACAL, Xunta de Galicia, Pontevedra, Spain

Correspondence

Josefa Fernández-López, Forest Research Centre Lourizán, AGACAL, Xunta de Galicia, Pontevedra 36143, Spain.

Email: [email protected]

Javier Fernández-Cruz

Forest Research Centre Lourizán, AGACAL, Xunta de Galicia, Pontevedra, Spain

Beatriz Míguez-Soto

Forest Research Centre Lourizán, AGACAL, Xunta de Galicia, Pontevedra, Spain

First published: 08 June 2021
Citations: 14

Abstract

In domesticated species, translocation of materials can alter natural demographic patterns; this may have occurred in Castanea sativa (European chestnut), a species conserved in several refuges, with a long domestication history for nut production. Bayesian analysis of population genetic structure in marginal areas and in the centre of the C. sativa range, considering spatial information and making corrections for unbalanced sample sizes, allowed visualization of a genetic structure that was subsequently analysed by approximate Bayesian computation to assess its natural demographic history and test the origin of some hypothetical translocated germplasm. We obtained evidence of C. sativa population contraction during the earliest Pleistocene, resulting in a split into eastern (Greek) and western (Italian and Iberian) populations. The northern Iberian population, in the Eurosiberian area, is one of the vestiges that remained after the contraction that split the global Tertiary population. A secondary encounter occurred from the Middle to Upper Pleistocene, which explains the hybrid origin of the Western Mediterranean population present in Italy and in the centre and south of the Iberian Peninsula. The analyses indicate that a germplasm translocation from Italy to the Central Iberian Range may have occurred. Recent translocations have hybridized with the local northern Iberian population, producing naturalized populations with high diversity. The populations of C. sativa in southwestern Europe have a genetic structure compatible with a natural origin, in which signs of population contractions and expansions caused by climatic oscillations since the Late Miocene have been imprinted.

1 INTRODUCTION

The current genetic population structure of species conserved in southern European refugia is often a mosaic of populations that differentiated during periods of maximum glaciation and of hybrid populations formed during interglacial periods, after their expansion from refuges and secondary encounters (Hewitt, 2000). These processes may have been repeated several times during the Pleistocene (Nieto-Feliner, 2014). In addition, in slowly evolving tree species, the grid of populations fragmented from large and connected ones could contain traces of the old structure that existed during the Pliocene, before the initiation of Pleistocene cooling cycles (Magri et al., 2008; Petit et al., 2005) and the Mediterranean climate (Suc, 1984). Population structures of natural origin may be altered by the effect of germplasm translocations: the human-mediated transfer of living organisms from one place to another (IUCN, 1987). Within-range translocations have been frequent in domesticated fruit trees, giving rise to populations formed by individuals of different origins that hybridize (Besnard et al., 2018; Miller & Gross, 2011), originating heterozygote varieties with locally adapted genes (Zohary & Spiegel-Roy, 1975). An increase in diversity within stands has also been observed in planted forest trees that received a certain proportion of translocated germplasm (Wagner et al., 2015). The European chestnut (Castanea sativa Mill.), the only species of the genus present in Europe, has been managed and domesticated since ancient times. 
It is currently unclear whether the native Iberian populations have become extinct and current populations descend from historical introductions (Rodríguez-Sánchez et al., 2010), whether and to what extent the current population genetic structure has been impacted by recent germplasm translocation (Fineschi et al., 2000; Mattioni et al., 2013), or even whether the current genetic constitution mainly results from natural demographic history (Fernández-López & Monteagudo, 2010; Mattioni et al., 2008). The main argument in support of the destruction of natural chestnut wild resources is the limited geographic structure identified with cytoplasmic DNA (Fineschi et al., 2000); however, such a pattern also occurs in other nonmanaged, slowly evolving, long-lived species, for example, Taxus baccata (Mayol et al., 2015).

The genus Castanea, as well as other Arctotertiary species, already existed in the Iberian Peninsula during the late Eocene (Barrón et al., 2010). Subsequently, during the Oligocene, a period of subtropical climate, Castanea had a wider distribution than its current one (Lang et al., 2007). Fossil evidence confirms the persistence of the genus in the Iberian Peninsula during the Miocene and Pliocene (Barrón et al., 2010), since the beginning of the Pleistocene (Gonzalez-Samperiz et al., 2010; Postigo-Mijarra et al., 2010; Ribeiro et al., 2019) and throughout the Late glacial and the Holocene (López-Sáez et al., 2016, 2020; Muñoz Sobrino et al., 2004; Muñoz-Sobrino et al., 2001). Several refuges were identified in suitable habitats in a large part of the Iberian Peninsula, southwest France and the Italian Peninsula during the last glacial maximum (LGM; Krebs et al., 2019).

Castanea sativa is a temperate and thermophilous tree with a discontinuous range that grows on acid soils. Its southern distribution is limited by drought because it requires an annual precipitation of at least 500–600 mm. The species' current range extends from the northwest Iberian Peninsula to the Caucasus, with its southern limit lying in the forest massifs of Algeria and Tunisia (Camus, 1929). The species shows an east to west pattern of decreasing genetic diversity (Fernández-Cruz & Fernández-López, 2016; Mattioni et al., 2013, 2017). This pattern, which is frequent in many other species, is attributed to the effect of very cold, drought-prone conditions in the west relative to the east during cold intervals of the Pleistocene (Fady & Conord, 2010). Time of bud burst, an important adaptive trait in temperate deciduous species, shows clear geographic patterns of increasing time for flushing from east to west and from south to north (Fernández-López, Zas, Blanco-Silva, et al., 2005; Fernández-López, Zas, Diaz, et al., 2005). In the Iberian Peninsula, the main area of C. sativa is in the humid north-northwest region, but scattered patches are found in central and southern Iberia, under drier conditions (Figure 1; Fernández-López, Zas, Blanco-Silva, et al., 2005; Míguez-Soto et al., 2019; Míguez-Soto & Fernández-López, 2015). During the warm, humid conditions of the mid-Holocene, the species expanded its area to central and southern Iberia, but subsequently declined, leaving patches that lost connectivity because of climate change (Roces-Díaz et al., 2018) and increased anthropic activity (López-Sáez et al., 2014). Castanea sativa possesses highly differentiated subpopulations that have been affected by genetic drift (Fernández-López & Monteagudo, 2010) because of low gene flow via seeds and pollen (Fineschi et al., 2000), in a species with predominant entomogamous pollination (De Oliveira et al., 2001; Manino et al., 1991).

Figure 1. Sampled stands in north, central and south Iberia, Italy and Greece. The North Iberian population includes Atlantic, Western and Eastern Cantabrian elements. The Western Mediterranean population, which represents the central area of Castanea sativa, includes the central and south Iberian and Italian populations. The Eastern Mediterranean population is represented by Greek populations. Chestnut distribution is shown in dark grey, modified for this work from Fernandez-Lopez and Alia (2003)

For the management and domestication of C. sativa to have modified the natural population genetic structure, some of the following conditions would have to have been met: elimination of natural autochthonous populations; intensive selection that caused a high genetic distance between wild and cultivated chestnuts; and massive germplasm translocation from distant populations.

The elimination of populations of forest tree species has occurred throughout western Europe as a result of the development of livestock and agriculture at least since the Neolithic. In general, current forests developed from populations that remained alive during that period, reproducing natural patterns despite a drastic reduction in diversity (Bradshaw, 2004). Castanea sativa has probably been markedly affected by this process.

Many of the current C. sativa stands are managed populations, either coppices (for wood) or orchards (for nuts). For coppices, there is almost no reason to talk about selection after stooling during successive rotations; therefore, we can speak only of silviculture. In contrast, selection operates in orchards, especially when reproduction is by grafting. However, although there are certain productive and quality characteristics that justify the selection of grafted individuals, the rate of evolution in domestication has been low because of the long generation time and the use of vegetative propagation by grafts, which increases the generation time in most fruit trees (Miller & Gross, 2011; Zohary & Spiegel-Roy, 1975). Orchards contribute to the modification of the natural population genetic structure of chestnut through their naturalization, mainly via seed dispersal in the wild, because many varieties are male sterile.

Some modification of the natural population structure is supported by evidence of within-range translocations between distant populations. For example, 33% of grafted orchards in the Italian and Iberian peninsulas contain germplasm of nonlocal origin (Mattioni et al., 2008). Another example is the presence of germplasm assigned to the western Mediterranean population in the northwest Iberian Peninsula, in old grafted varieties and also in naturalized stands (Fernández-Cruz & Fernández-López, 2016; Fernández-López & Fernández-Cruz, 2015). In addition, the low differentiation of Italian and central Iberian stands (Fernández-Cruz & Fernández-López, 2016) could be due to translocations.

For these reasons, total alteration of the evolved population genetic structure is unlikely; thus, it is most probable that most genepools identified in phenotypic and population genetic research on current populations are of natural origin (Fernández-Cruz & Fernández-López, 2012, 2016; Fernández-López & Monteagudo, 2010; Mattioni et al., 2013). However, changes can be expected in areas where germplasm was translocated.

One of the motivating aims of this study was to understand the origin of two gene pools identified in the Iberian Peninsula. In previous analyses they were assigned to the main population present in Italy and, in consequence, could have been affected by germplasm translocations. One of these populations, located in the Central Iberian and Toledo Mountain Ranges of central Iberia, is made up of several coppice stands occupying a few thousand hectares that are genetically very close to a stand in northern Italy (Fernández-Cruz & Fernández-López, 2016). The other population is in Courel, in the inland northwestern Iberian Peninsula. Courel orchards occupy 2748 hectares (Alonso et al., 2020), with majestic trees managed for nut and wood production, and contain a high number of traditional varieties. Some of them may have been translocated from somewhere in the western Mediterranean and appear to be at the basis of an important part of the traditional chestnut varieties cultivated in the northwestern Iberian Peninsula (Fernández-López & Fernández-Cruz, 2015).

In this study, we carried out several analyses to elucidate the natural demographic history of southwestern European C. sativa populations to discover the relationships between different ancestral populations identified in the analysis of population structure. We considered that most sampled gene pools are the product of natural processes. Herein, we assumed that at least some of the current chestnut populations in the Mediterranean peninsulas were there for much of their evolutionary history and came from more extensive, continuous and connected populations that were reduced to their current locations by successive contractions and expansions during glacial and interglacial intervals through the Miocene and Pleistocene. The second issue that we addressed was to ascertain whether central Iberian and Courel gene pools present in the Iberian Peninsula are the result of translocation, as well as tentatively identify their origin.

We analysed microsatellite data of C. sativa populations through systematic Bayesian clustering to detect the existence of hidden population structures that have not previously been identified. Criteria for the identification of translocated germplasm were established and genetic parameters of within-stand diversity were used to identify footprints of translocations. The results were used to establish plausible scenarios to interpret the natural evolutionary history of this species in southwestern Europe, and ascertain the origin of some possible germplasm translocations using approximate Bayesian computation (ABC).

2 MATERIALS AND METHODS

2.1 Plant materials

The existing evidence on alteration of the natural structure of C. sativa populations in its southwestern range (Mattioni et al., 2008) caused by cultivation in grafted orchards led us to select populations that were as natural as possible. In most cases, sampling was conducted in high forests and coppices; grafted orchards were avoided when possible. In the northwestern Iberian Peninsula, most sampling was conducted not far from the coast, in an area in which several refuges were identified (Krebs et al., 2019) and in which the influence of domestication was less important (Fernández-Cruz & Fernández-López, 2016). Seeds of Japanese chestnut (Castanea crenata Sieb. et Zucc.) were introduced in the north of Spain between 1917 and 1940 because of its tolerance to ink disease, caused by several species of Phytophthora (Fernández-López, 2011). Interspecific hybrids between the European and Japanese chestnuts were used in regular plantations in different geographic areas affected by ink disease during recent decades. These hybrids were also avoided during sampling based on morphological and phenological evidence (Fernández-Cruz & Fernández-López, 2012; Fernández-López, 2011). The original data set of samples analysed in this research was checked for the presence of interspecific hybrids; 2.4% of genotyped samples, identified as C. crenata × C. sativa hybrids, mainly in the North and Northwest Iberian stands, were removed from the data set (Fernández-Cruz & Fernández-López, 2016).

Twenty-nine wild stands were sampled: 25 representing the most important areas of the Iberian distribution of C. sativa, two in Italy, and two in Greece (Figure 1). In the Eurosiberian area of the Iberian Peninsula, 14 stands were sampled: five stands (stands 1 to 5) along the Atlantic Galician coastal areas and nine (stands 6 to 14) from the northwestern corner of Galicia to the Basque Country. In inland Galicia and León, a region of orchards managed for nuts or for both nut and wood production, six stands were sampled (stands 15 to 20). Two of these stands (18, 19) were sampled in Serra do Courel. In the Central System Range and Toledo Mountain Range, four stands were sampled (stands 21 to 24). Smaller patches of chestnut were present in the south (Andalucía), where only one stand (25) was sampled. Italian stands (26 and 27) were in Piedmont and Sicily, and the Greek stands (28 and 29) were in Macedonia. The total number of trees was 715, with 16 to 29 (mean 24.65) trees per stand. The names of stands, their geographic information, silviculture and number of trees per stand are summarized in Table S1.

2.2 Microsatellites

The multilocus genotypes are based on eight loci previously used for population-genetic description of the same samples (Fernández-Cruz & Fernández-López, 2016). There were no missing values and no null alleles in the selected set. The microsatellites and their linkage groups in parentheses (Barreneche et al., 2004) were: CsCAT14 (2); CsCAT16 (6); CsCAT41 (8) (Marinoni et al., 2003); EMCs2 (6); EMCs14 (5); EMCs15 (9) (Buck et al., 2003); QpZAG36 (2) and QpZAG110 (8) (Steinkellner et al., 1997). Locus CsCAT3 was not used in this study, following the recommendation of Selkoe and Toonen (2006) not to use microsatellites with a high mutation rate for demographic studies of distant events. DNA extraction methods, PCR amplification conditions and capillary electrophoresis were as described in Fernández-Cruz and Fernández-López (2012).

2.3 Data analyses

2.3.1 Genetic diversity

Genetic parameters were calculated for each locus and each stand with GENODIVE 2.0b25 (Meirmans & Van Tienderen, 2004). The selected parameters included the number of alleles (Na), effective number of alleles (Ne), and observed (Ho) and expected (He) heterozygosities. Allelic richness (Ar) was calculated with HP-RARE (Kalinowski, 2005) to compare the numbers of alleles in populations displaying different sizes. Additionally, the inbreeding coefficient (Fis) was calculated from an analysis of molecular variance (AMOVA) (Excoffier et al., 1992; Michalakis & Excoffier, 1996) to check whether populations were in Hardy–Weinberg equilibrium.
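As an illustration of the parameters listed above, the following minimal sketch computes Na, Ne, Ho, He and a simple Fis for one locus in one stand. It is not the GENODIVE, HP-RARE or AMOVA procedure used in the study: the function name `locus_diversity`, the small-sample correction applied to He, the estimator Fis = 1 − Ho/He and the toy genotypes are all assumptions made for the example.

```python
from collections import Counter


def locus_diversity(genotypes):
    """Diversity statistics for one locus in one stand.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid tree.
    """
    n = len(genotypes)
    total = 2 * n  # number of gene copies sampled
    counts = Counter(allele for pair in genotypes for allele in pair)
    freqs = [c / total for c in counts.values()]

    homozygosity = sum(p * p for p in freqs)
    na = len(freqs)                                  # number of alleles
    ne = 1.0 / homozygosity                          # effective number of alleles
    ho = sum(1 for a, b in genotypes if a != b) / n  # observed heterozygosity
    # Expected heterozygosity with a small-sample correction (assumption).
    he = (1.0 - homozygosity) * total / (total - 1)
    # Simple inbreeding coefficient; the study derived Fis from an AMOVA instead.
    fis = 1.0 - ho / he if he > 0 else float("nan")
    return {"Na": na, "Ne": ne, "Ho": ho, "He": he, "Fis": fis}


# Toy genotypes (allele sizes in base pairs), not data from the study.
toy = [(180, 180), (180, 184), (184, 188), (180, 184)]
stats = locus_diversity(toy)
print(stats)
```

A negative Fis, as in this toy example, indicates an excess of heterozygotes relative to Hardy–Weinberg expectations.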

2.3.2 Identification of ancestral populations

Genotypes of 29 wild chestnut stands were analysed using the Bayesian method implemented in Structure v. 2.3.4, which uses allelic frequencies for the identification of K unknown populations (Hubisz et al., 2009; Pritchard et al., 2000). Several authors demonstrated that Structure underestimates the number of real populations and that sampled populations represented by small numbers of individuals merge with other, better-represented, populations (Duminil et al., 2006; Kalinowski, 2011; Puechmaille, 2016; Vähä & Primmer, 2006). However, Wang (2017) showed that these problems originate from the use of the default initial ALPHA value (the Dirichlet parameter for degree of admixture; Pritchard et al., 2009), and recommended the use of an initial ALPHA value of 1/K (where K is the number of populations) and the uncorrelated allele frequency model to correct for unbalanced sampling.

Four different models were analysed, all using the admixture option (Table S2). Two models for allele frequencies were used, noncorrelated and correlated allele frequencies among populations, the second of which is used as the default in most population structure analyses. The two admixture models were analysed with and without the option Locprior, which uses sampling location information in cases of low numbers of markers, small sample sizes and close relationships, assuming that individuals from the same sampling location usually belong to the same population (Hubisz et al., 2009).
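The four models can be viewed as a two-by-two grid of Structure options. The sketch below enumerates that grid using the Structure parameter names NOADMIX, FREQSCORR, LOCPRIOR and ALPHA. The model numbering (models 1 and 2 without Locprior, models 1 and 3 with correlated frequencies) is inferred from the Results rather than taken from Table S2, and applying an initial ALPHA of 1/K to every model is an assumption; the function name `structure_models` is hypothetical.

```python
from itertools import product


def structure_models(k):
    """Return assumed Structure settings for models 1 to 4 at a given K."""
    models = {}
    grid = product([False, True], [True, False])  # (locprior, correlated)
    for number, (locprior, correlated) in enumerate(grid, start=1):
        models[number] = {
            "NOADMIX": 0,                  # admixture model in all four cases
            "FREQSCORR": int(correlated),  # 1 = correlated allele frequencies
            "LOCPRIOR": int(locprior),     # 1 = use sampling location
            "ALPHA": 1.0 / k,              # initial value suggested by Wang (2017)
        }
    return models


for number, settings in structure_models(4).items():
    print(number, settings)
```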

For each K value from 1 to 16, we performed 20 independent Markov chain Monte Carlo runs of 200,000 repetitions after an initial burn-in of 50,000 iterations. The results of replicate runs were analysed with Clumpak to identify sets of similar runs using the option Full Search (Jakobsson & Rosenberg, 2007; Kopelman et al., 2015) to compare the solutions of the different models. The most probable number of K hidden populations was identified using the median values of the logarithm of the probability of the data (Pritchard et al., 2000) to calculate Prob (K) and the statistic ∆K (Evanno et al., 2005). The statistic ∆K detects the highest hierarchical level of structure in the sampled populations, whereas the maximum Prob (K) identifies subdivisions of the main structure.
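The two criteria can be sketched as follows. This is a simplified illustration, not the exact computation performed for the study: ∆K is computed here from the mean log-probabilities of replicate runs divided by their standard deviation, Prob (K) is obtained by normalizing the exponentiated median log-probabilities, and the function names and toy values are assumptions.

```python
import math
from statistics import mean, median, stdev


def delta_k(lnp):
    """Delta K per K from lnp, a dict mapping K to a list of ln P(D) values."""
    out = {}
    for k in sorted(lnp):
        if k - 1 in lnp and k + 1 in lnp:
            # Absolute second-order rate of change of the mean ln P(D).
            second = abs(mean(lnp[k + 1]) - 2 * mean(lnp[k]) + mean(lnp[k - 1]))
            sd = stdev(lnp[k])
            out[k] = second / sd if sd > 0 else float("inf")
    return out


def prob_k(lnp):
    """Posterior probability of each K from median ln P(D), uniform prior on K."""
    med = {k: median(v) for k, v in lnp.items()}
    top = max(med.values())  # subtract the maximum to avoid underflow
    weights = {k: math.exp(v - top) for k, v in med.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


# Toy ln P(D) values for two replicate runs per K, not results from the study.
toy = {1: [-1000, -1002], 2: [-900, -901], 3: [-890, -892], 4: [-895, -897]}
dk = delta_k(toy)
pk = prob_k(toy)
print(max(dk, key=dk.get), max(pk, key=pk.get))
```

In this toy case ∆K peaks at K = 2 while Prob (K) peaks at K = 3, which mirrors the distinction drawn above between the highest hierarchical level of structure and its subdivisions.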

Several genetic parameters were obtained from the Structure run with the least negative log-likelihood value. First, allelic frequencies of each population for the K with the highest ∆K were used to identify ancestry-informative markers exhibiting different alleles at very high frequencies in parental populations (Pritchard et al., 2009). Second, trees constructed with the nucleotide distance among clusters (D) by applying the neighbour-joining algorithm to the matrix of allele frequency divergence among clusters (Felsenstein, 2005; Saitou & Nei, 1987) were used to obtain a visual representation of the relationships among identified ancestral populations. To describe the relationships between and within the identified ancestral populations, we used the values of FST. Analysis of molecular variance and F statistics were performed with Genalex 6.5 (Peakall & Smouse, 2012).

2.3.3 Identification of translocated germplasm

Fixation index (FST) values among the populations were used for identification of translocated germplasm considering that: (1) a population with FST > 0.10 with other surrounding, more abundant populations could be a translocated population; and (2) two very distant populations with FST values < 0.05 could be related by a germplasm translocation. Results of Bayesian analysis were also used to identify alien germplasm that could be a product of within-range translocations and admixed with the local gene pool.
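The two FST criteria can be expressed as a simple screening rule. The sketch below is only one possible reading of them: the text does not define "surrounding" or "very distant", so the distance thresholds `near_km` and `far_km` are hypothetical, as are the function name and the toy values, and the "more abundant" condition of criterion 1 is not modelled.

```python
from itertools import combinations


def flag_translocations(pops, fst, dist_km, near_km=150.0, far_km=500.0):
    """Screen populations with the two FST criteria.

    fst and dist_km map frozenset({a, b}) to pairwise FST and distance in km.
    """
    def pair(a, b):
        return frozenset((a, b))

    # Criterion 1: FST > 0.10 with every surrounding population.
    candidates = []
    for p in pops:
        neighbours = [q for q in pops if q != p and dist_km[pair(p, q)] <= near_km]
        if neighbours and all(fst[pair(p, q)] > 0.10 for q in neighbours):
            candidates.append(p)

    # Criterion 2: distant populations with FST < 0.05.
    linked = [
        (a, b)
        for a, b in combinations(pops, 2)
        if dist_km[pair(a, b)] >= far_km and fst[pair(a, b)] < 0.05
    ]
    return candidates, linked


# Toy example with made-up values, not data from the study.
pops = ["A", "B", "C", "D"]
dist = {frozenset(p): 50.0 for p in combinations("ABC", 2)}
dist.update({frozenset((x, "D")): 1000.0 for x in "ABC"})
fst = {
    frozenset("AB"): 0.02, frozenset("AC"): 0.15, frozenset("BC"): 0.12,
    frozenset("AD"): 0.20, frozenset("BD"): 0.20, frozenset("CD"): 0.03,
}
candidates, linked = flag_translocations(pops, fst, dist)
print(candidates, linked)
```

Here population C is differentiated from both of its neighbours and weakly differentiated from the distant population D, so it is flagged by both criteria.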

2.4 Demographic history

Approximate Bayesian computation (ABC) implemented in Diyabc 2.1.0 (Cornuet et al., 2014) was used to analyse several projects to investigate the main questions that arose from the Structure results. The map of stands and their grouping into populations separated by Structure, as used in the different projects, is provided in Figure 1 and the sizes of populations are listed in Table S3. In all projects and scenarios, an ancestral population underwent a change in size. Several projects were defined for different purposes. The objectives of Projects 1 to 3 were to resolve the natural demographic history of chestnut populations in the southwestern range of the species, whereas Projects 4 and 5 aimed to obtain information about the tentative origins of two Iberian Peninsula populations that could be the result of translocations. Each project consisted of three scenarios, each one with its historical model.

The aim of Project 1 was to unravel the relationship among the three identified North Iberian subpopulations. In this project, we used samples of the Atlantic Galician shore, western Cantabrian and eastern Cantabrian populations (Figure 2a). In scenario 1, the three populations diverged at time t1 from one unique ancestral population. In scenario 2, the Atlantic and eastern Cantabrian gene pools were split t2 generations ago from one ancestral population; afterwards, the western Cantabrian population originated at time t1 in one secondary encounter between the Atlantic and eastern Cantabrian populations. Scenario 3 is an east–west colonization scenario.

Figure 2. Demographic scenarios used for approximate Bayesian computation of Castanea sativa populations. (a) Project P1: scenarios to unravel the relationship among the three North Iberian subpopulations, Atlantic, Western Cantabrian and Eastern Cantabrian. (b) Project P2: scenarios to identify the relationship between the main C. sativa populations identified in its southwestern range, North Iberian (Atlantic, Western and Eastern Cantabrian), Western Mediterranean (split into central Iberian and Italian + south Iberian) and Eastern Mediterranean. (c) Project P3: same objective and gene pools as Project P2, but scenarios 1 and 3 differ from Project P2 in that the positions of the central Iberian and Italian populations are interchanged. (d) Project P4: scenarios to test a translocation of germplasm from Italy to central Iberia, using the best scenario of Project P2 as scenario 1; in scenarios 2 and 3 the central Iberian population was translocated from Italy between 100–3000 and 100–1000 years BP, respectively. (e) Project P5: scenarios to estimate the origin of the germplasm present in Courel, in inland northwestern Iberia. NAtl, NCw, NCe, NCIb, NSIb+It, NMe, NCourel: current effective population sizes of the Atlantic, Western Cantabrian, Eastern Cantabrian, central Iberian, south Iberian + Italian, Eastern Mediterranean and Courel populations; NA1, NA2: ancestral effective population sizes before and after a bottleneck; t#: number of generations since events; in Project P4, t1a and t1b are the numbers of generations since the translocation from Italy to central Iberia; r#: admixture rates among the corresponding populations

Projects 2 and 3 were conducted to attempt to identify natural demographic events to explain the relationship between the main C. sativa populations identified in its southwestern range. In both projects, we used samples of the north Iberian, western Mediterranean and eastern Mediterranean populations. The western Mediterranean population was split for further analysis into two samples: central Iberian and Italian + south Iberian. In scenarios 1 and 3 an ancestral population split t3 generations ago, as a result of contraction, into a western (the north Iberian) population and the eastern Mediterranean population. In Project 2 (Figure 2b), scenario 1, the central Iberian population was formed t2 generations ago by a secondary encounter of the north Iberian and eastern Mediterranean populations; at t1 there was a new gene flow from the east producing the current Italian and south Iberian populations. Scenario 2 is an east–west colonization scenario. Scenario 3 differs from scenario 1 in that the central Iberian population came directly from the eastern Mediterranean. In Project 3 (Figure 2c), scenarios 1 and 3 differ from Project 2 in the positions of the central Iberian and Italian populations being interchanged.

Project 4 (Figure 2d) tried to explain the low differentiation between populations in the centre of the Iberian Peninsula and Italy through a germplasm translocation, using the same gene pools as Projects 2 and 3. For this analysis, the best scenario identified in Project 3, a scenario of natural evolution, was compared with two scenarios in which the central Iberian population was formed after a germplasm translocation from Italy and an admixture with local Northern Iberian populations. The number of generations since admixture was 1–30 for scenario 2 (for a translocation during the Roman Empire) and 1–10 for scenario 3 (for a translocation during the Middle Ages).

Project 5 (Figure 2e) was conducted to estimate the origin of the germplasm present in Courel (stands 18 and 19), in the northwestern Iberian mountains. The best scenario identified in Project 2 was transformed into three scenarios: in scenario 1, Courel germplasm was translocated from central Iberia; in scenario 2, Courel germplasm was translocated from Italy; and in scenario 3, the Courel population was an old admixture between the north and central Iberian populations.

To avoid high disequilibrium among sizes of population samples we created three data sets for each simulation of Projects P2 to P5. They differed in that the north Iberian population used was represented by the Atlantic, western Cantabrian or eastern Cantabrian population in simulations a, b, and c, respectively (Table S3). The number of samples for each simulation varied between 300 and 422.

The range of values of historical parameters and their distributions are shown in Table S4: priors of the effective population sizes of current populations were 10–10,000; those of the effective population sizes of ancestral populations and of time frames were 10–100,000 (except t1 in Project 4: 1–30), with tᵢ ≤ tᵢ₊₁; those of the admixture rate were 0.001–0.999. All parameters had uniform prior distributions. The genetic model was common to all projects. A stepwise mutation model was used with default values except for the mean mutation rate, the range of which was enlarged to a minimum of 10⁻⁵ mutations per generation. The simulated data sets were summarized by five summary statistics: the mean numbers of alleles of one sample and two samples, the mean genetic diversity of one sample and two samples, and pairwise FST. A pre-evaluation of scenarios and established priors was performed after 10⁵ simulations for each scenario. Once the combinations of scenarios and priors performed well, 10⁶ simulations were run for each scenario. Competing scenarios were compared with the logistic regression method using 1% of the simulated data sets. The scenario with the highest posterior probability (PP) was chosen as the best supported. The reliability of the chosen scenario was evaluated by computing confidence in the scenario choice and performing model checking with the complete set of parameters proposed for microsatellite loci in DIYABC. The posterior distribution of the parameters for the selected scenario was estimated with logit-transformed data, and biases of the parameters were estimated using the relative median of the absolute error (RMAE).
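To make the prior specification concrete, the sketch below draws one parameter set from uniform priors with the bounds given above and enforces the ordering condition on event times by rejection. It only illustrates the prior structure and is not how Diyabc samples internally; the function name `draw_priors`, the dictionary keys and the use of continuous rather than integer draws are assumptions.

```python
import random


def draw_priors(n_pops, n_times, n_admix, rng):
    """Draw one parameter set from uniform priors, keeping t1 <= t2 <= ..."""
    sizes = [rng.uniform(10, 10_000) for _ in range(n_pops)]  # current Ne
    ancestral = rng.uniform(10, 100_000)                       # ancestral Ne
    while True:
        # Rejection step: redraw until event times are in non-decreasing order.
        times = [rng.uniform(10, 100_000) for _ in range(n_times)]
        if all(a <= b for a, b in zip(times, times[1:])):
            break
    rates = [rng.uniform(0.001, 0.999) for _ in range(n_admix)]
    return {"N": sizes, "NA": ancestral, "t": times, "r": rates}


rng = random.Random(0)  # fixed seed so the example is reproducible
draw = draw_priors(n_pops=3, n_times=3, n_admix=1, rng=rng)
print(draw)
```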

As Diyabc estimates the time since events in numbers of generations, we used a generation time of 100 years to convert to absolute time, as for other long-lived species such as Juglans regia (Pollegioni et al., 2017) and T. baccata (Mayol et al., 2015). To estimate the geological period in which different historical events may have occurred, the following standard time intervals were used: Late Miocene, 11.6–5.3 million years ago (Ma); Pliocene, 5.3–2.6 Ma; Early Pleistocene, 2.6–0.8 Ma; Middle Pleistocene, 800–125 ka; Upper Pleistocene, 125–11.5 ka; Last Interglacial, 140–120 ka; last glacial maximum (LGM), 21–17 ka; Late glacial period, 17–11.5 ka; and Early Holocene, 11.5–8.0 ka.
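The conversion from generations to geological periods can be sketched as a lookup over the intervals listed above. Because some intervals overlap (for example, the Last Interglacial spans the boundary between the Middle and Upper Pleistocene), the hypothetical function `periods` returns every matching interval. Treating each interval as including its younger bound and excluding its older bound is an assumption.

```python
GENERATION_YEARS = 100  # generation time used for the conversion

# (name, older bound, younger bound), in years before present.
INTERVALS = [
    ("Late Miocene", 11_600_000, 5_300_000),
    ("Pliocene", 5_300_000, 2_600_000),
    ("Early Pleistocene", 2_600_000, 800_000),
    ("Middle Pleistocene", 800_000, 125_000),
    ("Upper Pleistocene", 125_000, 11_500),
    ("Last Interglacial", 140_000, 120_000),
    ("LGM", 21_000, 17_000),
    ("Late glacial", 17_000, 11_500),
    ("Early Holocene", 11_500, 8_000),
]


def periods(generations):
    """Convert generations to years and list every interval containing that age."""
    years = generations * GENERATION_YEARS
    names = [name for name, old, young in INTERVALS if young <= years < old]
    return years, names


for g in (30, 200, 1_300, 10_000):
    print(g, periods(g))
```

An estimate of 30 generations, the upper bound used for a Roman-era translocation, corresponds to 3000 years and falls after all listed intervals, so no period is returned.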

3 RESULTS

3.1 Genetic diversity

In the total sample of 751 individuals genotyped with eight loci, 74 alleles were identified, with the number of alleles per locus varying between three and 18 (Table S5a).

The Greek and Italian stands, several Iberian stands in the northwest inland region (15, 16 and 20) and stand 24 in the Toledo Mountain Range had the highest allelic richness (Table S5b, Figure S1b). The latter four populations also had the highest gene diversity (Figure S1d). Other northwestern inland stands (15–20) showed high observed heterozygosity (Figure S1c). Several northern Iberian stands (9, 11, 18 and 19) showed negative Fis values (Figure S1e).

3.2 Population genetic structure

Selection of the models for the description of population genetic structure was performed considering the repetitiveness and correlations between the results of the different runs of each model for each K value (Table S6a, b) and the ∆K and Prob (K) values of each model (Figure S2). For K = 2, the four models produced a single cluster with very high mean similarity score (MSS) of 0.97–0.99. For K = 3 the same applied to models 1 and 2, whereas for models 3 and 4 the major cluster occurred in 17 of 20 runs with high MSS values. For higher values of K, the number of times the major cluster was repeated was smaller. The highest level of population structure, identified by ∆K, produced higher peaks in models 2 and 4, with independent allelic frequencies, than in models 1 and 3, with correlated allelic frequencies. In models 1 and 2 the highest values of ∆K occurred for K = 3. With models 3 and 4, the highest values of ∆K were produced for K = 2. For subdivisions of the main structure, identified by Prob (K), the greatest probability occurred for model 2 at K = 8 with probability 1, whereas in model 1 the highest probability occurred for K = 11, with a value of 0.48. Models 3 and 4 yielded a probability of 1 for K = 13.

3.3 Description of the highest hierarchical level of population structure

Model 2 for K = 2 (Figure 3a) gives information similar to model 4 (Figure 3b), but the hybrid character of the Italian and central and south Iberian populations is less obvious than in model 4. The clusters observed for K = 2 of model 4 identified the Northern Iberian population, which is dominant in the north of the Iberian Peninsula (stands 1 to 14), and the Eastern Mediterranean population (stands 28 and 29; Figure 3b). The stands of the central and southern Iberian Peninsula (21 to 25), as well as the Italian stands (26 and 27), showed hybrid ancestry between these two populations, with ancestry in the Eastern Mediterranean population increasing towards the southern Iberian and Italian stands (Figure 4). In the inland stands of the northwest Iberian Peninsula (15, 16, 18, 19 and 20) the North Iberian component dominates, but these stands also have 20%–40% ancestry in the Eastern Mediterranean population.

FIGURE 3. Ancestry in different populations (0 to 1 values) of 751 individuals of 29 stands obtained with admixture and noncorrelated frequencies models with Structure: without location information (model 2) for K = 2 ancestral populations (a), for K = 3 (c) and K = 8 (e); with location information (model 4) for K = 2 (b), K = 3 (d). SIb, south Iberian stand; It, Italian stands; Gr, Greek stands

FIGURE 4. Pie diagrams of each stand representing the mean ancestry in each K = 2 ancestral population obtained with Structure model 4 (admixture, noncorrelated frequencies and spatial information). There is a west to east pattern of decreasing ancestry in the northwestern Iberian ancestral population

The clusters produced in models 2 and 4 for K = 3 are similar (Figure 3c,d). Hybrid individuals of model 4 for K = 2 appear here as a population that we called the Western Mediterranean population. This population is dominant in the stands of the central and southern Iberian Peninsula and in the Italian stands (21 to 27; Figure 5a). Several stands appear as hybrids between the North Iberian and the Western Mediterranean populations, with dominance of the former in several northern inland stands (15, 16 and 20) and dominance of the Western Mediterranean population in stand 24. Two northwestern Iberian stands located in Courel (18 and 19) are similar to the central Iberian ones. For K = 3 in model 4, a new type of hybrid between the Eastern and Western Mediterranean populations is present, mainly in southern Iberia and Italy (Figure 3d). The dendrogram produced by Structure for K = 3, which is identical for models 2 and 4 (Figure 5b), suggests that the Western Mediterranean population is a hybrid between the North Iberian and Eastern Mediterranean populations.

FIGURE 5. (a) Pie diagrams of each stand representing the mean ancestry in each K = 3 population obtained with Structure model 4 (admixture, noncorrelated frequencies and spatial information). (b) Dendrogram produced by Structure for K = 3

The allelic frequencies for K = 3 (Table S7) help to explain why Structure identifies the Western Mediterranean population as a hybrid between the Northern Iberian and Eastern Mediterranean ones. In both models 2 and 4, a total of 15 alleles at seven loci are almost diagnostic for one or other of the parental populations and occur at medium or high frequency in the hybrid one. Eight of these alleles come from the Northern Iberian population, and seven from the Eastern Mediterranean.
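The logic behind this observation can be illustrated with the standard expectation for an admixed population: an allele at frequency p_A in one parent and p_B in the other is expected at frequency r·p_A + (1 − r)·p_B in the hybrid, where r is the proportion of genes drawn from parent A. The sketch below ignores drift and mutation since admixture; the function name, frequencies and admixture proportion are invented and are not taken from Table S7.

```python
def admixed_frequency(p_a, p_b, r):
    """Expected allele frequency in a population that draws a proportion r of its
    genes from parent A and 1 - r from parent B (no drift or mutation afterwards)."""
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie between 0 and 1")
    return r * p_a + (1.0 - r) * p_b


# Hypothetical alleles that are nearly diagnostic for one parent each.
from_parent_a = admixed_frequency(p_a=0.9, p_b=0.0, r=0.5)
from_parent_b = admixed_frequency(p_a=0.0, p_b=0.8, r=0.5)
```

Under these invented values both alleles appear at intermediate frequency (about 0.45 and 0.40) in the admixed population, which is the pattern expected when alleles private to each parental population are found together in a hybrid one.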

3.4 Subdivisions of the highest hierarchical population structure

We show the results of model 2 for K = 8 (Figure 3e and Figure 6a). First, the Northern Iberian population was divided into three populations: Atlantic (stands 1–5); Western Cantabrian (stands 6–11) and Eastern Cantabrian (stands 12–14). Second, the Eastern Mediterranean population present in Greece (28 and 29) also occurs in southern Iberia and Italy. Third, the hybrid population for K = 3 (Figure 3b) is divided into: the Western Mediterranean population (WM1), which is dominant in the central Iberian Peninsula (stands 21–23) and Italy (26 and 27); the population present in Courel stands (18 and 19) (WM2), which occurs in several northwestern Iberian stands, mainly stand 15 and the central Iberian stand 24; a third population (WM3) that is very scarce and diffuse in stands 16, 18 and 24; and the Southern Iberian Mediterranean (stand 25) (WM4).

FIGURE 6. (a) Pie diagrams of each stand representing the mean ancestry in each K = 8 ancestral population obtained with Structure model 2 (admixture, noncorrelated frequencies, without spatial information). White sectors within some pies are the sum of population percentages <5%. Mountain ranges isolating North Iberia are: 1, Cantabrian Mountain Range; 2, Galician-Duero Mountains (Cabrera, Segundera and Montes de León); 3, Courel Range; 4, Ourense Central Range. The position of the main North Iberian barriers during cold Pleistocene periods which separated the three northern refuges: arrow 1, mountains at more than 500 m of altitude that separate Terra de Soneira, the Xallas and Bergantiños basins from the valleys that flow over the Muros and Noya estuary; arrow 2, Picos de Europa. (b) Dendrogram produced by Structure for K = 8 ancestral populations. WM1–4 are subpopulations of the Western Mediterranean population

The dendrogram of model 2 for K = 8 (Figure 6b) indicates the geographical organization of these identified populations. The Eastern Mediterranean population is at one extreme; at the other are the three Northern Iberian populations: Atlantic, Western Cantabrian and Eastern Cantabrian. The Southern Iberian population (WM4) lies closest to the Eastern Mediterranean, followed by the Western Mediterranean (WM1), and then by population WM3 diffused in several stands and finally population WM2, dominant in Courel, which is the most similar to the northern Iberian subpopulations.

The FST values among stands within identified ancestral populations and among stands of different population combinations are provided in Figure S3. The median FST values within demes were always between 0.05 and 0.09, whereas the median values among stands from different ancestral western populations were between 0.14 and 0.20. The greatest isolation occurred between the Eastern Mediterranean and Eastern Cantabrian populations (FST = 0.31). In contrast, Courel stands exhibited high FST values (0.16–0.20) with northern Iberian populations and the lowest FST values with the central Iberian population (0.13) and with the Italian population (0.15) (Figure S3). At the stand level (data not shown), the lowest FST values of Courel were with Guadalupe in the centre of the Iberian Peninsula (FST = 0.09) and with Pellice in northern Italy (FST = 0.12). The low FST values between the Italian stand Pellice and the stands in the centre of the Iberian Peninsula, especially El Tiemblo and Hervás (FST = 0.04–0.05), were striking. AMOVA results indicate that all identified populations were statistically different; there was even a difference between the genetic pools located in Italy and in the centre of the Iberian Peninsula (Table S8).
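As a reminder of what these values measure, the sketch below computes a simple FST-type statistic for one locus and two populations as (HT − HS)/HT, where HS is the mean expected heterozygosity within populations and HT is the expected heterozygosity of the pooled population. This is an assumption made for illustration only: the estimator actually used for Figure S3 is not stated in this section, the function names are hypothetical, and the allele frequencies are invented.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity, 1 - sum(p_i^2), for a dict of allele frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())


def fst_two_populations(freqs_1, freqs_2):
    """FST = (HT - HS) / HT for one locus and two equally weighted populations."""
    alleles = set(freqs_1) | set(freqs_2)
    pooled = {a: (freqs_1.get(a, 0.0) + freqs_2.get(a, 0.0)) / 2.0 for a in alleles}
    h_s = (expected_heterozygosity(freqs_1) + expected_heterozygosity(freqs_2)) / 2.0
    h_t = expected_heterozygosity(pooled)
    if h_t == 0.0:
        return 0.0  # locus monomorphic overall: no differentiation to measure
    return (h_t - h_s) / h_t


# Invented allele frequencies at a single two-allele locus.
moderately_different = fst_two_populations({"A": 0.8, "B": 0.2}, {"A": 0.2, "B": 0.8})
```

Identical frequencies give 0, populations fixed for different alleles give 1, and the invented example above gives an intermediate value (about 0.36).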

3.5 Identification of translocated germplasm

The low FST values between the central Iberian and Italian stands could be the result of translocation of germplasm from Italy. The same applies to germplasm WM2, dominant in Courel (stands 18 and 19) in the northwest inland region, which is genetically more distant from other populations located in northern Iberia than from the Italian and central Iberian stands (Figure S3). This result leads us to question whether the germplasm in Courel was translocated.

To identify other stands with germplasm that could have been translocated, the Structure results of model 2 were used for K = 8 (Figure 3e). There were several stands composed of a mixture of various ancestral populations, especially stands 15, 16 and 20 in the inland northwest region and stand 24 in Toledo Mountain Range. The components of the mixtures were the northern Iberian population with: (i) the population dominant in Courel, WM2, in several northern Iberian stands, mainly stand 15; (ii) the Western Mediterranean population, WM1, in stand 20; (iii) population WM3 in stand 16; and (iv) WM2 and WM3 in stand 24. Courel stand 18 had a certain proportion of WM3 germplasm mixed with WM2. In general, the trees in the mixed stands were hybrids, but there were individuals with ancestry close to 90% in WM3 in stands 16, 18 and 19. Germplasms WM2 and WM3 admixed in the cited stands were translocated from other populations.

3.6 Demographic history of populations

The principal components used for pre-evaluation of scenarios analysed with Diyabc showed that the designed scenarios and their priors produced simulated data close to the observed data for all the projects (Figure S4).

3.6.1 Project 1: Demographic relationship among North Iberian gene pools

Scenario 1 had the highest posterior probability (PP = 0.64), in contrast to PP = 0.06 and PP = 0.30 for scenarios 2 and 3, respectively (Table 1). The historical parameters of scenario 1 (Table S9a) indicate that an ancestral population suffered a bottleneck during the late Miocene to early Pliocene. This result is common to all the analysed projects described below (Table 2; Table S9). The three demographic units identified in North Iberia became separated after the split of an ancestral continuous population during the Upper Pleistocene (86.2 ka).

TABLE 1. Posterior probabilities of scenarios (in bold with italics) estimated with a logistic approach using 1% closest simulated points and 95% confidence intervals (between brackets) for Diyabc simulations of projects P1 to P5

Simulation | Scenario 1 | Scenario 2 | Scenario 3

Project 1: Relationship among North Iberian demographic units
P1 | 0.6390 (0.620–0.659) | 0.0578 (0.024–0.091) | 0.3030 (0.266–0.340)

Project 2: Common demography for west Europe: the old hybrid population is Central Iberian
P2a | 0.8499 (0.801–0.898) | 0.0731 (0.000–0.386) | 0.0770 (0.000–0.339)
P2b | 0.5443 (0.459–0.630) | 0.3995 (0.286–0.513) | 0.0562 (0.000–0.144)
P2c | 0.8521 (0.804–0.900) | 0.1077 (0.0000–0.400) | 0.0402 (0.000–0.306)

Project 3: Common demography for west Europe: the first old hybrid population is South Iberia + Italian
P3a | 0.747 (0.713–0.781) | 0.0675 (0.050–0.085) | 0.1855 (0.154–0.217)
P3b | 0.6922 (0.654–0.731) | 0.1623 (0.130–0.195) | 0.1455 (0.118–0.173)
P3c | 0.7416 (0.706–0.777) | 0.1328 (0.103–0.163) | 0.1256 (0.102–0.149)

Project 4: Was there translocation from Italy to central Iberian?
P4a | 0.0001 (0.000–0.019) | 0.7956 (0.721–0.870) | 0.2043 (0.129–0.279)
P4b | 0.0382 (0.000–0.161) | 0.7155 (0.587–0.843) | 0.246 (0.153–0.339)
P4c | 0.0014 (0.000–0.029) | 0.7567 (0.67–0.844) | 0.2419 (0.155–0.329)

Project 5: What is the origin of the population Courel (northwestern inland Iberian)?
P5a | 0.1453 (0.000–0.492) | 0.0229 (0.000–0.360) | 0.8318 (0.763–0.901)
P5b | 0.1354 (0.072–0.198) | 0.0151 (0.005–0.025) | 0.8496 (0.783–0.916)
P5c | 0.1638 (0.105–0.223) | 0.0141 (0.003–0.025) | 0.8221 (0.761–0.883)

TABLE 2. Summary of the historical parameters estimated for the events evaluated in the most probable scenarios of projects 2 to 5
Time from events Description Project 2 (Scenario 1) Project 3 (Scenario 1) Project 4 (Scenario 2) Project 5 (Scenario 3) Period
t1 Time from gene flow from eastern Mediterranean to central Iberian to originate south Iberian and Italian populations 46–42 ka 59–49 ka Upper Pleistocene
Time from gene flow from south Iberian + Italian populations to originate the central Iberian 31–27 ka Late Pleistocene
t1a, t1b Translocation to central western Iberia from Italy (best simulation) 2.8–2 ka Iron Age - Roman Empire
t1a Time from admixture of north Iberia and central Iberia to originate Courel population 142–114 ka Last Interglacial
t2 Time from admixture of north Iberian population and eastern Mediterranean to originate: Middle to Upper Pleistocene
Central Iberian population 169–95 ka 276–223 ka
Italian and south Iberian 149–115 ka 117–76 ka
t3 Split of eastern and western populations of an ancestral continuous population 1.37–0.92 Ma 1.4–0.79 Ma 1.03–0.45 Ma 1.62–1.03 Ma Early to Middle Pleistocene
t4 Bottleneck of the ancestral population 6.00–4.70 Ma 6.30–5.60 Ma 5.70–4.90 Ma 5.23–4.91 Ma Late Miocene to Early Pliocene

Note

  • The parameters, estimated by Diyabc in number of generations, were transformed into years before present (1950 AD) assuming a generation time of 100 years.

3.6.2 Projects 2 and 3: A common demographic history for southwestern Europe

Scenario 1 was the best in the three simulations of Project 2, with PP (P2a, c) = 0.85 > PP (P2b) = 0.54; the PP values of the other two scenarios were much lower (Table 1). For Project 3 the results were similar, with PP (P3a) = 0.75 > PP (P3c) = 0.74 > PP (P3b) = 0.69.

The first impression is that the results of the three simulations of Projects 2 and 3 (Table S9b, c) are similar, indicating that the three subpopulations identified in the northern Iberian Peninsula (Figure 1) act in the evaluated scenarios as if they were the same. Exchanging these populations with others included in the analysis completely modifies the results (data not shown). At t3 = 1.4–0.79 Ma (Early to Middle Pleistocene), the ancestral population split into two populations as a result of contraction, leading to the isolation of the Western population, represented by the North Iberian population, from the Eastern Mediterranean population (Table 2, Figure 2b,c). At t2 = 169–95 ka (Middle to Upper Pleistocene), the two previously separated populations expanded and a secondary encounter took place that resulted in the origin of the current Western Mediterranean population, represented in Project 2 by the central Iberian population (admixture rate r2 = 0.72–0.76) and in Project 3 by the Italian and south Iberian population (admixture rate r2 = 0.47–0.55). At t1 = 46.1–42.5 ka in Project 2 and t1 = 31–27 ka in Project 3, during the Upper Pleistocene, a second admixture occurred, with r1 = 0.73–0.77 in Project 2 and r1 = 0.40–0.49 in Project 3.

3.6.3 Project 4: Was there germplasm translocation from Italy to the Central Iberian population?

Scenario 2 (Figure 2d) showed high probabilities in all simulations, with PP values of 0.80, 0.76 and 0.72 for simulations P4a, P4c and P4b, respectively (Table 1), indicating that translocation of germplasm from Italy to Central Iberia is highly probable compared with a natural origin of the latter population. However, the Central Iberian population size exhibited unusually high bias (Figure S5b), indicating some inaccuracy in the analysis. Confidence in the choice of scenario 2 over scenario 1 (a natural origin) is very high because type I and II errors are close to zero. Although the probability of scenario 2 (0.80) is much higher than that of scenario 3 (0.20), the discrimination between them is not clear because error values are high (Table S10). Therefore, we can only conclude that a translocation probably occurred and, considering the results of the simulations of scenario 2, that it took place from 2.8 to 2 ka before 1950, i.e. between the Iberian Iron Age and the Roman Empire.

3.6.4 Project 5: What is the origin of the Courel gene pool?

Scenario 3 (Figure 2c) showed the highest probability in the three simulations, with PP values of 0.82–0.85 (Table 1), indicating that the Courel gene pool has its origin in a population produced by admixture of the North Iberian and Western Mediterranean populations, represented by the central Iberian gene pool, with an admixture rate between 0.70 and 0.81 with the North Iberian population, and formed between 142 and 114 ka, during the Last Interglacial (Table S9e).

In all projects and their simulations, the 95% confidence intervals for the different compared scenarios never overlapped (Table 1). The type I errors associated with choosing the best scenario (Table S10) had values of 0.18 for P1, 0.13–0.15 for P2, 0.30–0.32 for P3, 0 for P4 and 0.14–0.17 for P5. The type II error values were similar. Model checking of the best scenario (Table S10) was performed with 36 statistics for P1, 64 for P2, P3 and P4, and 100 for P5. In all cases the number of statistics with PP < 0.05 or PP > 0.95 was very low.
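The meaning of these error rates can be sketched as follows, assuming the usual approach to scenario-choice confidence in ABC: pseudo-observed data sets are simulated under each scenario and then classified. Type I error is taken here as the fraction of data sets simulated under the focal scenario that are assigned to another scenario, and type II error as the fraction of data sets simulated under the other scenarios that are assigned to the focal one. The function name and the classification outcomes are invented; they are not the results in Table S10.

```python
def error_rates(true_scenarios, chosen_scenarios, focal):
    """Return (type I, type II) error rates for the focal scenario.

    true_scenarios[i] is the scenario used to simulate pseudo-observed data set i;
    chosen_scenarios[i] is the scenario selected for it by model choice.
    """
    pairs = list(zip(true_scenarios, chosen_scenarios))
    from_focal = [chosen for true, chosen in pairs if true == focal]
    from_others = [chosen for true, chosen in pairs if true != focal]
    type_1 = sum(chosen != focal for chosen in from_focal) / len(from_focal)
    type_2 = sum(chosen == focal for chosen in from_others) / len(from_others)
    return type_1, type_2


# Invented outcomes: five pseudo-observed data sets per scenario.
true_labels = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3]
chosen_labels = [1, 1, 1, 1, 3, 2, 2, 2, 1, 2, 3, 1, 3, 3, 1]
type_1, type_2 = error_rates(true_labels, chosen_labels, focal=1)
```

In this invented example one of five data sets simulated under scenario 1 is misassigned (type I error of 0.2), and three of the ten data sets simulated under scenarios 2 and 3 are wrongly assigned to scenario 1 (type II error of 0.3).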

Parameter estimation performance, measured by RMAE (Figure S5), showed the lowest values for the effective population sizes of current populations and for admixture rates, with values below 0.2, except for the Central Iberian population in Project 4. The effective population sizes of the ancestral population before and after the bottleneck had RMAE values of 0.5–0.7, whereas the values for the time since different events were usually under 0.4.

4 DISCUSSION

The four admixture models analysed with Structure provided some certainties in the definition of the structure of the C. sativa populations in the southwestern part of the distribution area of this species. First, the use of an initial value of ALPHA = 1/K (Wang, 2017) showed that the highest hierarchical level of population genetic structure was detected with K = 3, instead of the K = 2 obtained using an initial value of ALPHA = 1 (Fernández-Cruz & Fernández-López, 2016). This change allowed us to separate the Western Mediterranean population, present in Italy and the centre and south of the Iberian Peninsula, from the Eastern Mediterranean population (Figure 3c,d), solving the imbalance in our data. Second, the uncorrelated allele frequency model (Wang, 2017) fits the data better than the correlated frequency model, used as the default in most population structure analyses. This result is consistent with the high FST values found between the different western European demes (FST = 0.14–0.20), between eastern and western demes (FST = 0.19–0.31), or even within the same deme (FST = 0.05–0.09) (Figure S3). Third, the best results of models without location information are justified by the evidence that there have been germplasm translocations. However, models using the option LOCPRIOR provide complementary information on gene flow among the eastern Mediterranean, Italian and southern Iberian populations (Figure 3d).

The results suggest that three relevant natural demographic events took place in the population genetic history of C. sativa in southwest Europe.

The first event was a bottleneck suffered by an ancestral population during the late Miocene or early Pliocene, which decreased its effective population size (Table 2; Table S9). Although the historical parameters for this event have very high bias (Figure S5), the date coincides with progressive cooling and seasonality of rainfall in this period in some areas of the Iberian Peninsula (Jiménez-Moreno et al., 2010). The North Iberian population appears to be a remnant of the Tertiary ancestral population that was fully identified after an intensive sampling in this area (Fernández-Cruz & Fernández-López, 2016; Fernández-López & Monteagudo, 2010). A possible refuge in Northwest Iberia has also been suggested in other studies of population genetics (Martín et al., 2012; Mattioni et al., 2017).

A second identified historical event is the split of the ancestral C. sativa population during the Early to Middle Pleistocene, which led to the separation of Eastern and Western European populations (Figure 3b; Table 2), a pattern also found in T. baccata and Laurus (Mayol et al., 2015; Rodríguez-Sánchez et al., 2009). This divergence was previously identified in C. sativa by Mattioni et al. (2013).

The third event was the formation of the Western Mediterranean population, which is present in the central and southern Iberian Peninsula as well as in Italy. It is a hybrid population formed during expansions of Eastern and Western populations that occurred in interglacial intervals during the Middle and Upper Pleistocene. The ancestry in the Eastern Mediterranean population decreases from east to west (Figure 3b).

4.1 The north-northwestern Iberian chestnut population: A refuge of refuges

The Eurosiberian area of the Iberian Peninsula is a biogeographic region differentiated from the Iberian Mediterranean region (Blanco-Castro, 1997; Ramil-Rego et al., 1998), isolated by the Cantabrian and Galician-Duero Mountains (Figure 6a). Castanea sativa, like other demanding species, probably persisted at the end of the last glacial cycle in deep coastal valleys and lowlands on both sides of the Galician-Duero Mountains and the nearby Ancares Range, where the temperature and precipitation were sufficient (Iriarte-Chiapusso et al., 2016; Muñoz-Sobrino et al., 1997). The North Iberian C. sativa population is one of the vestiges that remained after the contraction that split the global Tertiary population into Eastern and Western populations. Its situation at the extreme of the C. sativa range and its isolation by mountains and the Northern Iberian Meseta left this population less influenced by the expansions from other populations that occurred in the Pleistocene interglacial intervals. Other current Iberian and Italian populations that persisted during the Pleistocene are probably also remains of the Tertiary population, but changed more as a result of further gene flow from the east. The multiple intervals of very harsh climatic conditions in this area explain its loss of diversity. In North Iberia, other species such as white oaks (Petit, Csaikl, et al., 2002), Pinus pinaster (Bucci et al., 2007) and Erica arborea (Désamoré et al., 2011) show a phylogeography distinct from that of the rest of the Iberian Peninsula, and abundant subtropical floral relics have persisted from the Oligocene, a period in which Castanea spp. was already present (in the As Pontes lignite mine; Barrón et al., 2010). The paleobotanical data highlighted this northern area of the Iberian Peninsula and southwestern France as an important chestnut refuge during the LGM (Krebs et al., 2019). A recent analysis of the consequences of ongoing climatic change indicated that, again, North and Northwest Iberian populations will be less affected by drought than populations in central and southern Iberia (Pérez-Girón et al., 2020).

We identified a contraction of the North Iberian population that resulted in the separation of the Atlantic, Western Cantabrian and Eastern Cantabrian populations in independent refuges during the Upper Pleistocene (Figure 6a). This subdivision occurred in two areas with unfavorable conditions for C. sativa during the glacial maximum. One barrier was Picos de Europa (Figure 6a). This barrier could have produced the isolation and subsequent differentiation of the eastern and western Cantabrian populations. There was another barrier in the northwestern corner of the Iberian Peninsula (Figure 6a) that separated the Atlantic and Western Cantabrian populations. In Figure 3 of Krebs et al. (2019) both areas exhibit discontinuities without pollen remains during the LGM.

The current Atlantic population, which is located in the protected coastal valleys of southern Galicia and probably in the contiguous north of Portugal as far as Porto, experienced some of the most favourable climatic conditions for a chestnut refuge during the LGM (Benito-Garzón et al., 2007; Roces-Díaz et al., 2018). Germplasm of this deme is currently also present in inland south Galicia (data not reported), a region where a regular presence of Castanea pollen was identified during the Late Glacial and Early Holocene (Iriarte-Chiapusso et al., 2016). The oldest pollen records in this area, from Porto, date to 71 and 100 ka (Ribeiro et al., 2014, 2019).

Our population structure results indicate that, in the inland northwest region, stands 15, 16, 17 and 20 are located in an area of secondary encounter between the Atlantic and Cantabrian populations (not analysed with ABC scenarios because of the presence of translocated germplasm) (Figure 6a). This result is supported by paleobotanical studies indicating that the most continental areas of northwestern Iberia were recolonized, at least from coastal refuges, during the Late Glacial and Early Holocene (Iriarte-Chiapusso et al., 2016; Muñoz-Sobrino et al., 1997, 2001, 2007).

4.2 Hybrid origin of the Western Mediterranean population

The natural hybrid origin of the current Italian population has been suggested (Fernández-Cruz & Fernández-López, 2016; Fineschi et al., 2000; Mattioni et al., 2017) as a possible alternative to a hybrid origin resulting from translocations and hybridization between local and translocated provenances (Fineschi et al., 2000; Mattioni et al., 2017). However, the hybrid origin of the central and southern Iberian populations has not been discussed previously.

It is possible that both hybrid populations (central and south Iberia and Italy) were part of the same refuge during the Middle Pleistocene, as for white oaks (Petit, Brewer, et al., 2002) and Erica arborea (Désamoré et al., 2011). Gene flow between the populations of both peninsulas could have taken place both in the north and through the Strait of Gibraltar (Figure 6b). The latter was a transmission bridge for other species such as Quercus suber (Magri et al., 2008) and Erica arborea (Désamoré et al., 2011).

4.3 Translocation from Italy to central Iberia

The possibility of a germplasm (WM1) translocation from Italy to central Iberia is supported by our ABC simulations of Project 4, with the maximum probability between the Iron Age and the Roman Empire; other authors have also suggested a translocation (Mattioni et al., 2013). The natural presence of C. sativa in this area predates Roman times. C. sativa has been present there since at least the early Holocene (López-Sáez et al., 2020), with many known occurrences since the mid-Holocene (Carrión et al., 2007; Ruiz-Zapata et al., 2003), and the abundance of Castanea pollen increased between 2550 and 1530 years BP in the area (Abel-Schaad & López-Sáez, 2013). Consequently, it seems that if germplasm translocation did occur, it happened in an area where C. sativa already existed.

4.4 The origin of the Courel gene pool

The germplasm WM2, which is dominant in Courel (stands 18 and 19), has great economic importance since a portion of the grafted northwestern varieties used for nut production belong to this gene pool, pure in some varieties or hybridized with the Atlantic or Western Cantabrian populations in others (Fernández-López & Fernández-Cruz, 2015). Most grafted varieties cultivated in Courel orchards have high ancestry in WM2 germplasm. Natural or naturalized individuals from northwest Iberia with high ancestry in WM2 were identified only in Courel or close to Courel (data not reported).

There are a number of reasons to presume that this germplasm was translocated to Courel. First, it is a population located in the Galician inner mountains that Bayesian computation assigns to the western Mediterranean group together with the germplasm existing in the centre of the Iberian Peninsula and in Italy (Figure 6a). Second, it displays higher FST values with the Atlantic and western Cantabrian demes than with stands present in Italy or the centre of the Iberian Peninsula (Figure S3). Third, in the northwest of the Iberian Peninsula, it is linked to the cultivation of a high number of grafted varieties and to their naturalization (data not reported). Finally, WM2 germplasm is also present in Montes de Toledo (stand 24; Figure 6a), where it could have arrived through a translocation.

Most trees in Courel orchards were planted 300 years ago with progenies of several old local trees, 500–1000 years old, which belong to the WM2 gene pool (data not shown). A medieval expansion of C. sativa in a zone close to Courel was dated to the 10th century by van Mourik (1986), coinciding with the age of the oldest existing trees in Courel orchards, which belong to the WM2 genetic group that is dominant there. The low genetic diversity of Courel stands (Figure S1; Table S5b) was probably caused by a foundation event through plantations with a few progenies collected from the oldest trees. In addition, the observed heterozygosity in Courel is among the highest of the whole data set, and the negative Fis values (Figure S1) indicate that the founding germplasm was sufficiently differentiated genetically.

On the other hand, its autochthonous origin is supported by the following arguments: first, in the dendrogram for K = 8 populations, WM2 is the most similar to the northern Iberian demes (Figure 6b); second, our Project 5 indicated that it could be a population formed during the Last Interglacial by admixture of the North Iberian population with the Western Mediterranean population, with a high admixture rate with the North Iberian population (r = 0.70–0.80) (Table 1; Table S9e); third, the medium-late flushing of Courel germplasm in a provenance test indicates a northern origin, although its flushing is not as late as that of other northern stands (11 and 16) (Fernández-López, Zas, Blanco-Silva, et al., 2005; Míguez-Soto et al., 2019).

Paleobotanical research demonstrated recolonization in neighbouring mountain areas during the Upper Pleistocene and Early Holocene (Muñoz-Sobrino et al., 1997), but the recolonizing gene pool could belong to the Atlantic and western Cantabrian demes. Another possibility is recolonization from humid or subhumid inland slopes and lowlands by germplasm closer to the central Iberian one.

In conclusion, the origin of the Courel gene pool is still unknown. Our analyses do not establish whether the Courel gene pool is autochthonous or translocated. Therefore, further research is required to provide evidence of the origin of this stand, including more sampling areas close to, and far from, Courel and a larger number of molecular markers.

One explanation of its separation as a different population in K = 8 (Figure 3e) may be attributed to a foundation event by a reduced number of progenies, as occurred in controlled translocations of other organisms (Wright et al., 2014).

4.5 Other germplasm translocated to northwestern Iberia

Finally, the germplasm WM3 present in stands 16, 18 and 24 (Figure 6a,b) probably comes from the naturalization of the traditional variety “Luguesa”, which could be an introduced variety in northwestern Iberia (Fernández-López & Fernández-Cruz, 2015), probably from southeast France (Bouffartigue et al., 2020). “Luguesa” is one of the earliest flushing and producing varieties in northwestern Iberia (Fernández-López, 2014), a characteristic consistent with the hypothesized origin.

Recent within-range translocations of different groups of germplasm (WM1, WM2 and WM3) to various Iberian stands located in orchard areas have given rise to populations with high allelic richness, gene diversity and observed heterozygosity (stands 15, 16, 20 and 24; Figure S1), as is typical of within-range translocations and as has been demonstrated in other species (Wagner et al., 2015). The high frequency of hybrids in these stands is probably an illustration of the ongoing naturalization process.

Dating of the demographic events described should be regarded as an approximation because Diyabc is only useful for a general perspective (Cabrera & Palsbøll, 2017). In addition, instead of a generation time of 100 years, a value of 200 or 300 years could be used, because of chestnut longevity. The species’ extraordinary capacity of resprouting from stumps could have been a survival mechanism for hundreds of years in the unfavorable conditions that must have occurred during pleniglacials. Estimation of historical parameters could also be affected by the repetition of processes during successive interglacial periods. In addition, as the genetic analyses relied on eight microsatellites, which is considered a low number for this kind of study, the results might lack power for investigating more complex models.
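The sensitivity of the dates to the assumed generation time is a simple rescaling, sketched below. The function name is hypothetical; the example interval is the Project 2 estimate for the east–west split (t3 = 1.37–0.92 Ma in Table 2), which was obtained with a generation time of 100 years.

```python
def rescale_interval(interval_years, old_generation_time, new_generation_time):
    """Rescale an (older, younger) date interval in years to a new generation time.

    The dates are first converted back to generations, which is the unit in which
    the parameters were estimated, and then multiplied by the new generation time.
    """
    older, younger = interval_years
    older_generations = older // old_generation_time
    younger_generations = younger // old_generation_time
    return (older_generations * new_generation_time,
            younger_generations * new_generation_time)


# t3 for Project 2 in Table 2: 1.37-0.92 Ma under a 100-year generation time.
t3_at_100 = (1_370_000, 920_000)
t3_at_200 = rescale_interval(t3_at_100, 100, 200)
t3_at_300 = rescale_interval(t3_at_100, 100, 300)
```

With generation times of 200 and 300 years, the same estimate would correspond to 2.74–1.84 Ma and 4.11–2.76 Ma, respectively, so the geological period assigned to an event depends strongly on this assumption.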

In conclusion, the populations of C. sativa in southwestern Europe have a genetic structure compatible with a natural origin, in which signs of population contractions and expansions caused by climatic oscillations since the Late Miocene have been imprinted, with allopatrically differentiated demes and hybrid demes formed during secondary re-encounters. It has long been assumed that there was a simultaneous movement of chestnut germplasm across Europe (Pitte, 1986). Our research provides arguments in support of this theory by identifying possibly translocated germplasm that in this analysis has been assigned to what we call the western Mediterranean population.

However, the large areas of orchards, which occupy thousands of hectares, have limited representation in this research. Their genetic structures have probably been modified mostly by translocated germplasm, its hybridization with the local germplasm, the effects of artificial selection and the use of clonal propagation, resulting in a variety of different landraces (Camacho-Villa et al., 2005).

ACKNOWLEDGEMENTS

Sampling and genotyping were carried out over a period of time with funds from various sources: the INTERREG project “CASTANEAREG”; INCITE, Xunta de Galicia, project “The evolutive origin of the European chestnut in the Atlantic Galicia”; and INIA, National Institute of Agronomic Research, “Genetic structure of Castanea sativa populations”. We are grateful to Roberto Costas Gándara for his work with Figures 1-6. We thank Lucy Muir, PhD from Edanz Group (https://en-author-services.edanzgroup.com/ac) for editing a draft of this manuscript. The linguistic corrections were paid for by Josefa Fernández-López.

    AUTHOR CONTRIBUTIONS

    Josefa Fernández-López designed and performed the research. Josefa Fernández-López, Javier Fernández-Cruz and Beatriz Míguez-Soto were responsible for data collection, analysis and interpretation, and for writing the manuscript.

    DATA AVAILABILITY STATEMENT

    Data for 715 individuals from 29 stands at eight microsatellite loci have been submitted to DRYAD (https://doi.org/10.5061/dryad.b8gtht7bj). The geographic locations of the stands and their descriptions are given in Table S1.
