Volume 142, Issue 4 pp. 531-539
Research Article

Mitochondrial DNA patterns in the Iberian Northern plateau: Population dynamics and substructure of the Zamora province

Luis Alvarez (Corresponding Author)

Unitat Antropologia Biològica, Universitat Autònoma de Barcelona, 08193 Bellaterra, Barcelona, Spain

Correspondence address: Unitat de Antropologia Biològica, Edifici C, Universitat Autònoma de Barcelona, 08193 Bellaterra, Barcelona, Spain

Cristina Santos

Unitat Antropologia Biològica, Universitat Autònoma de Barcelona, 08193 Bellaterra, Barcelona, Spain

Amanda Ramos

Unitat Antropologia Biològica, Universitat Autònoma de Barcelona, 08193 Bellaterra, Barcelona, Spain

Roser Pratdesaba

Unitat Antropologia Biològica, Universitat Autònoma de Barcelona, 08193 Bellaterra, Barcelona, Spain

Departament de Genètica, Facultat de Biologia, Genetica Evoluzionistica, Universitat de Barcelona, 08071 Barcelona, Spain

Paolo Francalacci

Dipartamento di Zoologia e Genetica Evolucionista, Università di Sassari, 07100 Sassari, Italy

María Pilar Aluja

Unitat Antropologia Biològica, Universitat Autònoma de Barcelona, 08193 Bellaterra, Barcelona, Spain

First published: 01 February 2010

Abstract

Several studies have shown the importance of recent events in the configuration of the genetic landscape of a specific territory. In this context, due to the phenomena of repopulation and demographic fluctuations that took place in recent centuries, the Iberian Northern plateau is a very interesting case study. The main aim of this work is to test whether recent population movements, together with existing boundaries (geographical and administrative), have influenced the current genetic composition of the area. To accomplish this general purpose, mitochondrial DNA variations of 214 individuals from a population located in the Western region of the Iberian Northern plateau (the province of Zamora) were analyzed. Results showed a typical Western European mitochondrial DNA haplogroup composition. However, unexpectedly high frequencies of U5, HV0, and L haplogroups were found in some regions. The analyses of microdifferentiation showed that there are differences between regions, but no geographic substructure organization can be noticed. It can be stated that the differences observed in the genetic pool of the sampled area at the regional level result from the mixture of different populations carrying new lineages into this area at different points in history. Am J Phys Anthropol 142:531–539, 2010. © 2010 Wiley-Liss, Inc.

Studies on European genetic diversity have focused mainly on large-scale variation and interpretations based on prehistoric events. However, the complex history of the Iberian Peninsula provides a suitable region to examine the demographic impact of recent historical events on the current genetic landscape. In this regard, Alvarez et al. (2009) and Adams et al. (2008), using Y chromosome data, proved how genetic diversity can be affected by recent population movements. Moreover, Larruga et al. (2001), using mitochondrial DNA (mtDNA) data, showed that recent time isolation, due to cultural differences, clearly influenced the genetic composition of Maragatos, a population located in the western mountainous region of the León province (Northern Spain).

Archaeological and historical data available for the North-Western extreme of the Iberian plateau (see Fig. 1) reveal a temporal continuum of human activity from the first human settlements, dating from the mid-Pleistocene period (Santoja, 1992), to the present. One of the periods that critically affected the demographic history of the area spanned from the mid-8th century to the 11th century, when land along the Duero River line (Fig. 1a) remained practically uninhabited, becoming a defensive corridor between the Astur kingdom in the North and the Muslim kingdom in the South.

Figure 1. Geographical location of the Northern Plateau in the Iberian Peninsula. (a) Main rivers of the area; (b) position of the Zamora Province (gray color); (c) regional subdivision of the Zamora Province (codes for the regions as summarized in Table 1).

Concerning the Zamora province (Fig. 1b), historical data indicate that the territories gained by Northern kingdoms on the North of the Duero River at the end of the 8th century [present day Aliste, Benavente, Campos-Pan, and Sanabria regions (Fig. 1c)] were resettled by individuals from two main origins (Delibes and Moreta, 1995): (1) Mozarabs from the Southern Muslim kingdoms (town of Coria and cities of Toledo, Madrid, and Mérida); and (2) people from the North-Western kingdoms (present-day territories of Asturias, Galicia, and León). The southern areas of the Duero River [present-day Sayago and Bajo Duero regions (Fig. 1c)], were only definitively incorporated into the Astur kingdom in the 11th century, when resettlement of the area was promoted. Based on historical sources (Delibes and Moreta, 1995), three main locations have been proposed as the origins of the settlers: (1) North-Western Iberia, including people from the Northern regions of the Zamora province; (2) Al-Ándalus, mainly Mozarabs; and (3) Southern France, including individuals from areas known today as Gascony, Poitou, Périgord, Montpellier, Provence, and Lombardy.

As mentioned earlier, the North-Western extreme of the Iberian plateau, and within this the Zamora province, appears as a very interesting case-study population, which can be used to evaluate how relatively recent historical events could influence the genetic structure of human populations. Moreover, the geography, orography, and demography of the Zamora province also present interesting characteristics that could be used to test several additional hypotheses concerning the genetic substructure of the region.

To date, there are no genetic studies centered on the North-Western extreme of the Iberian plateau; thus, an integrative study of the region using different genetic systems is being undertaken. In this study, mtDNA variations of each of the six regions of the Zamora province are presented. The main goals of this work are (a) to infer if the different resettlements that took place during the period that ranged from the 8th to the 11th centuries are reflected in the current genetic composition of the area; (b) to check if the recent depopulation of the province (INE, 2008), mainly occurring in the Sanabria region during the 20th century, has had effects on the present-day genetic diversity; and (c) to infer the impact of natural and political boundaries on the genetic microdifferentiation of the province.

MATERIALS AND METHODS

Samples and comparative data sets

Two hundred and fourteen maternally unrelated individuals, born and with maternal origin in the Zamora province (confirmed up to the third generation), were sampled in regional health centers. The origin of the samples was defined by the birthplace of the grandmother in one of the six regions into which the Zamora province can be subdivided: 32 samples from Aliste, 37 from Bajo Duero, 40 from Benavente, 41 from Campos-Pan, 31 from Sanabria, and 33 from Sayago. For all voluntary donors, appropriate informed consent and the birthplaces of all their known maternal ancestors up to the third generation were obtained under strictly confidential circumstances.

Total DNA from blood samples was extracted using the Jetquick Blood & Cell kit (Genomed) according to the manufacturer's specifications. For each sample, the mtDNA hypervariable region I (HVRI) was amplified and sequenced, between positions 16,024 and 16,399, and coding region informative polymorphisms (mtDNA positions: 1,018; 3,010; 4,580; 6,776; 7,028; 8,994; 9,055; 10,238; 11,251; 12,308; 12,705; 14,766) for haplogroup (Hg) assignment were analyzed by PCR-RFLP as previously detailed by Santos et al. (2003, 2004). Moreover, positions 10,400 and 10,873 were also analyzed; the fragment enclosing these mtDNA positions was amplified and sequenced using primers and conditions described by Ramos et al. (2009). Sequence reactions were carried out using the sequencing kit BigDye Terminator v.3 (Applied Biosystems) according to the manufacturer's specifications and run in an ABI 3130XL sequencer (Genomics Unit, Universitat Autònoma de Barcelona).

For comparative purposes 1,023 mtDNA HVRI Iberian sequences (nucleotide positions 16,024–16,383) were compiled and used for the diversity and genetic distance analysis (populations, sample size, and references are listed in Table 1). Additionally, 341 mtDNA HVRI sequences belonging to African macro-Hg L (nucleotide positions 16,090–16,365) were used for the shared haplotype (Ht) analysis of the African lineages. In this last dataset, original population data were grouped into the following geographic areas: Iberian Peninsula (Pereira et al., 2005); Northern Africa (Corte-Real et al., 1996; Rando et al., 1998; Krings et al., 1999; Brakez et al., 2001); South-eastern Africa (Pereira et al., 2001; Salas et al., 2002); Macaronesia (Rando et al., 1999; Brehm et al., 2002, 2003; Santos et al., 2003); and Central-West Africa (Watson et al., 1997; Rando et al., 1998; Plaza et al., 2004; Rosa et al., 2004).

Table 1. Number of inhabitants and mtDNA HVRI diversity (nucleotide positions 16,024–16,383) in the Zamora province and in the populations used for comparison
Population Code No. inhabitants N K (% K) S (% S) Ĥ ± sd π ± sd θK θS
Zamora ZA 197,221 214 107 (50.00) 80 (22.22) 0.9510 ± 0.0112 0.01485 ± 0.00796 84.5001 13.4661
 Aliste AL 14,090 32 22 (68.75) 35 (09.72) 0.9577 ± 0.0235 0.01686 ± 0.00917 29.8144 8.6908
 Bajo Duero BD 21,688 37 24 (64.85) 34 (09.44) 0.9384 ± 0.0310 0.01313 ± 0.00730 28.4963 8.1446
 Benavente BV 40,462 40 29 (72.50) 39 (10.83) 0.9718 ± 0.0158 0.01219 ± 0.00683 45.8968 9.1688
 Campos-Pan CP 34,240 41 29 (70.73) 40 (11.11) 0.9659 ± 0.0183 0.01600 ± 0.00869 42.7824 9.3490
 Sanabria SN 10,283 31 20 (64.52) 36 (10.00) 0.8581 ± 0.0635 0.01210 ± 0.00684 23.2933 9.0113
 Sayago SY 9,786 33 22 (66.67) 27 (07.50) 0.9659 ± 0.0178 0.01825 ± 0.00984 27.6781 6.6527
Mainland Spain MS 22,679,772 528 238 (45.08) 124 (34.44) 0.9463 ± 0.0079 0.01226 ± 0.00670 166.2373 18.1144
 Andalusia AN 8,202,220 99 76 (76.77) 74 (20.56) 0.9734 ± 0.0114 0.01532 ± 0.00823 147.9744 14.3209
 Basque Country BQ 2,157,112 100 44 (44.00) 41 (11.39) 0.9287 ± 0.0186 0.00961 ± 0.00548 29.4604 7.9190
 Castile CA 1,656,912 38 29 (76.32) 38 (10.56) 0.9644 ± 0.0213 0.01290 ± 0.00719 54.1794 9.0442
 Catalonia CT 7,364,078 46 28 (58.33) 35 (09.72) 0.9179 ± 0.0345 0.01214 ± 0.00678 29.4099 7.9637
 Galicia GA 2,784,169 135 84 (62.22) 75 (20.83) 0.9544 ± 0.0138 0.01216 ± 0.00669 94.0145 13.6892
 Leon LE 500,200 61 44 (72.13) 53 (14.72) 0.9410 ± 0.0256 0.01220 ± 0.00677 69.3584 11.3251
 Maragatos MG 15,081 49 24 (48.98) 34 (09.44) 0.8954 ± 0.0362 0.01118 ± 0.00630 17.9520 7.6254
North Portugal MNP 3,294,006 187 114 (60.96) 85 (23.61) 0.9496 ± 0.0128 0.01499 ± 0.00803 122.9941 14.6409
 Braga BR 862,191 41 34 (82.93) 43 (11.94) 0.9793 ± 0.0150 0.01626 ± 0.00881 90.9577 10.0501
 Bragança BG 142,049 42 28 (66.67) 36 (10.00) 0.9489 ± 0.0246 0.01544 ± 0.00841 35.5456 8.3664
 Porto PO 1,820,752 67 44 (65.67) 35 (16.39) 0.9417 ± 0.0233 0.01467 ± 0.00796 54.6198 12.3575
 Viana do Castelo VC 251,676 17 14 (82.35) 29 (08.06) 0.9559 ± 0.0436 0.01668 ± 0.00936 34.6813 8.5780
 Vila Real VL 217,338 20 13 (65.00) 22 (06.11) 0.9105 ± 0.0538 0.00994 ± 0.00588 15.0011 6.2011
Center Portugal MCP 2,413,312 179 107 (59.78) 87 (24.17) 0.9610 ± 0.0107 0.01436 ± 0.00773 111.2688 15.0994
South Portugal MSP 4,419,562 183 108 (59.02) 84 (23.33) 0.9618 ± 0.0102 0.01662 ± 0.00881 109.7769 14.5229
  • N, sample size; K, number of different haplotypes; S, number of polymorphic sites; Ĥ, gene diversity; π, nucleotide diversity; θK, theta estimator based on the number of different haplotypes; θS, theta estimator based on the number of polymorphic sites.
  • a Present study.
  • b Larruga et al., 2001.
  • c Plaza et al., 2003.
  • d Bertranpetit et al., 1995.
  • e Alfonso-Sanchez et al., 2008.
  • f Salas et al., 1998.
  • g Gonzalez et al., 2003.
  • h Pereira et al., 2004.

Data analysis

Samples were assigned to Hgs using the combined information of HVRI and coding region variations following the updated phylogenetic classification proposed by van Oven and Kayser (2009).

The Hg frequencies were determined by counting and the Bayesian 0.95 credible region (95% CR) was calculated using the SAMPLING programme (Macaulay, personal communication). The Hg distribution observed in the six regions of Zamora was compared using the exact test of population differentiation, as described in Raymond and Rousset (1995). To assess global differences in the Hg composition of a specific area, correspondence analysis, using Hgs and regions, and χ2 tests were conducted in the SPSS ver. 15.0.1 software (SPSS Inc.).
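The credible regions attached to each Hg frequency can be illustrated with a short sketch. The source does not describe how the SAMPLING programme works internally, so the following assumes a uniform prior on the frequency, which gives a Beta(k + 1, n − k + 1) posterior for k carriers among n individuals, and takes the central 95% interval. Under that assumption, a count of 0 in 32 individuals gives roughly 0.1–10.6%. The function names are illustrative only.

```python
from math import comb

def beta_posterior_cdf(x, k, n):
    """CDF at x of Beta(k + 1, n - k + 1), the posterior under a uniform prior.

    Uses the exact identity for integer parameters:
    I_x(k + 1, n - k + 1) = P(Binomial(n + 1, x) >= k + 1).
    """
    m = n + 1
    below = sum(comb(m, j) * x**j * (1 - x) ** (m - j) for j in range(k + 1))
    return 1.0 - below

def credible_region(k, n, mass=0.95, tol=1e-10):
    """Central credible interval for a frequency, given k carriers in n samples."""
    def quantile(q):
        # Bisection works because the CDF is monotone increasing on [0, 1].
        lo, hi = 0.0, 1.0
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if beta_posterior_cdf(mid, k, n) < q:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    tail = (1 - mass) / 2
    return quantile(tail), quantile(1 - tail)

# Example: a haplogroup absent from a sample of 32 individuals.
low, high = credible_region(0, 32)
print(f"{100 * low:.1f}% to {100 * high:.1f}%")
```

Bisection on the exact binomial-tail identity avoids any dependency on a statistics library, at the cost of being slower than a dedicated inverse-beta routine.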

Because the impact of genetic drift on the distribution of mtDNA Hts in a population is not equally captured by different diversity indicators, HVRI sequences between positions 16,024 and 16,383 were used to estimate several standard and molecular diversity indices using Arlequin 3.1 software (Excoffier et al., 2005), namely, the gene diversity (Ĥ) (Nei, 1987), the number of different Hts (K), the number of polymorphic sites (S), the nucleotide diversity (π) (Tajima, 1983; Nei, 1987), and the theta estimators based on the number of polymorphic sites (θS) (Watterson, 1975) and on the number of different Hts (θK) (Ewens, 1972), this last estimator being more sensitive to genetic drift according to Helgason et al. (2003). The Tamura and Nei (1993) nucleotide substitution model with a gamma correction (α = 0.205) (Bandelt et al., 2006) was used.
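As an illustration of how several of these indices are defined, the sketch below computes K, S, Ĥ, π, and θS from a list of aligned sequences. It is not a reimplementation of Arlequin: π is computed here from raw pairwise differences rather than under the Tamura and Nei model with gamma correction, θK is omitted because it requires solving Ewens' equation numerically, and the toy sequences are invented for the example. As a consistency check, S = 80 and N = 214 give θS ≈ 13.466, in line with the Zamora row of Table 1.

```python
from collections import Counter
from itertools import combinations

def watterson_theta(s, n):
    """Watterson's estimator: S divided by the harmonic number a1 = sum(1/i, i < n)."""
    a1 = sum(1 / i for i in range(1, n))
    return s / a1

def hamming(a, b):
    """Number of positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def diversity_indices(seqs):
    """Basic diversity indices for a list of equal-length aligned sequences."""
    n = len(seqs)
    length = len(seqs[0])
    counts = Counter(seqs)                       # haplotype -> copies
    k = len(counts)                              # number of different haplotypes
    s = sum(len(set(col)) > 1 for col in zip(*seqs))  # polymorphic sites
    # Nei's unbiased gene diversity.
    gene_div = n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))
    pairs = list(combinations(seqs, 2))
    mean_diff = sum(hamming(a, b) for a, b in pairs) / len(pairs)
    return {
        "N": n,
        "K": k,
        "S": s,
        "H": gene_div,
        "pi": mean_diff / length,                # uncorrected per-site diversity
        "theta_S": watterson_theta(s, n),
    }

# Invented toy alignment (not data from the study).
toy = ["ACGT", "ACGT", "ACGA", "TCGA"]
print(diversity_indices(toy))
print(round(watterson_theta(80, 214), 3))
```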

Inferences regarding past demographic effects on the genetic variation in current populations were conducted by comparing mismatch distributions of pairwise nucleotide differences between Hts to those expected under a sudden population expansion model (Slatkin and Hudson, 1991; Rogers and Harpending, 1992; Rogers, 1995). Statistical significance for the mismatch distributions was obtained using a goodness-of-fit test based on the sum of squared deviations between the observed and expected distributions (Schneider and Excoffier, 1999) and Harpending's raggedness index (rg) (Harpending, 1994), after 1,000 simulations, using the estimated parameters of the expected distribution for a population expansion. Demographic history was further assessed using Tajima's D statistic (Tajima, 1989).
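The observed quantities behind these tests can be sketched as follows. The code builds the mismatch distribution from raw pairwise differences, computes Harpending's raggedness index as the sum of squared differences between successive classes, and evaluates Tajima's D with the standard constants from Tajima (1989). It does not fit the expansion model or run the 1,000 simulations used for significance testing, and the toy alignment is invented.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

def pairwise_differences(seqs):
    """Raw number of differing sites for every pair of aligned sequences."""
    return [sum(x != y for x, y in zip(a, b)) for a, b in combinations(seqs, 2)]

def mismatch_distribution(seqs):
    """Relative frequency of pairs showing 0, 1, 2, ... differences."""
    diffs = pairwise_differences(seqs)
    counts = Counter(diffs)
    return [counts.get(i, 0) / len(diffs) for i in range(max(diffs) + 1)]

def raggedness(dist):
    """Harpending's r: sum over i = 1..d+1 of (x_i - x_{i-1})^2, with x_{d+1} = 0."""
    padded = list(dist) + [0.0]
    return sum((padded[i] - padded[i - 1]) ** 2 for i in range(1, len(padded)))

def tajimas_d(seqs):
    """Tajima's D from the mean pairwise difference and the segregating sites."""
    n = len(seqs)
    s = sum(len(set(col)) > 1 for col in zip(*seqs))
    if s == 0:
        return 0.0  # D is undefined without segregating sites; 0 is a convention here
    diffs = pairwise_differences(seqs)
    k_hat = sum(diffs) / len(diffs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (k_hat - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))

# Invented toy alignment (not data from the study).
toy = ["ACGT", "ACGT", "ACGA", "TCGA"]
print(mismatch_distribution(toy), raggedness(mismatch_distribution(toy)), tajimas_d(toy))
```

A smooth, unimodal mismatch distribution yields a small raggedness value, whereas a multimodal one yields a larger value; negative D values arise when rare variants are in excess relative to the mean pairwise difference.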

To investigate the relation between populations and the effect of natural boundaries on the genetic structure, AMOVA analyses (Excoffier et al., 1992), using both Hg frequencies and HVRI sequence variation, were conducted with the Arlequin 3.1 software (Excoffier et al., 2005) to test the following geographically based structures: (1) all the regions within one group; (2) regions north of the Duero River versus southern regions; (3) eastern regions versus western regions; and (4) regions divided by the Duero and Esla rivers.

To analyze the situation of the Zamora regions in the context of the adjacent Iberian populations (Galicia, León, and Bragança), and although the sampling units are different, Slatkin's linearized FST pairwise genetic distance (Slatkin, 1995) was calculated from HVRI sequences with Arlequin 3.1 (Excoffier et al., 2005). The matrices obtained were then represented in two-dimensional space by means of the multidimensional scaling (MDS) procedure in SPSS ver. 15.0.1 (SPSS Inc.). Moreover, Barrier software ver. 2.2, which implements the Monmonier algorithm (Manni et al., 2004), was used to calculate barriers relating geography to Slatkin's linearized FST pairwise genetic distance (Slatkin, 1995). To assess the robustness of calculated barriers, 100 resampled bootstrap data sets were generated using SAS software (SAS Institute Inc.) [Calculation available on request].
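For reference, Slatkin's linearization transforms a pairwise FST into FST / (1 − FST). The sketch below applies it to a small matrix of pairwise values. The FST values and region labels are hypothetical placeholders, and setting negative FST estimates to zero before the transformation is an assumption of this sketch, not something stated in the source.

```python
def slatkin_linearized(fst):
    """Slatkin's linearized distance FST / (1 - FST).

    Negative estimates are treated as zero (an assumption of this sketch).
    """
    if fst >= 1:
        raise ValueError("FST must be below 1 for the linearization to be defined")
    fst = max(fst, 0.0)
    return fst / (1 - fst)

def linearize_matrix(pairwise_fst):
    """Apply the transformation to a dict keyed by (population, population)."""
    return {pair: slatkin_linearized(value) for pair, value in pairwise_fst.items()}

# Hypothetical pairwise FST values, for illustration only.
example = {("A", "B"): 0.2, ("A", "C"): 0.05, ("B", "C"): -0.01}
for pair, dist in linearize_matrix(example).items():
    print(pair, round(dist, 4))
```

The resulting distances are the kind of input that an MDS or barrier analysis would take as its dissimilarity matrix.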

RESULTS

Hg and HVRI variation

Detailed mtDNA results are presented as online supporting material (Supporting Information Table 1); the 214 sampled individuals represent 123 distinct Ht.

Frequencies of mtDNA Hgs found in the Zamora province and in each region are listed in Table 2. The Hg distribution in the whole province showed, in general, typical frequencies for a Western European population; however, unexpectedly high frequencies of Hgs HV0 and L were found in the Bajo Duero and Sayago regions (χ2 test, P = 0.001 and P < 0.001, respectively), and a high frequency of Hg U5 was found in the Aliste region (χ2 test, P = 0.003) when compared with the rest of the regions.

Table 2. Hg distribution in the six regions of the Zamora province [frequency in % (95% CR)]
Hg AL (N = 32) BD (N = 37) BV (N = 40) CP (N = 41) SN (N = 31) SY (N = 33) Total ZA (N = 214)
L1b 0.0 (0.1–10.6) 0.0 (0.1–9.3) 0.0 (0.1–7.0) 0.0 (0.1–8.4) 3.2 (0.8–16.2) 9.1 (3.3–23.7) 1.9 (0.8–4.7)
L2b 0.0 (0.1–10.6) 5.4 (1.7–17.7) 0.0 (0.1–7.0) 0.0 (0.1–8.4) 0.0 (0.1–10.9) 0.0 (0.1–10.3) 0.9 (0.3–3.3)
L3b 0.0 (0.1–10.6) 2.7 (0.6–13.8) 0.0 (0.1–7.0) 0.0 (0.1–8.4) 0.0 (0.1–10.9) 9.1 (3.3–23.7) 1.9 (0.8–4.7)
M1 0.0 (0.1–10.6) 0.0 (0.1–9.3) 0.0 (0.1–7.0) 2.4 (0.6–12.6) 0.0 (0.1–10.9) 0.0 (0.1–10.3) 0.5 (0.1–2.6)
N1b 3.1 (0.7–15.8) 0.0 (0.1–9.3) 0.0 (0.1–7.0) 2.4 (0.6–12.6) 0.0 (0.1–10.9) 0.0 (0.1–10.3) 0.9 (0.3–3.3)
I 0.0 (0.1–10.6) 0.0 (0.1–9.3) 0.0 (0.1–7.0) 4.9 (1.5–16.2) 3.2 (0.8–16.2) 0.0 (0.1–10.3) 1.4 (0.5–4.0)
W 0.0 (0.1–10.6) 2.7 (0.6–13.8) 0.0 (0.1–7.0) 2.4 (0.6–12.6) 0.0 (0.1–10.9) 0.0 (0.1–10.3) 0.9 (0.3–3.3)
X 0.0 (0.1–10.6) 2.7 (0.6–13.8) 0.0 (0.1–7.0) 0.0 (0.1–8.4) 0.0 (0.1–10.9) 3.0 (0.7–15.3) 0.9 (0.3–3.3)
R0 0.0 (0.1–10.6) 0.0 (0.1–9.3) 0.0 (0.1–7.0) 0.0 (0.1–8.4) 3.2 (0.8–16.2) 0.0 (0.1–10.3) 0.5 (0.1–2.6)
R0a 0.0 (0.1–10.6) 0.0 (0.1–9.3) 0.0 (0.1–7.0) 2.4 (0.6–12.6) 0.0 (0.1–10.9) 0.0 (0.1–10.3) 0.5 (0.1–2.6)
R1 3.1 (0.7–15.8) 0.0 (0.1–9.3) 0.0 (0.1–7.0) 0.0 (0.1–8.4) 0.0 (0.1–10.9) 0.0 (0.1–10.3) 0.5 (0.1–2.6)
HV0 3.1 (0.7–15.8) 8.1 (2.9–21.4) 0.0 (0.1–7.0) 0.0 (0.1–8.4) 3.2 (0.8–16.2) 15.2 (6.8–31.1) 4.7 (2.6–8.4)
HV 3.1 (0.7–15.8) 0.0 (0.1–9.3) 0.0 (0.1–7.0) 0.0 (0.1–8.4) 3.2 (0.8–16.2) 0.0 (0.1–10.3) 0.9 (0.3–3.3)
V 3.1 (0.7–15.8) 8.1 (2.9–21.4) 5.0 (1.5–16.5) 2.4 (0.6–12.6) 3.2 (0.8–16.2) 3.0 (0.7–15.3) 4.2 (2.3–7.8)
H* 3.1 (0.7–15.8) 5.4 (1.7–17.7) 30.0 (18.1–45.5) 7.3 (2.7–19.5) 16.1 (7.2–32.8) 9.1 (3.3–23.7) 12.1 (8.4–17.2)
H1 18.8 (9.0–35.5) 18.9 (9.6–34.3) 15.0 (7.2–29.2) 24.4 (13.9–39.5) 22.6 (11.5–40.0) 3.0 (0.7–15.3) 17.3 (12.8–22.9)
H3 9.4 (3.4–24.3) 10.8 (4.4–24.8) 10.0 (4.1–23.1) 9.8 (4.0–22.6) 3.2 (0.8–16.2) 15.2 (6.8–31.1) 9.8 (6.5–14.5)
J* 12.5 (5.1–28.2) 8.1 (2.9–21.4) 2.5 (0.6–12.9) 4.9 (1.5–16.2) 9.7 (3.5–25.0) 3.0 (0.7–15.3) 6.5 (4.0–10.7)
J1 0.0 (0.1–10.6) 2.7 (0.6–13.8) 0.0 (0.1–7.0) 0.0 (0.1–8.4) 0.0 (0.1–10.9) 0.0 (0.1–10.3) 0.5 (0.1–2.6)
J1b1 0.0 (0.1–10.6) 0.0 (0.1–9.3) 0.0 (0.1–7.0) 4.9 (1.5–16.2) 0.0 (0.1–10.9) 0.0 (0.1–10.3) 0.9 (0.3–3.3)
J2a1 0.0 (0.1–10.6) 0.0 (0.1–9.3) 5.0 (1.5–16.5) 0.0 (0.1–8.4) 0.0 (0.1–10.9) 0.0 (0.1–10.3) 0.9 (0.3–3.3)
J2b 0.0 (0.1–10.6) 0.0 (0.1–9.3) 0.0 (0.1–7.0) 7.3 (2.7–19.5) 0.0 (0.1–10.9) 9.1 (3.3–23.7) 2.8 (1.3–6.0)
T* 0.0 (0.1–10.6) 2.7 (0.6–13.8) 0.0 (0.1–7.0) 0.0 (0.1–8.4) 0.0 (0.1–10.9) 0.0 (0.1–10.3) 0.5 (0.1–2.6)
T1 3.1 (0.7–15.8) 0.0 (0.1–9.3) 2.5 (0.6–12.9) 0.0 (0.1–8.4) 0.0 (0.1–10.9) 0.0 (0.1–10.3) 0.9 (0.3–3.3)
T2b 6.3 (1.9–20.2) 0.0 (0.1–9.3) 0.0 (0.1–7.0) 2.4 (0.6–12.6) 3.2 (0.8–16.2) 0.0 (0.1–10.3) 1.9 (0.8–4.7)
T2c 0.0 (0.1–10.6) 0.0 (0.1–9.3) 0.0 (0.1–7.0) 0.0 (0.1–8.4) 3.2 (0.8–16.2) 0.0 (0.1–10.3) 0.5 (0.1–2.6)
T2e 0.0 (0.1–10.6) 2.7 (0.6–13.8) 0.0 (0.1–7.0) 0.0 (0.1–8.4) 0.0 (0.1–10.9) 0.0 (0.1–10.3) 0.5 (0.1–2.6)
U* 0.0 (0.1–10.6) 0.0 (0.1–9.3) 7.5 (2.7–19.9) 2.4 (0.6–12.6) 6.5 (2.0–20.8) 0.0 (0.1–10.3) 2.8 (1.3–6.0)
U2 0.0 (0.1–10.6) 0.0 (0.1–9.3) 2.5 (0.6–12.9) 4.9 (1.5–16.2) 0.0 (0.1–10.9) 0.0 (0.1–10.3) 1.4 (0.5–4.0)
U5 12.5 (5.1–28.2) 2.7 (0.6–13.8) 7.5 (2.7–19.9) 7.3 (2.7–19.5) 3.2 (0.8–16.2) 6.1 (1.9–19.7) 6.5 (4.0–10.7)
U5a 6.3 (1.9–20.2) 0.0 (0.1–9.3) 2.5 (0.6–12.9) 2.4 (0.6–12.6) 3.2 (0.8–16.2) 0.0 (0.1–10.3) 2.3 (1.0–5.3)
U5a1 3.1 (0.7–15.8) 8.1 (2.9–21.4) 2.5 (0.6–12.9) 2.4 (0.6–12.6) 0.0 (0.1–10.9) 6.1 (1.9–19.7) 3.7 (1.9–7.2)
U5b 9.4 (3.4–24.3) 0.0 (0.1–9.3) 2.5 (0.6–12.9) 0.0 (0.1–8.4) 0.0 (0.1–10.9) 0.0 (0.1–10.3) 1.9 (0.8–4.7)
U6a1a 0.0 (0.1–10.6) 0.0 (0.1–9.3) 0.0 (0.1–7.0) 0.0 (0.1–8.4) 3.2 (0.8–16.2) 0.0 (0.1–10.3) 0.5 (0.1–2.6)
K 0.0 (0.1–10.6) 8.1 (2.9–21.4) 5.0 (1.5–16.5) 2.4 (0.6–12.6) 6.5 (2.0–20.8) 9.1 (3.3–23.7) 5.1 (2.9–9.0)
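The χ2 comparison of one region against the rest can be sketched for Hg U5 in Aliste. The counts below are not given in the source; they are back-calculated from the percentages and sample sizes in Table 2 (U5 and its sub-Hgs: 10 of 32 individuals in Aliste, 31 of 214 in the whole province, hence 21 of 182 elsewhere), which is an assumption of this example. The test is a plain Pearson χ2 on a 2 × 2 table without continuity correction; whether the original SPSS analysis used the same layout is not stated.

```python
from math import erfc, sqrt

def chi2_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value

# Counts reconstructed from Table 2 percentages (assumption, see lead-in).
aliste_u5, aliste_other = 10, 22      # Aliste, N = 32
rest_u5, rest_other = 21, 161         # remaining five regions, N = 182

chi2, p = chi2_2x2(aliste_u5, aliste_other, rest_u5, rest_other)
print(f"chi2 = {chi2:.2f}, P = {p:.4f}")
```

With these reconstructed counts the P value comes out between 0.003 and 0.004, the same order as the value reported in the text for U5 in Aliste.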

As regards the HVRI Hts, Sanabria presents the lowest percentage of distinct Hts (64.52%), followed by Bajo Duero (64.85%) and Sayago (66.67%) (Table 1). Moreover, the analysis of shared Hts shows that all the regions present low percentages of shared Hts with the remaining regions (Table 3). Nevertheless, the percentage of shared Hts is higher between adjacent regions, with the exception of those that have the Duero River as a border: Campos-Pan and Bajo Duero; Campos-Pan and Sayago; and Aliste and Sayago. Sanabria shows a low percentage of private lineages, that is, Hts not shared with other populations (diagonal values in Table 3), because it shares Hts, mainly the Hg founder motifs, with the rest of the regions.

Table 3. Percentage of Ht shared with other regions
Shared Ht (%)
AL BD BV CP SN SY
AL 54.55 12.50 17.24 13.79 23.81 09.01
BD 13.64 66.67 10.34 10.34 19.05 27.27
BV 22.73 12.50 58.62 17.24 33.33 18.18
CP 18.18 12.50 17.24 58.62 28.57 27.27
SN 22.73 16.67 24.14 20.79 33.33 22.72
SY 09.01 25.00 13.79 10.34 23.81 59.09
  • The percentages in each column were calculated using number of Ht in the regions that represent the column title; each percentage represents the proportion of Ht from the column population that are also present in the row population. Diagonal values represent the percentage of private lineages or non shared Ht between populations (marked in bold).
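The layout of Table 3 can be reproduced with a small routine that follows the rule in the footnote: off-diagonal cells give the share of the column region's Hts that also occur in the row region, and diagonal cells give the share of the column region's Hts found nowhere else. The region names and haplotype labels below are hypothetical; this is a sketch of the bookkeeping, not the authors' script.

```python
def shared_ht_table(ht_by_region):
    """Percentages of shared and private haplotypes, keyed by (row, column)."""
    regions = list(ht_by_region)
    table = {}
    for col in regions:
        col_hts = ht_by_region[col]
        # All haplotypes observed in any region other than the column region.
        elsewhere = set().union(*(ht_by_region[r] for r in regions if r != col))
        for row in regions:
            if row == col:
                hits = col_hts - elsewhere          # private haplotypes
            else:
                hits = col_hts & ht_by_region[row]  # haplotypes shared with the row region
            table[(row, col)] = 100 * len(hits) / len(col_hts)
    return table

# Hypothetical haplotype sets for three regions.
example = {
    "A": {"h1", "h2", "h3", "h4"},
    "B": {"h1", "h5"},
    "C": {"h2", "h5", "h6"},
}
for (row, col), pct in sorted(shared_ht_table(example).items()):
    print(row, col, f"{pct:.2f}")
```

Because each column is normalized by its own number of Hts, the table is not symmetric, which matches the pattern seen in Table 3.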

Diversity indicator values, based on the HVRI sequences, found in regions of the Zamora province and in other Iberian populations are summarized in Table 1. The results for the Zamora province show values of gene diversity and nucleotide diversity similar to the mean of mainland Spain and Portugal; however, the value of θK is lower. Among the analyzed regions, Sanabria and Bajo Duero showed small values of gene diversity and also of θK. The gene diversity in Sanabria is in fact the lowest of the Iberian populations used for comparison. Also, mismatch distribution analyses and Tajima neutrality tests (Table 4) point, respectively, to unimodal patterns of mismatch distributions and significant negative D values.

Table 4. Tajima's D neutrality test and mismatch distribution analysis for the populations studied in this work and the Iberian populations considered for comparison
Population Tajima's D SSD Raggedness index
Zamora −2.013 0.00029 0.00704
 Aliste −1.482 0.00341 0.01126
 Bajo Duero −1.735 0.00180 0.01256
 Benavente −2.031 0.00338 0.02634
 Campos-Pan −1.638 0.00062 0.00706
 Sanabria −2.126 0.00921 0.02926
 Sayago −0.683 0.00178 0.01006
Mainland Spain −2.298 0.00046 0.00952
North Portugal −2.106 0.00098 0.00594
Centre Portugal −2.186 0.00042 0.00714
South Portugal −2.018 0.00052 0.00650
  • SSD, sum of square deviations between the observed and the expected mismatch distribution.
  • * P < 0.05.
  • ** P < 0.001.

Population substructure

To investigate the genetic microdifferentiation in the Zamora province, an exhaustive analysis, using different methodologies, was implemented.

The Hg distribution of pairs of regions was compared, and the results of the exact test of population differentiation are presented in Supporting Information Table 2 (online supporting data). After the Bonferroni correction for multiple tests, Sayago showed significant differences compared to Aliste and Benavente, and no differences were found between the other regions. Moreover, Hg and HVRI-based AMOVA revealed that there is significant variation among the six regions (HVRI: FST = 0.01239, P = 0.0029; Hg: FST = 0.01376, P = 0.0059).
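The multiple-testing adjustment can be made concrete: six regions give 15 pairwise comparisons, so a Bonferroni correction divides the nominal significance level by 15. The nominal level of 0.05 is an assumption here, since the source does not state it, and the P values in the example are invented placeholders rather than the values in Supporting Information Table 2.

```python
from itertools import combinations

REGIONS = ["AL", "BD", "BV", "CP", "SN", "SY"]

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests

def significant_pairs(p_values, alpha=0.05):
    """Pairs whose P value stays below the corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [pair for pair, p in p_values.items() if p < threshold]

pairs = list(combinations(REGIONS, 2))
print(len(pairs), bonferroni_threshold(0.05, len(pairs)))

# Invented P values for illustration: every pair gets 0.2 except two.
p_values = {pair: 0.2 for pair in pairs}
p_values[("AL", "SY")] = 0.001
p_values[("BV", "SY")] = 0.002
print(significant_pairs(p_values))
```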

To better understand the Hg distribution at a microgeographic level, a correspondence analysis based on the Hg distribution was conducted (see Fig. 2). The European Hg U5 (Macaulay et al., 1999) and its sub-Hgs are well represented in all regions, with an extremely high frequency in Aliste (31.25%). Hg HV0 (or pre*V) (Torroni et al., 2001) appears, in the positive portion of dimension 1, associated with the southern regions of Bajo Duero (8.1%) and Sayago (15.2%). Apart from the high frequency of HV0, Bajo Duero and Sayago also present high incidences of sub-Saharan lineages that account for 18.2% of the individuals in Sayago (L1b and L3b) and 5.4% in Bajo Duero (L2b).

Figure 2. Correspondence analysis based on the haplogroup distribution in each of the six regions of Zamora.

It seems clear, from all the analyses, that Sayago represents a genetic outlier in the Zamora province. Nevertheless, if more attention is paid to the correspondence analysis (see Fig. 2), a geographic structure determined mainly by the Duero River could be inferred. However, in the AMOVA analysis of different population groups, using both HVRI and Hg data, no significant differences were detected between the defined areas, and significant differences were found among populations within groups.

The MDS representation of Slatkin's linearized FST pairwise genetic distance (Slatkin, 1995), based on HVRI sequences from the Zamora regions and the adjacent Iberian populations (see Fig. 3), reveals that the majority of the analyzed regions in Zamora are located around the center of the axes, showing great homogeneity. However, the Sayago and Aliste regions appear far from the principal population group, and there is also a great distance between them. As Sayago and Aliste (and also Sanabria) have a border with the Portuguese region of Bragança, great affinities with this Portuguese region could a priori be expected; however, only Aliste appears near Bragança. Barrier analysis (Manni et al., 2004) was performed to find out more about the geographic and/or political boundaries that could affect the genetic distances observed (see Fig. 4). The analysis of the Zamora regions and the adjacent sampled areas showed three principal barriers in the zone: the first one involves the Sayago region and, in accordance with previous results, represents the strongest barrier based on genetic and geographic distances; the second one separates Bragança from all of the Spanish areas (the areas of Zamora and Galicia), which shows the importance of the Spanish-Portuguese border as a barrier to gene flow between adjacent territories; and the third barrier appears around the Aliste region, isolating it from the rest of the regions.

Figure 3. Multidimensional scaling representation of the Slatkin genetic distances, based on HVRI sequences.

Figure 4. (a) Situation of the Zamora province and the adjacent areas in the North-West portion of the Iberian Peninsula. (b) Three main barriers detected in the Barrier analysis, based on the Monmonier algorithm and computed from Slatkin genetic distances on 100 bootstrap matrices obtained by randomly resampling the original data. The thickness of each edge of a barrier is proportional to the number of times it was included in the computed barriers (numbers on edges).

African influence

In the province of Zamora, if the total number of African lineages is taken into account (Hgs L1b, L2b, L3b, M1, and U6a1a), the contribution represents 5.7% of the total Hg composition. As regards North African lineages, only one mtDNA type belonging to U6a1a Hg was found in the province, more precisely in the Sanabria region. The Ht ZA117 (16172-16183C-16189-16219-16239-16278) is observed in Algeria and Italy and, inside the Iberian Peninsula, in the North-Western populations of León and Cantabria (Larruga et al., 2001; Maca-Meyer et al., 2003; Pereira et al., 2005; van Oven and Kayser, 2009). There is a low frequency of U6 Hg in Iberia, its presence being higher in the North: 5.35% in Northern Portugal, 2.17% in Galicia, and 8.1% in Maragatos (León) (Salas et al., 1998; Larruga et al., 2001; Pereira et al., 2005). Hg M1 is found in Andalusia and Central Portugal, although the Ht ZA6 (16093-16129-16148-16183C-16189-16249-16311), found in Campos-Pan, does not correspond to any of the previously detected Hts in Iberia.

As regards sub-Saharan Hgs (L1b, L2b, and L3b), the high frequency found in the southern regions of Zamora, 18.2% in Sayago and 8.1% in Bajo Duero, is comparable to that described for the South of Portugal, but it does not have any parallels with any other analyzed areas in the Northern part of Iberia (Pereira et al., 2005).

To try to shed some light on the origin of sub-Saharan Hts in the Iberian Northern plateau, a shared Ht analysis was conducted. Four individuals typed as L1b, representing two different Hts, were found in the province (three in Sayago and one in Sanabria). Ht ZA1 (16126-16187-16189-16223-16264-16270-16278-16293-16311) is shared mainly with Central-West Africa, North Africa, and Macaronesia. Ht ZA2 (16126-16187-16189-16223-16264-16270-16278-16293-16311-16362) is shared with Macaronesia and Iberian samples. The L2b Ht found in the province corresponds to two individuals from the Bajo Duero region; this Ht, ZA3 (16114A-16129-16213-16223-16278), shares its motif with samples from Central-West Africa, Macaronesia, the Iberian Peninsula, and Northern Africa. The two L3b Hts are present in four individuals (one in Bajo Duero and three in Sayago). Ht ZA4 (16223-16278-16362) is shared with Central-West Africa and the Iberian Peninsula. In contrast, Ht ZA5 (16209-16223-16278-16362) is not found in any of the populations considered and differs from Ht ZA4 by one mutation step. Thus, this Ht could have derived from ZA4 in the peninsula by a new mutation.
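The one-step relationship between haplotypes can be checked directly from the motifs quoted above. The sketch below treats each motif as a set of variant positions relative to the reference sequence and reports the positions present in only one of the two motifs. It is a simplification: each listed variant counts as one step, and suffixes such as the C in 16183C are kept as part of the label.

```python
def parse_motif(motif):
    """Split a motif such as '16223-16278-16362' into a set of variant labels."""
    return frozenset(motif.split("-"))

def mutation_steps(motif_a, motif_b):
    """Variants present in exactly one of the two motifs, sorted for display."""
    return sorted(parse_motif(motif_a) ^ parse_motif(motif_b))

# Motifs as quoted in the text.
ZA1 = "16126-16187-16189-16223-16264-16270-16278-16293-16311"
ZA2 = "16126-16187-16189-16223-16264-16270-16278-16293-16311-16362"
ZA4 = "16223-16278-16362"
ZA5 = "16209-16223-16278-16362"

print(mutation_steps(ZA4, ZA5))  # ['16209']
print(mutation_steps(ZA1, ZA2))  # ['16362']
```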

DISCUSSION

The mtDNA composition and diversity observed in the Zamora province show close affinity with other Iberian populations. However, some differences between regions are detected.

Concerning the mtDNA diversity observed, the Sanabria region presents the lowest values when compared with other Iberian populations. Because Sanabria has one of the lowest numbers of inhabitants (Table 1), it may be hypothesized that the low diversity observed in this region is related to the number of inhabitants. Taking into account that, in small populations, drift will be the primary evolutionary force acting, particularly reducing the number of rare alleles (Helgason et al., 2003), and that the number of inhabitants is correlated with θK (r = 0.702; P < 0.001), it seems that the differences in the diversity observed in the regions of Zamora are related to population size. Moreover, both Tajima's D and mismatch distribution analysis reveal quite similar demographic histories for the different regions of the province, and it appears that the reduction by ∼70% of Sanabria inhabitants in the last 50 years has not had a detectable impact on the genetic landscape of this region. However, this result must be interpreted with caution, because recent changes in population size may not be detectable in mismatch distribution analyses due to threshold effects, time lags, or earlier demographic events that may mask the effects of recent events (Rogers and Harpending, 1992).

The analyses of genetic microdifferentiation showed that there are differences between regions, at both Hg and HVRI levels, but no geographic substructure organization (determined by rivers) can be noticed, because the observed differences appeared at the regional level, mainly in Sayago and Aliste. Moreover, the diversity values obtained in these two regions and the observed intra-Hg diversity are not consistent with a profile of isolated populations that could explain the observed differentiation. Barrier analysis is particularly informative about the differentiation of these two regions (see Fig. 4). Sayago appears as the most differentiated population with respect to both the province and the Portuguese region of Bragança. However, this first barrier is thinnest along the Bajo Duero region, owing to the high presence of African lineages in both regions in contrast to the rest of the province. As regards Aliste, this region appears to present some affinity with the Portuguese region of Bragança: the barrier between these two populations is thinner than those separating the other regions, suggesting population affinity. However, according to barrier analysis, the political frontier along the entire Portuguese border with Zamora acts as a strong boundary to gene flow, isolating all the Spanish areas from the Portuguese area of Bragança. This point, however, must be further researched with appropriate sample representation, particularly for the Portuguese region of Bragança.

One of the distinctions between the mtDNA composition of the Iberian Peninsula and that of other European populations is the presence of North African and sub-Saharan lineages [for a review, see Arroyo-Pardo et al. (2007)]. In Zamora, both North African and sub-Saharan mtDNA lineages were found. It has been suggested that the U6 and M1 Hgs, detected at low frequencies in Zamora, were involved in the dispersal of Upper Palaeolithic Levantine people to North Africa along the south Mediterranean coastal areas (Olivieri et al., 2006). In this scenario, prehistoric links between North Africa and Iberia could explain the presence of these Hgs in the Northern part of Iberia. The identification of an M1 mtDNA African lineage in a Basque necropolis dating back to the 6th–7th centuries (Izagirre et al., 2005), together with Bronze Age cattle carrying an African mtDNA lineage (Anderung et al., 2005), supports this hypothesis. However, given the low diversity of these Hgs in Iberia, a more recent North African contribution, attributed to relatively unrestricted procreation between Christians and Muslim women (Pereira et al., 2005), is also plausible. Another explanation is the relocation of moriscos, a hypothesis recently proposed by Adams et al. (2008) based on Y chromosome data and supported by historical data available for the studied region (Martin, 2003).

As regards sub-Saharan lineages, it is well known that during the 16th–19th centuries, African slaves were captured along the West African coast and were frequently transported to Cape Verde (Macaronesia region). From the beginning of the slave trade, this archipelago served as a kind of platform that connected the African continent to Europe, America, and India, and from which slaves were transported to different regions, including the other Macaronesian archipelagos (Canary, Madeira, and Azores) and mainland Portugal (Comissão Nacional para as Comemorações dos Descobrimentos Portugueses, 1999). Thus, given the results of the shared Ht analysis, it seems that the slave trade during the 16th–19th centuries better explains the sub-Saharan African lineages found in the Iberian Peninsula (including those found in the Zamora province). However, in contrast to the well-documented presence of slaves in the Portuguese territories (Comissão Nacional para as Comemorações dos Descobrimentos Portugueses, 1999), the same evidence does not exist for mainland Spain. For the Zamora province, there is only one reference to the presence of slaves (Carbajo Martin, 1995). As the Hts found in the area are also shared with North African populations, we cannot rule out the possibility that these lineages derive from the North African Muslim presence in the Iberian Peninsula. A great number of Berber troops relocated their family groups to the gained territories (Salvatierra and Canto, 2008). Thus, this phenomenon could also explain the presence of sub-Saharan lineages.

Finally, it is important to emphasize the value of regional studies, which can reveal results that would be obscured in an overall view of a larger area. However, it is worth noting that this kind of strategy can be limited by sample size requirements.
