Volume 33, Issue 1 e17190
ORIGINAL ARTICLE
Open Access

Whole-genome SNP genotyping unveils ancestral and recent introgression in wild and domestic goats

Neža Pogorevc

Department of Animal Science, Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia

Population Genomics Group, Department of Veterinary Sciences, Faculty of Veterinary Medicine, Ludwig-Maximilians-University of Munich, Martinsried/Planegg, Germany

Arsen Dotsev

L.K. Ernst Federal Research Center for Animal Husbandry, Podolsk, Russia

Maulik Upadhyay

Population Genomics Group, Department of Veterinary Sciences, Faculty of Veterinary Medicine, Ludwig-Maximilians-University of Munich, Martinsried/Planegg, Germany

Edson Sandoval-Castellanos

Population Genomics Group, Department of Veterinary Sciences, Faculty of Veterinary Medicine, Ludwig-Maximilians-University of Munich, Martinsried/Planegg, Germany

Elisabeth Hannemann

Population Genomics Group, Department of Veterinary Sciences, Faculty of Veterinary Medicine, Ludwig-Maximilians-University of Munich, Martinsried/Planegg, Germany

Mojca Simčič

Department of Animal Science, Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia

Aglaia Antoniou

Institute of Marine Biology, Biotechnology and Aquaculture (IMBBC), Hellenic Centre for Marine Research (HCMR), Crete, Greece

Dimitris Papachristou

Laboratory of Animal Breeding and Husbandry, Department of Animal Science, School of Animal Biosciences, Agricultural University of Athens, Athens, Greece

Panagiota Koutsouli

Laboratory of Animal Breeding and Husbandry, Department of Animal Science, School of Animal Biosciences, Agricultural University of Athens, Athens, Greece

Siham Rahmatalla

Albrecht Daniel Thaer-Institute for Agricultural and Horticultural Sciences, Animal Breeding and Molecular Genetics, Humboldt-Universität zu Berlin, Berlin, Germany

Gudrun Brockmann

Albrecht Daniel Thaer-Institute for Agricultural and Horticultural Sciences, Animal Breeding and Molecular Genetics, Humboldt-Universität zu Berlin, Berlin, Germany

Johann Sölkner

Division of Livestock Sciences, Department of Sustainable Agricultural Systems, University of Natural Resources and Life Sciences, Vienna, Austria

Pamela Burger

Department of Interdisciplinary Life Sciences, Research Institute of Wildlife Ecology, Vetmeduni Vienna, Vienna, Austria

Petros Lymberakis

Natural History Museum of Crete, School of Sciences and Engineering, University of Crete, Irakleio, Greece

Nikos Poulakakis

Natural History Museum of Crete, School of Sciences and Engineering, University of Crete, Irakleio, Greece

Biology Department, School of Sciences and Engineering, University of Crete, Irakleio, Greece

Institute of Molecular Biology and Biotechnology (IMBB), Foundation for Research and Technology – Hellas (FORTH), Irakleio, Greece

Iosif Bizelis

Laboratory of Animal Breeding and Husbandry, Department of Animal Science, School of Animal Biosciences, Agricultural University of Athens, Athens, Greece

Natalia Zinovieva

L.K. Ernst Federal Research Center for Animal Husbandry, Podolsk, Russia

Simon Horvat

Department of Animal Science, Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia

Ivica Medugorac

Corresponding Author

Population Genomics Group, Department of Veterinary Sciences, Faculty of Veterinary Medicine, Ludwig-Maximilians-University of Munich, Martinsried/Planegg, Germany

Correspondence

Ivica Medugorac, Population Genomics Group, Department of Veterinary Sciences, Faculty of Veterinary Medicine, Ludwig-Maximilians-University of Munich, Lena-Christ-Straße 48, Martinsried/Planegg 82152, Germany.

Email: [email protected]

First published: 01 November 2023
Handling Editor: Josephine Pemberton

Abstract

After the domestication of goats around 10,000 years before the present (BP), humans transported goats far beyond the range of their wild ancestor, the bezoar goat. This brought domestic goats into contact with many wild goat species such as ibex and markhor, enabling introgression between domestic and wild goats. To investigate this, while shedding light on the taxonomic status of wild and domestic goats, we analysed genome-wide SNP data of 613 specimens from 14 taxonomic units, including Capra hircus, C. pyrenaica, C. ibex (from Switzerland, Austria, Germany and Slovenia), C. aegagrus aegagrus, C. a. cretica, C. h. dorcas, C. caucasica caucasica, C. c. severtzovi, C. c. cylindricornis, C. falconeri, C. sibirica sibirica, C. s. alaiana and C. nubiana, as well as Oreamnos americanus (mountain goat) as an outgroup. To trace gene flow between domestic and wild goats, we integrated genotype data of local goat breeds from the Alps as well as from countries such as Spain, Greece, Türkiye, Egypt, Sudan, Iran, Russia (Caucasus and Altai) and Pakistan. Our phylogenetic analyses displayed a clear separation between bezoar-type and ibex-type clades with wild goats from the Greek islands of Crete and Youra clustered within domestic goats, confirming their feral origin. Our analyses also revealed gene flow between the lineages of Caucasian tur and domestic goats that most likely occurred before or during early domestication. Within the clade of domestic goats, analyses inferred gene flow between African and Iberian goats. The detected events of introgression were consistent with previous reports and offered interesting insights into the historical relationships among domestic and wild goats.

1 INTRODUCTION

The first sedentary societies began domesticating goats (Capra hircus) in the Fertile Crescent around 10,500 years before present (BP) (Zeder & Hesse, 2000). Like other livestock species, domestic goats were transported far beyond the range of their wild ancestor, the bezoar goat, whose distribution is restricted to Southwest Asia. In some of the new environments to which domestic goats were introduced, they found other congeneric species, enabling the possibility of gene flow between domestic and wild goats. Gene flow between domestic and wild relatives has been reported in several livestock species (Cao et al., 2021; Chen et al., 2018; Cubric-Curik et al., 2021; Medugorac et al., 2017), and goats are not an exception. Domestic goats have been shown to carry signatures of interbreeding with both Alpine and Iberian ibexes (Alasaad et al., 2012; Cardoso et al., 2021; Giacometti et al., 2004; Grossen et al., 2014), and with markhor (Hammer et al., 2008; Li et al., 2022). Interbreeding among ancestral caprine lineages may also have occurred (Daly et al., 2022), but such reports are isolated, and the taxonomic status of some species is still under review. Therefore, it is essential to update the phylogenetic and taxonomic state of knowledge of the genus Capra.

The genus Capra consists of at least eight recognized species, which are spread in mountainous habitats of Africa, Asia and Europe. They are divided into three clades: true goats, markhors and ibexes (Groves & Grubb, 2011). True goats include domestic (C. hircus) and bezoar goats (C. aegagrus). Bezoar goats are discontinuously distributed in Southwest Asia in regions such as Pakistan, Iran, Turkmenistan and the Caucasus. They are listed as vulnerable and have gone extinct in several countries (Rahim, 2016). Bezoar goats have two subspecies, C. a. aegagrus and C. a. blythi. The status of other subspecies, such as those from the Greek islands of Crete and Youra, is still contentious (Masseti, 2009). Markhors (C. falconeri) are found in the mountainous regions of Afghanistan, India, Pakistan, Tajikistan, Turkmenistan and Uzbekistan, including the Sulaiman Mountains and the Himalayas. Markhor is listed as near threatened globally (Michel et al., 2014) and endangered in Pakistan (Ahmad & Nabi, 2022). The third clade of the genus Capra, the ibexes, has six recognized species: (1) the Walia ibex (C. walie), which is found only in the Simien Mountains of Ethiopia; (2) the Nubian ibex (C. nubiana), whose range stretches along both sides of the Red Sea; (3) the Siberian ibex (C. sibirica), which has four subspecies (C. s. sibirica, C. s. alaiana, C. s. hagenbecki and C. s. sakeen) that are found in mountain ranges across Central, East and South Asia (Castelló, 2016; Groves & Grubb, 2011); (4) the Caucasian tur (C. caucasica), which is endemic to the Greater Caucasus range and has several ecotypes (C. c. caucasica, C. c. cylindricornis and C. c. severtzovi) (Dotsev et al., 2021); (5) the Iberian ibex (C. pyrenaica), which is endemic to the Iberian Peninsula; and (6) the Alpine ibex (C. ibex), which is endemic to the Alps and has undergone a severe population decline to about 100 individuals that were kept in a reserve and later reintroduced to the Alps (Parrini et al., 2009). 
The Iberian ibex experienced similar bottlenecks, up to the extinction of two of the four reported subspecies (C. p. lusitanica and C. p. pyrenaica were extirpated, C. p. hispanica and C. p. victoriae are still present) (Acevedo & Real, 2011).

The systematics and phylogeny of Capra were initially approached by morphological analyses and later by the analysis of mitochondrial DNA (mtDNA) and Y-chromosome markers (Luikart et al., 2001; Manceau et al., 1999; Pidancier et al., 2006; VarGoats Consortium et al., 2022). Unsurprisingly, the maternal and paternal phylogenetic reconstructions disagreed, presumably due to different mutation rates and different pervasiveness of lineage sorting associated with different male and female effective population sizes, different reproductive histories and sex-biased introgression. For the same reasons, the evolutionary histories of the mtDNA and the Y-chromosome are expected to show some differences from those inferred from autosomal variation, but studies analysing whole-genome data across most species and subspecies of the genus Capra are not available yet.

In this study, we compiled the most comprehensive SNP dataset of wild and domestic goats to date (Figure 1). Subsequently, we studied their phylogenetic relationships and investigated the patterns of gene flow among species, including wild and domestic goats that are sympatric, as well as among domestic goat breeds. Finally, we discussed the causes and consequences of the inferred events of gene flow and the time when they most likely occurred.

FIGURE 1. Sample origin. The triangles represent wild species (names shown) and the circles represent domestic goats.

2 MATERIALS AND METHODS

For this study, blood and tissue samples from 173 goat specimens and 4 mountain goats (Oreamnos americanus) were collected. The goat specimens included 79 Alpine ibexes, 5 Iberian ibexes, 4 Youra wild goats, 6 Cretan wild goats, 1 Nubian ibex, 2 bezoar wild goats, 14 markhors, 4 Mid-Asian ibexes, 7 Siberian ibexes and 51 domestic goats from different Greek populations (Table S1).

Genomic DNA was extracted using the Omega E.Z.N.A.® Tissue DNA Kit or ReliaPrep™ Blood gDNA Miniprep System (Promega) according to the manufacturer's instructions. Then, we genotyped the collected samples, including mountain goats, using the Illumina Goat SNP50 BeadChip (Tosser-Klopp et al., 2014) and complemented those genotypes with published data from animals that were genotyped with the same SNP chip (Table 1). These published datasets included 15 Caucasian turs, 7 Nubian ibexes (Sudan), 4 Cretan wild goats and 389 domestic goats of different breeds from Russia, Iran, Egypt, Türkiye, Spain, Sudan, Pakistan, Switzerland, Austria and Slovenia (Burren et al., 2016; Colli et al., 2018; Deniskova et al., 2021; Dotsev et al., 2021; Hassan et al., 2018; Pogorevc et al., 2021; Rahmatalla et al., 2017). Domestic populations were included according to their overlapping or spatial proximity to the ranges of wild goat species. For instance, we used Spanish breeds that were broadly sympatric with Iberian ibex, and Alpine goat breeds that overlapped with Alpine ibex. Furthermore, we downloaded the whole genome sequences of five Iberian ibexes (NCBI accession numbers SAMEA6675493, SAMN10736154, SAMN10736151, SAMN10736152 and SAMN10736153), four Nubian ibexes (NCBI acc. numbers SAMEA6675494, SAMN16674508, SAMN10736155 and SAMN10736156), one markhor (NCBI acc. number SAMEA6675502) and four Mid-Asian ibexes (NCBI acc. numbers SAMN06233877, SAMN10736159, SAMN10736158 and SAMEA6675501) (Chebii et al., 2020; Chen et al., 2019; Denoyelle et al., 2021; Grossen et al., 2020) and extracted the exact set of SNP markers present on the Illumina Goat SNP50 BeadChip after aligning the samples against the latest goat genome assembly, ARS1 (Bickhart et al., 2017). In domestic populations, closely related individuals and outliers that could have been imports from other breeds were excluded (following Pogorevc et al., 2021). 
Our full dataset contained whole genome SNP data of 613 individuals from 10 species, 8 subspecies and 26 breeds according to the nomenclature of the “Bovids of the world” (Table 1) (Castelló, 2016).

TABLE 1. Sample collection.
Short name | Species/breed | Origin | Scientific name | Nr | Sample sourceᶜ | Genotype source
Mountain goat | Mountain goat | North America | Oreamnos americanus | 4 | Z | This study
Iberian ibex | Iberian ibex | Spain, France | Capra pyrenaica | 10 | W | Grossen et al. (2020), Denoyelle et al. (2021)
Alpine ibex | Alpine ibex | Austria, Germany, Slovenia, Switzerland | Capra ibex | 79 | W, Z | This study
Youra wild | Youra wild goat | Greece | Capra hircus dorcasᵃ | 4 | W | This study
Cretan wild | Cretan wild goat | Greece | Capra aegagrus creticaᵇ | 10 | W, Z | This study, Hassan et al. (2018)ᵈ
Nubian ibex | Nubian ibex | Egypt, Israel, Saudi Arabia, Sudan | Capra nubiana | 12 | W, Z | This study, Grossen et al. (2020), Hassan et al. (2018), Denoyelle et al. (2021), Chebii et al. (2020)
E-Caucasian tur | East Caucasian tur | Russia | Capra caucasica cylindricornis | 5 | W | Dotsev et al. (2021)
M-Caucasian tur | Mid Caucasian tur | Russia | Capra caucasica severtzovi | 5 | W | Dotsev et al. (2021)
W-Caucasian tur | West Caucasian tur | Russia | Capra caucasica caucasica | 5 | W | Dotsev et al. (2021)
Bezoar | Bezoar wild goat | Iran | Capra aegagrus aegagrus | 9 | W | This study, Colli et al. (2018)
Markhor | Markhor | Pakistan, Tajikistan, Uzbekistan | Capra falconeri | 15 | W | This study, Denoyelle et al. (2021)
M-Asian ibex | Mid-Asian ibex | Kyrgyzstan, Tajikistan | Capra sibirica alaiana | 8 | W | This study, Grossen et al. (2020), NCBI BioProject PRJNA438286, Denoyelle et al. (2021)
Siberian ibex | Siberian ibex | Kazakhstan, Russia | Capra sibirica sibirica | 7 | W | This study
Blanca | Blanca de Rasquera | Spain | Capra hircus | 10 | F | Colli et al. (2018)
Bermeya | Bermeya | Spain | Capra hircus | 24 | F | Colli et al. (2018)
Malaguena | Malagueña | Spain | Capra hircus | 25 | F | Colli et al. (2018)
Peacock | Peacock | Switzerland | Capra hircus | 11 | F | Burren et al. (2016)
Chamois-C | Chamois Coloured | Austria | Capra hircus | 10 | F | Pogorevc et al. (2021)
Drežnica | Drežnica | Slovenia | Capra hircus | 25 | F | Pogorevc et al. (2021)
Skyros | Skyros | Greece | Capra hircus | 9 | F | This study
Chios | Chios | Greece | Capra hircus | 10 | F | This study
Lesbos | Lesbos | Greece | Capra hircus | 12 | F | This study
Peloponnese | Peloponnese | Greece | Capra hircus | 12 | F | This study
Crete | Crete | Greece | Capra hircus | 8 | F | This study
Ankara | Ankara | Türkiye | Capra hircus | 20 | F | Colli et al. (2018)
Kil | Kil | Türkiye | Capra hircus | 25 | F | Colli et al. (2018)
Kilis | Kilis | Türkiye | Capra hircus | 25 | F | Colli et al. (2018)
Barki | Barki | Egypt | Capra hircus | 25 | F | Colli et al. (2018)
Oasis | Oasis | Egypt | Capra hircus | 25 | F | Colli et al. (2018)
Saidi | Saidi | Egypt | Capra hircus | 25 | F | Colli et al. (2018)
Tagg | Taggar | Sudan | Capra hircus | 20 | F | Hassan et al. (2018)
Nubian | Nubian | Sudan | Capra hircus | 21 | F | Rahmatalla et al. (2017)
Karachaev | Karachaev | Russia | Capra hircus | 19 | F | Deniskova et al. (2021)
Dagestan-D | Dagestan downy | Russia | Capra hircus | 17 | F | Deniskova et al. (2021)
Dagestan-A | Dagestan aboriginal | Russia | Capra hircus | 12 | F | Deniskova et al. (2021)
Iranian | Iranian | Iran | Capra hircus | 9 | F | Dotsev et al. (2021)
Soviet-M | Soviet Mohair | Russia | Capra hircus | 17 | F | Deniskova et al. (2021)
Koh-e-sulmani | Koh-e-sulmani | Pakistan | Capra hircus | 12 | F | Colli et al. (2018)
Dera-DP | Dera Din Panah | Pakistan | Capra hircus | 12 | F | Colli et al. (2018)
Total | 39 | | 14 | 613 | |
  • Note: Columns show a short name matching figures in the manuscript and Appendix S1, breed or species name, scientific name, country of origin, sample size and sources.
  • a Alternative scientific name is Capra aegagrus dorcas Reichenow, 1888 (Masseti, 2009).
  • b Alternative scientific name is Capra hircus cretica Schinz, 1838 (Masseti, 2009).
  • c The origin of the samples: W = wild, Z = ZOO, F = farm.
  • d Samples were published in Hassan et al. (2018) as bezoar; we later discovered that they are Cretan wild goats.

Autosomal SNPs with a call rate greater than 95%, 90% non-missing genotypes per marker and 90% non-missing genotypes per individual were kept. After filtering, 46,170 SNPs were used for further analyses.
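The per-marker and per-individual missingness filters described above can be sketched as follows. This is a minimal illustration with NumPy, not the pipeline used in the study; the function name `filter_genotypes`, the coding of genotypes as 0/1/2 with -1 for a missing call, and the order of the two steps (markers first, then individuals) are all assumptions made for the example.

```python
import numpy as np

def filter_genotypes(geno, min_marker_rate=0.90, min_sample_rate=0.90):
    """Keep markers, then individuals, whose share of non-missing calls
    reaches the given thresholds.

    geno: individuals x markers array coded 0/1/2, with -1 for a missing
    genotype (an assumed encoding, chosen only for this sketch).
    """
    geno = np.asarray(geno)
    called = geno >= 0
    # Fraction of individuals with a call at each marker.
    keep_markers = called.mean(axis=0) >= min_marker_rate
    # Per-individual rates are recomputed on the retained markers only.
    keep_samples = called[:, keep_markers].mean(axis=1) >= min_sample_rate
    filtered = geno[np.ix_(keep_samples, keep_markers)]
    return filtered, keep_samples, keep_markers
```

The two boolean masks are returned alongside the filtered matrix so that sample and marker labels can be subset in the same way.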

We estimated the genomic diversity within individuals by means of the individuals' multilocus heterozygosities (MLH) with the R package inbreedR (Stoffel et al., 2016). For comparison purposes, we calculated the mean MLH values and confidence intervals for each taxon and plotted them with the R package ggplot2 (Gómez-Rubio, 2017).
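For readers unfamiliar with the measure, MLH is commonly defined as the proportion of typed loci at which an individual is heterozygous. The sketch below illustrates that definition in Python; it is not the R code used in the study, and the function name and the 0/1/2 genotype coding with -1 for missing calls are assumptions.

```python
import numpy as np

def multilocus_heterozygosity(geno):
    """Proportion of heterozygous loci among the typed loci of each individual.

    geno: individuals x markers array coded 0/1/2 (1 = heterozygote),
    with -1 for a missing genotype (assumed encoding).
    """
    geno = np.asarray(geno)
    n_typed = (geno >= 0).sum(axis=1)
    n_het = (geno == 1).sum(axis=1)
    # Individuals without any typed locus get NaN instead of a division error.
    result = np.full(geno.shape[0], np.nan)
    return np.divide(n_het, n_typed, out=result, where=n_typed > 0)
```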

After estimating genetic variation, we evaluated the genomic similarity of our samples by means of the Reynolds's distances (Reynolds et al., 1983) computed with the R package adegenet (Jombart & Ahmed, 2011). The distances were used to construct a neighbour-net in the program SplitsTree v4.18.1 (Huson & Bryant, 2006).
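As an illustration of the distance measure, the sketch below computes one commonly used form of the Reynolds distance between two populations from biallelic allele frequencies: the square root of the summed squared frequency differences divided by twice the summed (1 minus shared allele frequency products). The exact formula used by adegenet should be checked against its documentation; the function name and the input format (one reference-allele frequency per SNP and population) are assumptions.

```python
import numpy as np

def reynolds_distance(p_a, p_b):
    """Reynolds-type genetic distance between two populations.

    p_a, p_b: frequencies of one allele at each biallelic SNP in
    populations a and b (the other allele has frequency 1 - p).
    """
    p_a = np.asarray(p_a, dtype=float)
    p_b = np.asarray(p_b, dtype=float)
    # Squared differences summed over both alleles of each SNP.
    numerator = np.sum(2.0 * (p_a - p_b) ** 2)
    # Sum over alleles of the products of frequencies in a and b.
    shared = p_a * p_b + (1.0 - p_a) * (1.0 - p_b)
    denominator = 2.0 * np.sum(1.0 - shared)
    return np.sqrt(numerator / denominator)
```

In this form the distance is 0 for identical frequencies and 1 when the two populations are fixed for alternative alleles at every SNP; it is undefined when both populations are fixed for the same allele at all SNPs.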

To obtain further insight into the genetic relationships of our samples, we applied a principal component analysis (PCA) based on IBS (identical by state) alleles in the program PLINK v1.9 (Purcell et al., 2007) and plotted the resulting PCs in the R platform (R Core Team, 2018). We also analysed coancestry relationships among our samples using the program ADMIXTURE v1.3 (Alexander et al., 2009) after finding the optimal number of required groups (K), for which we applied a 20-fold cross-validation procedure (--cv=20). For this analysis, we pruned the set of SNPs by excluding 38,017 redundant SNPs that exhibited pairwise linkage disequilibrium scores (r²) above 0.1 in a window of 50 SNPs (--indep-pairwise 50 10 0.1). The pruning was carried out in PLINK v1.9 (Purcell et al., 2007) and plots were made in Pong (Behr et al., 2016).
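The idea behind window-based linkage disequilibrium pruning can be sketched in a few lines of Python. This is a simplified greedy version written for illustration only: it always drops the later SNP of a correlated pair, which is not necessarily the rule PLINK applies, and it assumes complete genotypes coded 0/1/2. The function name and parameters are hypothetical.

```python
import numpy as np

def ld_prune(geno, window=50, step=10, r2_max=0.1):
    """Greedy LD pruning in sliding windows (simplified sketch).

    geno: individuals x markers array coded 0/1/2 without missing values.
    Returns a boolean mask of the markers that are retained.
    """
    geno = np.asarray(geno, dtype=float)
    n_markers = geno.shape[1]
    keep = np.ones(n_markers, dtype=bool)
    for start in range(0, n_markers, step):
        stop = min(start + window, n_markers)
        idx = [j for j in range(start, stop) if keep[j]]
        for pos, a in enumerate(idx):
            if not keep[a]:
                continue
            for b in idx[pos + 1:]:
                if not keep[b]:
                    continue
                # Squared correlation of genotype dosages; monomorphic
                # markers give NaN, which never exceeds the threshold.
                r = np.corrcoef(geno[:, a], geno[:, b])[0, 1]
                if r * r > r2_max:
                    keep[b] = False
    return keep
```

With a threshold of 0.1, only markers that are nearly uncorrelated within each window survive, which is why such a filter can remove a large share of an array's SNPs.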

To understand the evolutionary relationships among wild and domestic goats, we first inferred a maximum likelihood tree with the program TreeMix v1.13 (Pickrell & Pritchard, 2012). The outgroup was the mountain goat (Oreamnos americanus), which is endemic to North America and is therefore very unlikely to have experienced introgression with Capra species. Node support was estimated with 100 bootstrap replicates. We used 35 taxonomic units, considering the genetic affinity of wild populations observed with ADMIXTURE, to improve the statistical robustness of this analysis and the following ones. The 35 taxonomic units resulted from merging the two subspecies of Siberian ibex (C. sibirica sibirica and C. s. alaiana) and the three subspecies of Caucasian tur (C. caucasica caucasica, C. c. cylindricornis and C. c. severtzovi).

To further refine our reconstruction of the evolutionary history of wild and domestic goats, we inferred gene flow events among the 35 taxonomic units used in the phylogenetic reconstruction. We used the maximum likelihood method implemented in TreeMix v1.13 (Pickrell & Pritchard, 2012), which requires setting the number of gene flow events a priori. To resolve this, we performed an initial analysis with m = 1 event and then performed replicates by adding one event at a time, until the variance explained by relatedness between taxa reached ~99.8% (a threshold suggested by a simulation study in the original paper; Pickrell & Pritchard, 2012). To assess consistency, we replicated the analysis 10 times with different random seeds. The optimal number of migration edges was obtained and visualized with the R package OptM (Fitak, 2021).
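The stopping rule described above can be illustrated with a short sketch. It assumes that the fraction of variance explained is computed as one minus the ratio of the variance of the residual covariances to the variance of the observed covariances, taken over all pairs of distinct populations; this mirrors how the quantity is usually derived from TreeMix output, but the function names and the input format (plain square matrices rather than TreeMix files) are hypothetical.

```python
import numpy as np

def variance_explained(observed_cov, model_cov):
    """Fraction of variance in population relatedness explained by the model.

    Both inputs are square covariance matrices with populations in the
    same order; only pairs of distinct populations are used.
    """
    observed = np.asarray(observed_cov, dtype=float)
    model = np.asarray(model_cov, dtype=float)
    upper = np.triu_indices_from(observed, k=1)
    residuals = observed[upper] - model[upper]
    return 1.0 - np.var(residuals, ddof=1) / np.var(observed[upper], ddof=1)

def first_m_reaching(fractions, threshold=0.998):
    """Smallest number of migration edges whose fit reaches the threshold.

    fractions: dict mapping m (number of edges) to variance explained.
    Returns None if no tested m reaches the threshold.
    """
    for m in sorted(fractions):
        if fractions[m] >= threshold:
            return m
    return None
```

In practice the fractions would come from repeated runs with different seeds, so the rule is best applied to a summary across replicates rather than to a single run.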

To confirm or disregard the gene flow events inferred with TreeMix, we calculated D-statistics (also termed ABBA-BABA statistics). The D-statistics employ a scheme with four taxonomic units: two sister taxa (P1 and P2), of which P2 is a candidate for having had gene flow with an external group (P3), and an outgroup (O) that is used to identify ancestral alleles. In general, D can take values from −1.0 to 1.0; with P1 and P2 ordered so that D is non-negative, the values range from 0.0 to 1.0, where 0.0 means no gene flow, and their significance is obtained by a block jackknife procedure that yields Z-scores. We used the mountain goat (Oreamnos americanus) as the outgroup and tested different combinations of taxa in P1, P2 and P3. We also computed f-branch statistics (based on the better-known f4-statistics), which predict the excess of alleles that are shared between a target taxon and a specific branch in a given species-tree topology. To make better sense of the patterns obtained with D-statistics, we classified the analyses according to their compliance with D-statistics assumptions, specifically the assumption of lack of gene flow between P1 and P2, the lack of gene flow between P1 and P3 and the symmetrical probability of incomplete lineage sorting in P1 and P2. We plotted the results obtained with the D-statistics, f4-statistics and f-branch statistics with the scripts provided in the program Dsuite (Malinsky et al., 2021). To complement the analysis of D- and f4-statistics, we tested the admixture status of domestic populations with f3-statistics using the program AdmixTools v7.0.2 (Patterson et al., 2012).
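The logic of the D-statistic and its block jackknife can be sketched as follows. This is a minimal frequency-based illustration, not the Dsuite implementation: it assumes derived-allele frequencies per population, weights each site by the ancestral-allele frequency in the outgroup, and uses contiguous blocks with equal numbers of SNPs. The function name and the default number of blocks are hypothetical.

```python
import numpy as np

def d_statistic(p1, p2, p3, p_out, n_blocks=20):
    """ABBA-BABA D-statistic with a leave-one-block-out jackknife Z-score.

    p1, p2, p3, p_out: derived-allele frequencies per SNP in P1, P2, P3
    and the outgroup (all zeros if the outgroup is fixed ancestral).
    Positive D indicates excess allele sharing between P2 and P3.
    """
    p1, p2, p3, p_out = (np.asarray(x, dtype=float) for x in (p1, p2, p3, p_out))
    abba = (1.0 - p1) * p2 * p3 * (1.0 - p_out)
    baba = p1 * (1.0 - p2) * p3 * (1.0 - p_out)
    d = (abba.sum() - baba.sum()) / (abba.sum() + baba.sum())

    # Recompute D with each block of consecutive SNPs left out in turn.
    n_sites = len(p1)
    leave_one_out = []
    for block in np.array_split(np.arange(n_sites), n_blocks):
        mask = np.ones(n_sites, dtype=bool)
        mask[block] = False
        a, b = abba[mask].sum(), baba[mask].sum()
        leave_one_out.append((a - b) / (a + b))
    leave_one_out = np.array(leave_one_out)

    # Jackknife standard error for (approximately) equal-sized blocks.
    spread = np.sum((leave_one_out - leave_one_out.mean()) ** 2)
    se = np.sqrt((n_blocks - 1) / n_blocks * spread)
    z = d / se if se > 0 else float("nan")
    return d, z
```

Swapping P1 and P2 flips the sign of both D and the Z-score, so the ordering of the sister taxa determines whether reported values are positive.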

Finally, we used two approaches to assess the effect of ascertainment bias on the phylogenetic inference using TreeMix and the inference of gene flow with TreeMix and D-statistics. The first approach compared results from two panels of SNPs extracted from 58 publicly available whole-genome sequences of wild and domestic goats: (i) SNPs from the Goat 50 K array and (ii) the entire set of SNPs. In the second approach, we used simulations to compare the tree topology and the ability of the TreeMix algorithm to retrieve the correct migration edges in SNP datasets that were created with and without ascertainment. The simulations produced full sequencing data under a demographic model that incorporated gene flow. They were carried out in the software ARGON v0.1 (Palamara, 2016) (refer to Data S1 for more details).

3 RESULTS

The genetic diversity of domestic goats, measured by the MLH (average = 0.388, range = 0.221–0.439), contrasted with the values of ibex and tur species (average = 0.012, range = 0.002–0.040) (Figure S1). The remaining groups, namely markhor, bezoar, as well as Youra and Cretan wild goats had intermediate values (average = 0.168, range = 0.14–0.33). This apparent separation between true goats and ibexes was congruent with the neighbour-net constructed with Reynolds's distances. In it, ibexes and true goats appeared in a single cluster each, without any taxa cross-clustered, and markhor, Youra and Cretan wild goats appeared at intermediate positions (Figure S2A).

The separation of ibex species and subspecies was also evident in the PCA (Figure S2B). In the parametric space of PC1 and PC2 (58.09% and 9.83% of the variation, respectively), each species constituted a compact cluster that was well separated from the others, with the sole exception of the domestic goats (C. hircus), and the Crete and Youra wild goats, whose clusters overlapped. However, the plot of PC1 and PC3 (5.66% of the variation) partially separated these groups (Figure S2C).

In the ADMIXTURE charts (Figure 2a), all tur subspecies (C. caucasica cylindricornis, C. c. severtzovi and C. c. caucasica), the two Siberian subspecies (C. sibirica sibirica and C. s. alaiana), the Iberian ibex (C. pyrenaica) and most individuals of the Alpine ibex (C. ibex) showed little to no traces of mixed ancestry. However, some Alpine ibexes shared small amounts of ancestry with the Iberian ibex and Caucasian tur. Outside the ibex clade, the bezoar showed a highly admixed ancestry. A direct interpretation would be that the bezoar had introgression with a number of wild and domestic goats, as has been reported (Alberto et al., 2018), but a more parsimonious explanation is that this is a byproduct of the bezoar being the ancestor of all domestic goats, whose high intra-group homogeneity and inter-group differentiation, driven by intensive artificial selection and breeding practices, highly influenced the definition of ancestry groups by ADMIXTURE. Upon close examination, the six noticeable ancestry groups present in bezoar are prevalent in breeds from Greece, Türkiye, Pakistan, Russia, Egypt, Sudan and in the markhor. The aforementioned populations are sympatric (or nearly sympatric) with bezoar, suggesting a high retention of ancestral genetic variation inherited from bezoar, with the obvious exception of the markhor. This suggests a possible occurrence of introgression, or interbreeding, between bezoar and these populations. Furthermore, our dataset revealed that the three Spanish breeds (Blanca de Rasquera, Bermeya and Malagueña) exhibited genetic admixture among themselves and with goats from Egypt. The Greek populations from Skyros, Chios, Lesbos, Peloponnese and Crete presented a similar set of ancestry groups that can also be found in the Turkish breeds of Kil, Kilis and Ankara, the most genetically diverse breeds among domestic goats.

FIGURE 2. Pattern of genome-wide ancestral variation among wild and domestic goats. (a) Colour chart of shared ancestral genomic variation inferred by the ADMIXTURE software with an optimum number of inferred groups set to 20 (K = 20); and (b) maximum-likelihood phylogenetic tree of species and domestic breeds inferred with TreeMix. Nodes indicated with red circles were supported by bootstrap values of at least 93%. In the lower left corner, we also plotted the matrix of residuals from the fit of the model with no migrations. The colour bar next to it represents the degree of relatedness between populations, and residuals above zero suggest candidates for admixture events.

Our maximum likelihood phylogenetic tree provided deeper insight into the dichotomous pattern suggested by other analyses (Figure 2b). The topology showed a monophyletic clade containing the ibex and tur species, well separated from the lineage leading to the monophyletic group of domestic goats, in which markhor and bezoar appeared as stem groups. In the ibex-tur clade, the Nubian ibex (C. nubiana), the Siberian ibex (C. sibirica) and the Caucasian tur (C. caucasica) appeared paraphyletic to the Alpine and Iberian ibex (C. ibex and C. pyrenaica). In the clade of domestic goats, a geographical East–West split pattern emerged: breeds from Pakistan, Iran, Russia, Türkiye, Egypt and Sudan appeared in one clade, and breeds from Greece, Spain, Slovenia, Austria and Switzerland appeared in the sister clade. Interestingly, the Youra and Cretan wild goats appeared next to Greek domestic goat populations and paraphyletic to all other European domestic goats.

The optimal number of gene flow events to estimate with TreeMix, according to the likelihood change in 10 analysis replicates (Fitak, 2021), was four (Figure S3). The first event, which was present in 8 out of 10 replicates, involved gene flow between the ancestor of the Pakistani breeds and bezoar. The second and third events, present in all 10 replicates, involved gene flow between African and Spanish breeds in both directions. The fourth event, present in seven replicates, was an inter-generic cross between mountain goats and the ancestral node of most domestic goats in the Western European clade, along with wild goats from Youra and Crete.

The analysis by D-statistics (Figure 3; Figures S4 and S5, Tables S2 and S3) and f4-ratios (Figure 4) detected three sets of comparisons that consistently yielded significant Z-scores (at α = .01). One involved Caucasian tur and taxa of the bezoar-type clade (domestic goats, bezoar and markhor). The second involved taxa from the bezoar-type clade and Nubian ibexes. The third one involved European ibexes (Alpine and Iberian ibex) and taxa of the bezoar-type clade.

FIGURE 3. D-statistic values for gene flow between Caucasian tur and taxa of the bezoar-type clade (domestic, bezoar and markhor). D-statistic design groups (P1, P2 and P3) are shown for testing gene flow between P2 and P3 (Z-scores are shown in the chart). The background colour indicates the perceived degree of non-compliance with the assumptions: green represents a low probability of non-compliance, yellow a medium degree and red a maximum probability of non-compliance.

FIGURE 4. Heatmap of f-branch statistic (fb) values. The colour scale at right indicates values of allele sharing between the tree branch on the vertical axis and the population on the horizontal axis. Cells in grey indicate comparisons that cannot be made.

In the comparisons testing introgression between Caucasian tur and taxa from the bezoar-type clade (which includes markhor), the values of the D-statistic were similar regardless of the choice of taxa from the bezoar-type group. In contrast, the values of the D-statistic for this type of comparison (Caucasian tur and a taxon of true goats) changed significantly when different taxa were used as P1 (recall that the candidates for introgression are P2 and P3, and P1 is the sister of P2). Arranging the comparisons according to the choice of P1 revealed a stepwise pattern with three levels (Figure 3). The different levels appear to correspond to the degree of conformity with the previously mentioned assumptions of the ABBA/BABA design, namely: (A.1) the lack of gene flow between P1 and P2, (A.2) the lack of gene flow between P1 and P3, and (A.3) a symmetric probability of incomplete lineage sorting in P1 and P2. The highest D-statistic values were obtained when P1 = Siberian ibex, a choice that provides a high level of adherence to (A.1), because this taxon is not sympatric with Caucasian tur and most taxa of the bezoar-type clade (Figure 3; Figure S5, Tables S2 and S3). In contrast, the comparisons where both P1 and P2 were from the bezoar-type clade had the lowest values of the D-statistic, several of them not significantly different from zero. These comparisons may have a low conformity to (A.1) because in many cases the taxa belong to the same species. In addition, the consistency with (A.3) may be low if the comparison pairs a species such as markhor, which is threatened and thus has a high rate of lineage loss, with a domestic breed that has a much larger population size and experienced a post-domestication expansion. 
This pattern suggests that introgression between the Caucasian tur and the bezoar occurred before or during early domestication, and that a detectable amount of shared derived variation was segregated from the bezoar into all domestic populations, as reported by Daly et al. (2022). The alternative explanation would be that all domestics had remarkably similar levels of gene flow with the Caucasian tur, which is endemic to the Caucasus region, but this seems highly unlikely.

Other significant values of the D-statistic (ranging from 0.22 to 0.28) were found in comparisons between both European ibexes (Alpine and Iberian) and taxa of the bezoar-type clade (Figure S5). As above, the values depended on the taxa used as P1, and once again, they showed a pattern better explained by ancestral introgression into bezoar rather than into all domestics in parallel. Similarly, the gene flow may also have involved the ancestor of the European ibexes rather than each species. The last group that had consistent and significant D-statistic values (0.23 to 0.26) included comparisons between Nubian ibex and any taxon of the bezoar-type clade (Figure S5). Once again, the pattern may indicate ancient introgression between Nubian ibexes and bezoar rather than with all domestic goats in parallel.

When we examined the possible effects of ascertainment bias in TreeMix-based inference, we found consistent estimation of topology in both the Goat 50 K array and the full set of SNPs extracted from the whole genome sequences (for the subsample for which we had the sequencing data) (see Data S1). The D-statistics also displayed similar values in the comparisons to which they were applied, with the major difference being in the significance level. In the analysis based on simulations, the ascertained and unascertained SNP datasets had low statistical power for detecting migration event 1 (the oldest one). The two datasets also showed remarkably similar power for detecting the intermediate-aged events 2 and 3. In both datasets, the very recent event 4 (of the simulated scenario) could not be detected. Altogether, these results suggest little effect of ascertainment bias on both phylogenetic and gene flow inference with TreeMix and D-statistics.

In the analysis of f-branch statistics (Figure 4), we found similarly small amounts of shared alleles between an ancestral node of ibexes and all true goats with the markhor. This modest amount of shared variation also appeared between bezoar and markhor and all ibex species. Among domestic breeds, we also detected shared variation involving breeds from Greece, Southwest Asia (from Anatolia to Pakistan), Russia and Africa. This group (Southwest Asia, Russia and Africa), along with the Spanish breeds and their ancestors, also shared significant amounts of variation. A signature of gene flow in this type of comparison, specifically between African and Spanish breeds, was also recognized by TreeMix (Figure S3). Such patterns may be the result not only of recent gene flow, possibly associated with human migrations or trade, but also of an unusually high amount of ancestral segregated variation, or even of an admixed origin.

The three-population statistic (f3) for detecting admixed populations (which was applied only to domestic goat breeds) yielded significant results for populations that already showed high amounts of shared variation or traces of gene flow with populations that, in this analysis, appear as parental populations of the admixed candidate (Table S4). Notably, the Malagueña breed from Spain appears to be a hybrid of an African breed and other Spanish breeds, while the Chios and Peloponnese populations (from Greece) appear as admixtures of Pakistani and Western European breeds. Other significant results may not be the exact result of direct admixture, but of complex ancestry. In this category, we find the Kil and Kilis breeds from Türkiye, the Iranian breed, the Barki from Egypt, the Nubian from Sudan and the Dagestan Aboriginal from Russia.
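
As an illustration of how the three-population test flags an admixed candidate, the sketch below computes f3(C; A, B) as the mean over sites of (c − a)(c − b), where c, a and b are allele frequencies in the candidate C and the two putative parental populations A and B. A clearly negative value is the signature of admixture. This is a simplified sketch rather than the implementation used in this study: it omits the finite-sample correction and the significance test, and the function name `f3` and the toy frequencies are hypothetical.

```python
def f3(c, a, b):
    """Naive f3(C; A, B): mean of (c - a) * (c - b) across sites.

    c, a, b are per-site allele frequencies (same allele in all three
    populations). No correction for sample size is applied in this sketch.
    """
    terms = [(pc - pa) * (pc - pb) for pc, pa, pb in zip(c, a, b)]
    return sum(terms) / len(terms)


# Toy data (hypothetical): two differentiated parental populations.
parent_a = [0.1, 0.9, 0.2]
parent_b = [0.9, 0.1, 0.8]

# A candidate with intermediate frequencies gives a negative f3 ...
admixed = [0.5, 0.5, 0.5]
f3_admixed = f3(admixed, parent_a, parent_b)

# ... whereas a candidate that has drifted away from both does not.
drifted = [0.0, 1.0, 0.0]
f3_drifted = f3(drifted, parent_a, parent_b)
```

Because the candidate's frequencies lie between those of the two parental populations at each site, every per-site product in the first case is negative. A non-negative f3 does not exclude admixture, since drift in the candidate after admixture can mask the signal.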

4 DISCUSSION

4.1 Phylogeny of wild and domestic goats

Our complete dataset of 613 individuals from 14 species and subspecies, sampled over different locations in Europe, Asia, North Africa and North America, provided a unique opportunity to assess genetic relationships between wild and domestic goats. Based on genome-wide SNP data, our phylogenetic reconstruction using TreeMix matched morphologically identified species and subspecies (Figure 2b) and was apparently little affected by ascertainment bias (see Data S1). The phylogenetic tree is also in agreement with two reported analyses of Y-chromosome variation that found the genus well separated into bezoar-type and ibex-type clades (Pidancier et al., 2006; VarGoats Consortium et al., 2022). However, the Y-chromosome phylogenies placed the Caucasian turs at a distal end of the ibex-type clade, while our phylogeny placed Caucasian turs at the base. Nevertheless, the presence of horizontal gene flow between Caucasian tur and the bezoar-type clade (which we detected with the D-statistics) could potentially bring the Caucasian tur to a more basal position of the tree if the correct topology were the one reported with Y-chromosome markers. However, we think that this is not the case, because the Y-chromosome technically constitutes a single locus, while our phylogeny is based on whole-genome SNP data. In support of that, the topology of the ibex-type clade also showed some disagreement with both the Y-chromosome and mtDNA phylogenies, although they agreed on the proximity of the Alpine and Iberian ibexes (Manceau et al., 1999; Pidancier et al., 2006). This close relationship is considered a consequence of an allopatric origin from a single Capra ancestor that colonized Europe in a single wave (Ureña et al., 2018).

In domestic goats, the phylogeny displayed a geographically congruent pattern that was largely consistent with published phylogenetic reconstructions (Colli et al., 2018), but our inclusion of less studied breeds provided novel insights. For instance, we found that Crete and Youra wild goats belong within the clade of European domestic goats (with 100% support), confirming them as feral goats rather than independent subspecies of bezoar. They also exhibited long branches that could be signs of demographic bottlenecks, something common in populations isolated on islands. However, in the case of the Cretan wild goats, limited sampling could influence this. Even though this population is dispersed among various Aegean islands, we were able to sample five animals from a single place on Crete, four from a Berlin Zoo population and one from the island of Sapientza, whose population descended from the wild Cretan goats. These circumstances, that is, isolation, bottlenecks and sampling error, can also explain their lower values of MLH as compared to domestic goats. However, it cannot be ruled out that such populations are constituted by an admixture of early domestic and modern domestic goats that were introduced from Anatolia and the Levant. The feral hypothesis was supported by the analysis of mtDNA of Bar-Gal et al. (2002), who suggested that goats were introduced to Crete by pre-pottery Neolithic settlers and that, after subsequent abandonment and isolation, they returned to the wild.

4.2 Gene flow between wild and domestic goats

The dynamic history of domestic goats and the existence of extensive regions of sympatric coexistence with wild goats provided ample opportunity for introgression, with some cases previously documented (Alasaad et al., 2012; Grossen et al., 2014; Hammer et al., 2008). In this study, D-statistics showed signatures of gene flow between domestic goats at large and Caucasian tur, European ibexes and Nubian ibexes (Figure 3; Figure S5, Tables S2 and S3). In all these cases, the similarity of values among domestic populations and the lack of sympatry between most domestic goats and these species suggest that introgression was ancient and occurred with bezoar (or even a bezoar ancestor) from which much of the variation was segregated into all domestics. This scenario is likely because the short evolutionary time since domestication and the expansion of the domestic goat population in Eurasia are factors that prevent the purging of ancestral genetic variation. On the other hand, Caucasian tur specimens have been found in archaeological settlements along with the remains of exclusively domestic animals, including goats (Sauer et al., 2015). Introgression between bezoar or early domestics and a lineage related to the Caucasian tur (called ‘Taurasian tur’) was, in fact, detected by Daly et al. (2022). Moreover, gene flow between Caucasian turs and bezoars has been reported using whole-genome sequencing data from modern and ancient specimens (Zheng et al., 2020).

Another set of relevant signatures of introgression involved European ibexes (Alpine and Iberian) and taxa of the bezoar-type clade. The pattern of D-statistics also suggests ancestral introgression between bezoar and the ancestor of European ibexes. Although we cannot completely discount the presence of gene flow between ibexes and sympatric domestic goats (e.g. Alpine goat with Alpine ibex or Iberian goat with Iberian ibex), there was no clear signal of it in our particular sample. A larger sample size and whole genome sequencing data would likely detect such introgression signatures if they exist.

Other introgression events detected by TreeMix (Figure S3) involved Pakistani breeds and bezoar, and these were also detected by f-branch statistics (Figure 4). Although previous reports supported these events, due to the phylogenetic proximity of bezoar and domestic goats, it is possible that incomplete lineage sorting and alternative scenarios are involved in these results. For example, the presence of genetic structure in bezoar during the early stages of domestication, combined with demographic fluctuations, could lead to a differential segregation of genetic variation in eastern and western domestic goat breeds (Ahmad & Nabi, 2022). The ancestral variation that was present in eastern but not western breeds, or vice versa, could produce false positives with D-statistics and other tests. This can also be confirmed or ruled out with more extensive regional sampling and genome-wide sequencing data.

A surprising event of gene flow between mountain goats and the ancestor of the western domestic clade was detected with TreeMix (Figure S3). Since mountain goats inhabit the American continent and the probability of contact with a member of the Capra genus is virtually zero, we can safely assume that this is an artefact of non-missing genotypes, possibly caused in a similar way to the spurious gene flow between the Schipperke and the Poodle dog breeds (Fitak, 2021). Another reason for this conclusion is that the statistical power to detect very old introgression events could be very low for TreeMix and other methods, regardless of ascertainment bias (see Data S1).

4.3 Gene flow among domestic goat breeds

Domestic goat populations from geographic areas near the original region of domestication are expected to have higher ancestral diversity than populations from distant areas. This was true for Iranian and Turkish breeds, which shared high amounts of genetic variation with breeds from Russia, Greece and Spain. This is more likely due to ancestral relationships than to contemporary gene flow, in the same way that the shared variation between bezoar and all domestic goats is due to bezoar being the ancestor of all domestics. However, a genetic affinity between western Asian (including Pakistani populations) and Greek populations has also been reported in cattle (Papachristou et al., 2020). Another event of gene flow between domestic breeds, from Soviet Mohair into Dagestan Aboriginal, detected by f-branch statistics (Figure 4), has been reported previously (Deniskova et al., 2021). These breeds have also been reported to present extensive zones of hybridization with other breeds in the Caucasus (Selionova et al., 2020).

Our analyses also provided evidence of gene flow between Egyptian and Spanish goat breeds (Figure 4; Figure S3, Table S4). Considering that other studies using various markers (including a goat SNP array) have also reported gene flow between North African and Iberian populations (Colli et al., 2018; Manunza et al., 2016; Pereira et al., 2009), it is likely that gene flow between those populations has been long-lasting, intense or recent, as reported by El Moutchou et al. (2018). Such intercontinental gene flow could be the result of domestic goats' expansion outside their centre of domestication. In Europe, Neolithic farmers expanded along two main routes: the Danube River and the Mediterranean coast (Skoglund et al., 2012). The phylogenetic closeness of Spanish and Alpine breeds suggests that their ancestors were introduced by Neolithic farmers who followed the Mediterranean route (see Figure 2b; Figure S2A). In Africa, Neolithic farmers entered the continent from the northeast via the Levant and spread along the Mediterranean coast, but in this case along its southern shore (Pinhasi et al., 2012). Since both shores of the Mediterranean (North African and European) end at the Strait of Gibraltar, which is also the point of minimum distance between Europe and Africa, it is reasonable to conclude that gene flow between African and Spanish breeds was enabled by human migrants crossing the Strait of Gibraltar. However, the presence of multiple contact points at different times is a more likely possibility if we consider the complexity of human migrations in historical and prehistoric times.

Our study, based on whole genome SNP information, offered new insights into the phylogeny of the genus Capra. We further refined the model of the evolutionary history of the genus by inferring patterns of gene flow between wild and domestic goats. The results observed here confirm some patterns reported in previous studies that were solely based on mtDNA or Y-chromosome markers. Interestingly, we did not find clear signatures of recent introgression between domestic goats and wild ibexes despite previous reports. However, sporadic or ancient introgression was captured and proved to be a relevant source of variation shaping the genomic makeup of both wild and domestic goats.

AUTHOR CONTRIBUTIONS

IM and SH designed the study. EH, MS, AA, DP, PK, JS, PB, PL, NiP, IB and NZ coordinated and performed DNA sampling. NeP, AD and MU performed analyses. GB and SR contributed to the manuscript by providing genotypes of some wild and domesticated goats. NeP, MU and ESC wrote the first draft of the manuscript. All authors contributed to writing and reviewing the final manuscript.

ACKNOWLEDGEMENTS

The authors are grateful to Sascha Wellig (Dienststelle für Jagd, Fischerei und Wildtiere, Sion, Switzerland) and Dr. Christine Gohl (The Munich Zoo Hellabrunn), as well as Irena Furlan and Pavel Kvapil (The Zoo Ljubljana), for providing the samples of Alpine ibex and mountain goat. We are also grateful to Bendersky E.V. and the Mountain Hunters Club for providing part of the samples. Finally, we thank the Greek State Scholarships Foundation (I.K.Y.) for supporting a PhD scholarship. Open Access funding enabled and organized by Projekt DEAL.

    FUNDING INFORMATION

    The work was in part supported by the RSF within project 21-66-00007 and the RMSHE grant 075-15-2021-1037. Furthermore, we thank the Slovenian Research Agency for funding through the programme grant no. P4-0220, project No. J4-1768, and Young investigator grant to NeP. We are also thankful to DAAD funding programs for supporting our work through the project number BI-DE/20-21-003, DAAD ID-57514909 and DAAD ID-57418683.

    CONFLICT OF INTEREST STATEMENT

    The authors declare that they have no conflict of interest.

    ETHICS STATEMENT

    The animals used for the first time in this study were not bred, killed or sampled specifically for the needs of our project.

    BENEFIT-SHARING STATEMENT

    The scientists that provided genetic samples from different countries are part of an active collaboration network. They are included as co-authors and contributed to the analysis, discussion and writing of the manuscript. Their feedback and contributions had a value for conservation or possible usage of local diversity. The results of research have and will be shared with the communities from which the samples originated and the broader scientific community. The study addresses a priority concern that contributes to the conservation of organisms being studied. Our research group is committed to international scientific partnerships, and all samples are collected by local partners according to local regulations.

    DATA AVAILABILITY STATEMENT

    The 50K dataset of SNPs produced in this study can be accessed in plink format on the figshare website (https://figshare.com/) via accession number: 10.6084/m9.figshare.24190413 or directly by this link: https://figshare.com/articles/dataset/PLINK_files_ped_and_map_of_wild_and_domestic_goat_species/24190413. The dataset contains published data: Burren et al. (2016): Dryad; https://doi.org/10.5061/dryad.q1cv6; Chebii et al., 2020: NCBI; Project no. PRJNA674751; Chen et al., 2019: NCBI; Project no. PRJNA438286; Colli et al., 2018: Dryad; https://doi.org/10.5061/dryad.v8g21pt; Deniskova et al., 2021: Figshare; https://figshare.com/articles/dataset/SNP-based_genotyping_provides_insight_into_the_West_Asian_origin_of_Russian_local_goats/14706429; Denoyelle et al., 2021: ENA; Project no. PRJEB37507; Grossen et al., 2020: NCBI; Project no. PRJNA514886; Hassan et al., 2018: not publicly available; https://doi.org/10.1007/s10344-018-1212-z; Pogorevc et al., 2021: NCBI GEO; Accession no. GSE176157; Rahmatalla et al., 2017: Supplemented to the article; https://doi.org/10.1186/s12863-017-0553-z.
