Understanding the molecular basis of repeatedly evolved phenotypes can yield key insights into the evolutionary process. Quantifying gene flow between populations is especially important in interpreting mechanisms of repeated phenotypic evolution, and genomic analyses have revealed that admixture occurs more frequently between diverging lineages than previously thought. In this study, we resequenced 47 whole genomes of the Mexican tetra from three cave populations, two surface populations and outgroup samples. We confirmed that cave populations are polyphyletic and two Astyanax mexicanus lineages are present in our data set. The two lineages likely diverged much more recently than previous mitochondrial estimates of 5–7 mya. Divergence of cave populations from their phylogenetically closest surface population likely occurred between ~161 and 191 k generations ago. The favoured demographic model for most population pairs accounts for divergence with secondary contact and heterogeneous gene flow across the genome, and we rigorously identified gene flow among all lineages sampled. Therefore, the evolution of cave-related traits occurred more rapidly than previously thought, and troglomorphic traits are maintained despite gene flow with surface populations. The recency of these estimated divergence events suggests that selection may drive the evolution of cave-derived traits, as opposed to disuse and drift. Finally, we show that a key troglomorphic phenotype QTL is enriched for genomic regions with low divergence between caves, suggesting that regions important for cave phenotypes may be transferred between caves via gene flow. Our study shows that gene flow must be considered in studies of independent, repeated trait evolution.

1 INTRODUCTION

Repeated adaptation to similar environments offers insight into the evolutionary process (Agrawal, 2017; Gompel & Prud'homme, 2009; Losos, 2011; Rosenblum, Parent, & Brandt, 2014; Stern, 2013; Stern & Orgogozo, 2009). Predictable phenotypes are likely when repeated evolution occurs through standing genetic variation and/or gene flow, such that the alleles are identical by descent, and when populations experience shared, strong selection regimes (Rosenblum et al., 2014). Understanding repeated evolution, however, requires an understanding of the basic parameters of the evolutionary process, including how long populations have diverged, how many independent phenotypic origins have occurred, the extent of gene flow between populations and the strength of selection needed to shape phenotypes (Roesti, Gavrilets, Hendry, Salzburger, & Berner, 2014; Rosenblum et al., 2014; Rougemont et al., 2017; Stern, 2013; Welch & Jiggins, 2014).

The predictable phenotypic changes observed in cave animals offer one of the most exciting opportunities in which to study repeated evolution (Elmer & Meyer, 2011). Cave animals also offer advantages over many systems in that the direction of phenotypic change is known (surface → cave) and coarse selection pressures are defined (e.g., darkness and low-nutrient availability). The cavefish (Astyanax mexicanus) in northeastern Mexico have become a central model for understanding the evolution of diverse developmental, physiological and behavioural traits (Keene, Yoshizawa, & McGaugh, 2015). Many populations of Astyanax mexicanus exhibit a suite of traits common to other cave animals including reduced eyes and pigmentation (Borowsky, 2015; Protas & Jeffery, 2012). In addition, many populations possess behavioural and metabolic traits important for survival in dark, low-nutrient environments (Aspiras, Rohner, Martineau, Borowsky, & Tabin, 2015; Bibliowicz et al., 2013; Duboué, Keene, & Borowsky, 2011; Jaggard et al., 2017, 2018; Jeffery, 2001, 2009; Protas et al., 2008; Riddle et al., 2018; Salin, Voituron, Mourin, & Hervant, 2010; Varatharasan, Croll, & Franz-Odendaal, 2009; Yamamoto, Byerly, Jackman, & Jeffery, 2009; Yoshizawa, Gorički, Soares, & Jeffery, 2010). Over 30 populations of cavefish are documented (Espinasa, Rivas-Manzano, & Pérez, 2001; Mitchell, Russell, & Elliott, 1977), and conspecific surface populations are a proxy for the ancestral conditions to understand repeated adaptation to the cave environment. Thus, A. mexicanus offers an excellent opportunity to evaluate the roles of history, migration, drift, selection and mutation to the repeated evolution of a convergent suite of phenotypes.

Despite evidence for the repeated evolutionary origin of cave-associated traits in A. mexicanus, the timing of cave invasions by surface lineages is uncertain, and the extent of genetic exchange between cavefish and surface fish is under debate (Bradic, Beerli, León, Esquivel-Bobadilla, & Borowsky, 2012; Bradic, Teotónio, & Borowsky, 2013; Coghill, Hulsey, Chaves-Campos, García de Leon, & Johnson, 2014; Fumey et al., 2018; Gross, 2012; Hausdorf, Wilkens, & Strecker, 2011; Ornelas-García, Domínguez-Domínguez, & Doadrio, 2008; Porter, Dittmar, & Pérez-Losada, 2007; Strecker, Bernatchez, & Wilkens, 2003; Strecker, Faúndez, & Wilkens, 2004; Strecker, Hausdorf, & Wilkens, 2012). High amounts of gene flow between cave and surface populations and among cave populations may complicate conclusions regarding repeated evolution (Mallet, Besansky, & Hahn, 2016; Ornelas-García & Pedraza-Lara, 2015). For instance, cavefish populations are polyphyletic (Bradic et al., 2012, 2013; Coghill et al., 2014; Gross, 2012; Ornelas-García et al., 2008; Strecker et al., 2003, 2012) which is consistent with repeated evolution. Yet, this pattern has been previously hypothesized as being consistent with a single invasion of the cave system, subterranean spread of fish and substantial gene flow of individual caves with their geographically closest surface population (suggested by Coghill et al., 2014; Espinasa & Borowsky, 2001). Alternatively, gene exchange among caves could result in a shared adaptive history even if cave populations were founded independently. Thus, additional work is needed to understand the demographic history of these populations and implications for the evolutionary process.

Here, we conduct an extensive examination of the gene flow between cave populations and between cave and surface populations of A. mexicanus to understand repeated evolutionary origins of cave-derived phenotypes and the potential for gene flow to enhance or impede adaptation to caves. Since we employ whole-genome resequencing as opposed to reduced representation methods, we were able to calculate genomewide absolute measures of divergence (d_XY), compare them to relative measures (pairwise F_ST) (reviewed in Cruickshank & Hahn, 2014; Ellegren, 2014; Lowry et al., 2016), and demonstrate that pairwise F_ST is predominantly driven by heterogeneous diversity across populations, which obscured accurate inferences of divergence and gene flow among populations in past studies (Charlesworth, 1998). Our analyses reveal that gene flow between cave populations is more extensive than previously appreciated, that cave and surface populations exchange alleles (Wilkens & Strecker, 2003), and that surface and cave populations have diverged recently. We show that, given these demographic parameters, repeated evolution of cave phenotypes may reflect the action, rather than the relaxation, of natural selection. Additionally, we present evidence that at least one region of the genome important for cave-derived phenotypes may be transferred between caves, suggesting that gene flow among caves may play a role in the maintenance and/or origin of cave phenotypes.

2 METHODS

2.1 Sampling, DNA extractions, and sequencing

We sampled five populations of Astyanax mexicanus cave and surface fish from the Sierra de Guatemala, Tamaulipas, and Sierra de El Abra region, San Luis Potosí, Mexico: Molino, Pachón, and Tinaja caves and the Rascón and Río Choy surface populations (Figure 1). Two main lineages of cave and surface populations are often referred to as “new” and “old” in reference to when the lineages presumably reached northern Mexico, each with independently evolved cave populations. Populations in Molino cave and Río Choy are considered “new” lineage fish, and the populations of Pachón cave and Tinaja cave are typically classified as “old” lineage populations (Bradic et al., 2012; Coghill et al., 2014; Dowling, Martasian, & Jeffery, 2002; Ornelas-García et al., 2008). These cave populations were the focus of our study because they are the most commonly used cave populations for laboratory studies.

**Figure 1**

(a) Map of caves (blue), surface populations (white). (b) Location of *A. mexicanus* range (peach) with sampled area in red box and *A. aeneus* range (green), (c) Location in North and Central America of the focal sampled area. Map adapted from (Ornelas-García & Pedraza-Lara, 2015). (d) Phylogenetic inference from the largest scaffolds that comprised approximately 50% of the genome using a maximum-likelihood tree search and 100 bootstrap replicates in RAxML v8.2.8 (Stamatakis, 2014). π is a genomewide average, not excluding admixed individuals. Old lineage and new lineage populations delineated by past studies are strongly supported. Branch-lengths correspond to phylogenetic distance and confirm cave populations (especially Molino cave) are less diverse than surface populations

Rascón was selected because cytochrome b sequences are most similar to old lineage cave populations (Ornelas-García et al., 2008). Rascón is part of the river system of Rio Gallinas, which ends at the 105 m vertical waterfall of Tamul into Rio Santa Maria/Tampaon. It is hypothesized that new lineage surface fish could not overcome the 105 m waterfall and could not colonize the Gallinas river (e.g., Rascón population). Thus, Rascón has likely remained cut off from nearby surface streams inhabited by new lineage populations, making it an important surface locality to include. Most other surface populations are thought to be of “new” origin, to exhibit low divergence from one another, and to be panmictic (Bradic et al., 2012); thus, sampling Río Choy is an initial proxy for sampling surface populations range-wide.

Fin clips were collected from fish in Spring 2013 from Pachón cave, Río Choy (surface), and a drainage ditch near the town of Rascón in San Luis Potosí, Mexico. Samples from Molino cave were collected by W. Jeffery, B. Hutchins and M. Porter in 2005, and samples from the Pecos drainage in Texas were collected in 1994 by W. Jeffery and C. Hubbs. Samples from Tinaja cave were collected in 2002 and 2009 by R. Borowsky.

We complemented our sequencing of these populations with four additional individuals. First, we sequenced an A. mexicanus sample from a Texas surface population. While this population is not of direct interest to this study, it has been the focus of many quantitative trait locus (QTL) studies (e.g., O'Quin et al., 2015), and therefore, its evolutionary relationship to our populations is of interest. Second, we aimed to sequence an outgroup to polarize mutations in A. mexicanus. To do so, we first sequenced two samples of a close congener, Astyanax aeneus (Mitchell et al., 1977; Ornelas-García & Pedraza-Lara, 2015), from Guerrero, Mexico. However, despite A. aeneus being separated from the sampled Astyanax mexicanus populations by the Trans-Mexican Volcanic Belt (Mitchell et al., 1977; Ornelas-García & Pedraza-Lara, 2015), we found a history of gene flow between Astyanax aeneus and Astyanax mexicanus. We therefore sequenced a more distant outgroup, the white skirt tetra (long-finned) Gymnocorymbus ternetzi. Gymnocorymbus ternetzi was used only for polarizing 2D site frequency spectra.

DNA was extracted using the Genomic-Tip Tissue Midi kits and DNEasy Blood and Tissue kit (Qiagen). We performed whole-genome resequencing with 100 bp paired-end reads on an Illumina HiSeq2000 at The University of Minnesota Genomics Center. Samples were prepared for Illumina sequencing by individually barcoding each sample and processing with the Illumina TruSeq Nano DNA Sample Prep Kit using v3 reagents. Five barcoded samples were pooled and sequenced in two lanes. In total, 45 samples were sequenced across 18 lanes. We resequenced nine Pachón cavefish, ten Tinaja cavefish, nine Molino cavefish, six Rascón surface fish, nine Río Choy surface fish and two Astyanax aeneus samples. We also made use of the Astyanax genome project to add another sample from the Pachón population, and obtain genotype information for the reference sequence. For this reference sample, we used the first read (R1) from all of the 100 bp paired-end reads that were sequenced using an Illumina HiSeq2000 and aligned to the reference genome (McGaugh et al., 2014). The Texas population and the white skirt tetra were barcoded, pooled and sequenced across two lanes of 125 bp paired-end reads from a HiSeq2500 in High-Output run mode using v3 reagents.

For all raw sequences, we trimmed and cleaned sequence data with trimmomatic v0.30 (Lohse et al., 2012) and cutadapt v1.2.1 (Martin, 2011) using the adapters specific to the barcoded individual, allowing a quality score of 30 across a 6 bp sliding window, and removing all reads <30 nucleotides in length after processing. When one read of a pair failed QC, its mate was retained as a single-end read for alignment. Post-processing coverage statistics were generated by the fastx toolkit v0.0.13 (http://hannonlab.cshl.edu/fastx_toolkit/) and ranged from 6.8- to 11.9-fold coverage (mean = 8.87-fold coverage; median = 8.65-fold coverage; Table 1, Supporting Information Tables S1 and S2), assuming a genome size estimate of 1.19 Gb (McGaugh et al., 2014) for the Illumina HiSeq2000 samples. For the Pachón reference genome sequence, which was excluded from the measures above, the coverage was 16.68-fold. All new Illumina sequence reads were submitted to NCBI's Sequence Read Archive (Supporting Information Table S3). We expect some heterozygosity dropout with this relatively low level of coverage, but we sought to balance number of samples and depth of coverage in a cost-effective manner. Heterozygosity dropout would lower our diversity estimates, and if directional (e.g., due to reference sequence bias) could increase divergence between populations and lower estimates of gene flow. As detailed below, our data estimate more recent divergence times and higher estimates of gene flow than past studies.

Table 1. Basic statistics for population level resequencing. Coverage statistics are for reads cleaned of adapters, filtered for quality and aligned to the Astyanax mexicanus reference genome v1.0.2. Two individuals of Astyanax aeneus, one Texas surface Astyanax mexicanus, and white long-finned skirt tetra (Gymnocorymbus ternetzi) were also sequenced

Population	N	Aligned coverage mean (range)
Pachón – cave	9 + reference	9.94 (7.65, 17.74)
Tinaja – cave	10	10.05 (7.27, 12.35)
Rascón – surface	6	10.90 (8.63, 13.15)
Molino – cave	9	9.39 (7.47, 12.32)
Río Choy – surface	9	9.27 (8.38, 10.26)

2.2 Alignment to reference and variant calling

The Astyanax mexicanus genome v1.0.2 (McGaugh et al., 2014) was downloaded through NCBI genomes FTP. Alignments of Illumina data to the reference genome were generated with the BWA-mem algorithm (Li, 2013) in bwa-0.7.1 (Li & Durbin, 2009, 2010). Both genome analysis toolkit v3.3.0 (GATK) and picard v1.83 (http://broadinstitute.github.io/picard/) were used for downstream manipulation of alignments, according to GATK Best Practices and forum discussions (Auwera et al., 2013; DePristo et al., 2011; McKenna et al., 2010). Alignments of paired-end and orphaned reads for each individual were sorted and merged using Picard. Duplicate reads that may have arisen during PCR were marked using Picard's MarkDuplicates tool and filtered out of downstream analyses. GATK's IndelRealigner and RealignerTargetCreator tools were then used to realign reads that may have been errantly mapped around indels (insertions/deletions). Additional details are in Supporting Information Methods.

The haplotypecaller tool in gatk v3.3.0 was used to generate GVCFs of genotype likelihoods for each individual. The genotypegvcfs tool was used to generate a multi-sample variant call format (VCF) of raw variant calls for all samples. Hard filters were applied separately to SNPs and indels/mixed sites using the variantfiltration and selectvariants tools (Supporting Information Table S4). Filtering variants was performed to remove low confidence calls from the data set. The filters removed calls that did not pass thresholds for base quality, depth of coverage and other metrics of variant quality (Supporting Information Table S4).

Alleles with 0.5 frequency appeared to be overrepresented in the site frequency spectra, and the depth of coverage for alleles with 0.5 frequency was greater than the coverage at other frequencies. Upon examination, every individual was heterozygous and these alleles occurred in small tracts throughout the genome. We concluded these were likely paralogous regions (teleost fish have an ancient genome duplication, Hoegg, Brinkmann, Taylor, & Meyer, 2004; Meyer & Van de Peer, 2005). Molino was the most severely impacted population, likely due to Molino's low diversity allowing for collapsing of paralogs (see below). To be conservative, we identified sites where 100% of individuals in any population were heterozygotes and excluded these sites in all analyses. We still observed slight inflation at 0.5 frequency even after our heterozygosity filtering, and these are probably instances of collapsed paralogs where our criterion of heterozygosity in every genotyped individual is not met.
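The heterozygosity filter described above can be illustrated with a minimal sketch. It is not the script used in this study: the data layout (a dictionary of per-population genotype lists), the function names and the decision to ignore missing calls when testing for complete heterozygosity are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# A diploid genotype as two allele codes; None marks a missing call (assumed layout).
Genotype = Optional[Tuple[int, int]]


def all_heterozygous(genotypes: List[Genotype]) -> bool:
    """True if every called genotype in one population is heterozygous."""
    called = [g for g in genotypes if g is not None]
    return bool(called) and all(a != b for a, b in called)


def keep_site(site: Dict[str, List[Genotype]]) -> bool:
    """Keep a site unless at least one population is 100% heterozygous at it."""
    return not any(all_heterozygous(gts) for gts in site.values())


# Hypothetical toy sites: the second is heterozygous in every Molino individual.
site_ok = {"Molino": [(0, 1), (0, 0), (1, 1)], "Choy": [(0, 1), (0, 0)]}
site_bad = {"Molino": [(0, 1), (0, 1), (0, 1)], "Choy": [(0, 0), (0, 1)]}
print(keep_site(site_ok), keep_site(site_bad))
```

Because the rule requires every genotyped individual to be heterozygous, collapsed paralogs with even one homozygous call pass the filter, which is consistent with the residual inflation at 0.5 frequency noted above.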

We generated a VCF file with variant and invariant sites, and subset it to include only biallelic SNPs where applicable. We excluded sites that were defined as repetitive regions from a Repeat Modeler analysis in McGaugh et al. (2014). Next, we removed sites that had fewer than six individuals genotyped in all populations. We also scanned for indels in the VCF file and removed each indel as well as 3 bp on either side of it. In total, 171,841,976 bp were masked from downstream analyses. Lastly, all sites that were tri- or tetra-allelic were removed. Thus, sites were either invariant or biallelic. To reduce issues of non-independence induced by linkage disequilibrium, we analysed a thinned set of biallelic SNPs by retaining a single SNP in each non-overlapping window of 150 SNPs for the ADMIXTURE, TreeMix, and F₃ and F₄ analyses, resulting in approximately 119,000 SNPs for those analyses.
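The thinning step can be sketched as follows. This is an illustrative sketch only: which SNP is retained within each window is not stated above, so keeping the first SNP of each window (controlled by a hypothetical `offset` argument) is an assumption, and scaffold boundaries are ignored.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def thin_snps(snps: Sequence[T], window: int = 150, offset: int = 0) -> List[T]:
    """Keep one SNP from each consecutive, non-overlapping window of `window` SNPs.

    `offset` selects which SNP inside each window is retained (assumption:
    the first one by default). A trailing partial window contributes a SNP
    only if it is long enough to contain position `offset`.
    """
    if window < 1 or not 0 <= offset < window:
        raise ValueError("window must be >= 1 and 0 <= offset < window")
    return [snps[i] for i in range(offset, len(snps), window)]


# Toy example: 1,000 ordered SNP identifiers thinned with the window size used above.
kept = thin_snps(list(range(1000)), window=150)
print(len(kept), kept[:3])
```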

2.3 Population phylogeny and divergence

We generated a phylogenetic reconstruction to understand population relationships using 47 sampled individuals (excluding white skirt tetra, but including A. aeneus and Texas A. mexicanus). The population phylogeny estimation was implemented in raxml v8.2.8 with the two A. aeneus specified as the outgroups. We converted the VCF to fasta format alignments by passing through the mvf format using mvftools v3 (Pease & Rosenzweig, 2015). This was done to preserve ambiguity codes at sites polymorphic in individuals. Sites with greater than 40% missing data were removed using trimal v1.2 (Capella-Gutiérrez, Silla-Martínez, & Gabaldón, 2009). We performed 100 rapid bootstrapping replicates followed by an ML search from two separate parsimony trees. For computational efficiency, we implemented the GTRCAT model, though we note that inferences using a smaller subset of the genome and the GTRGAMMA model recover the same topology as presented in Figure 1. The ML tree with bootstrap support was drawn using the ape package v3.5 (Paradis, Claude, & Strimmer, 2004) in r v3.2.1 (R Core Team, 2018) using the A. aeneus samples as the outgroup; however, our results are consistent even if the outgroup is left unspecified and this decision is justified by the observation that A. aeneus is more distant from all A. mexicanus populations than any are to one another, as measured by the mean number of pairwise sequence differences.

2.3.1 Intra- and interpopulation average pairwise nucleotide differences

Average pairwise nucleotide diversity values (π) reflect the history of coalescence events within and among populations (Hudson, 1990; Nei, 1987) and are directly informative about the history and relationships of a sample. Here, we use the following notation for inter- and intrapopulation average pairwise nucleotide diversity values: Intrapopulation values are designated with π, while interpopulation values are designated with d_XY where X and Y are different populations (e.g., d_{Rascón-Pachón}). Pairwise nucleotide diversity estimates for fourfold degenerate sites were calculated for 46 individuals (excluding Texas A. mexicanus and white skirt tetra) from phased genomic data using the VCF file with invariant and variant sites unless otherwise noted. Additional details for each analysis (e.g., window size, site class used) are given in Supporting Information Methods. We also performed windowed estimates along the entire genome to understand the fine-scale apportionment of ancestry.
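To make the two quantities concrete, the sketch below computes π and d_XY from phased haplotypes as the mean number of differences per site over all within-population or between-population haplotype pairs. It is a toy illustration, not the pipeline used here: haplotypes are represented as equal-length strings, and `n_sites` (the count of callable sites, invariant sites included) is supplied by the caller.

```python
from itertools import combinations, product
from typing import List


def n_diff(a: str, b: str) -> int:
    """Number of positions at which two equal-length haplotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def pi_within(haps: List[str], n_sites: int) -> float:
    """Mean pairwise differences per site among haplotypes of one population."""
    pairs = list(combinations(haps, 2))
    return sum(n_diff(a, b) for a, b in pairs) / (len(pairs) * n_sites)


def dxy_between(haps_x: List[str], haps_y: List[str], n_sites: int) -> float:
    """Mean pairwise differences per site between haplotypes of two populations."""
    pairs = list(product(haps_x, haps_y))
    return sum(n_diff(a, b) for a, b in pairs) / (len(pairs) * n_sites)


# Hypothetical four-site haplotypes for one cave and one surface population.
cave = ["AAAA", "AAAT"]
surface = ["AATT", "ATTT"]
print(pi_within(cave, 4), pi_within(surface, 4), dxy_between(cave, surface, 4))
```

Dividing by all callable sites rather than by variant sites only is what makes the values comparable across windows and populations, which is why the VCF with invariant sites is needed.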

We use these levels of diversity and divergence to estimate the difference in coalescent times between and within populations. In the absence of gene flow, this excess between populations can be translated into an estimate of divergence time: d_XY − π_anc = 2μT (Brandvain, Kenney, Flagel, Coop, & Sweigart, 2014; Hudson, Kreitman, & Aguadé, 1987), where T is the population split time and diversity in the surface population serves as a stand-in for π_anc. After removing recent hybrids identified by ADMIXTURE, we use this approach to estimate this excess coalescent time between pairs of populations. However, given extensive evidence for gene exchange (below), this estimate will be more recent than the true divergence time (Mallet et al., 2016), which we estimate with model-based approaches below.
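Rearranging the relationship above gives T = (d_XY − π_anc) / (2μ). The short sketch below applies it; the d_XY and π values in the example are hypothetical placeholders chosen for illustration, not estimates from this study, while the mutation rate is the one assumed in the text.

```python
def split_time_generations(d_xy: float, pi_anc: float, mu: float) -> float:
    """Split time T in generations from d_XY - pi_anc = 2 * mu * T (no gene flow)."""
    if mu <= 0:
        raise ValueError("mutation rate must be positive")
    return (d_xy - pi_anc) / (2.0 * mu)


# Hypothetical per-site values; mu = 3.5e-9 per bp per generation as assumed in the text.
t_gen = split_time_generations(d_xy=0.0040, pi_anc=0.0030, mu=3.5e-9)
print(round(t_gen))
```

With a generation interval of 1 year, the same number can be read as years; a longer generation interval scales the estimate in years proportionally.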

In this and all other estimates of divergence time below, we assume a neutral mutation rate of 3.5 × 10⁻⁹/bp/generation, which was estimated using parent–offspring trios in cichlids (Malinsky et al., 2017). We know little about the age-specific reproductive outputs for Astyanax in the wild. However, in the laboratory, Astyanax is sexually mature as early as 6 months under optimal conditions, though this varies across laboratories (Borowsky, 2008a; Jeffery, 2001). Since conditions in the wild are rarely as optimal as in the laboratory, we assume that the generation interval (Fenner, 2005) is 1 year, although our general conclusions remain consistent if the generation interval is longer.

2.4 Genomewide tests for gene flow

2.4.1 ADMIXTURE

We estimated admixture (cluster membership) proportions for the individuals comprising the five populations studied, as well as A. aeneus samples, using admixture v1.3.0 (Alexander & Lange, 2011; Alexander, Novembre, & Lange, 2009). We conducted 10 independent runs for each value of K (the number of ancestral population clusters) from K = 2 to K = 6 and present results for K = 6 (i.e., the number of sampled biological populations).

2.4.2 F₃ and F₄ statistics

After removing recent hybrids, we used F₃ and F₄ statistics, as implemented in the threepop and fourpop programs packaged with treemix v1.12 (Pickrell & Pritchard, 2012), to test for historical gene flow. For computational feasibility, we calculated a standard error for each statistic using a block jackknife with blocks of 500 SNPs, assessing significance by a Z-score from the ratio of each statistic to its standard error.
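The standard error and Z-score can be illustrated with a generic delete-one-block jackknife over per-SNP contributions to a statistic. This is a sketch of the general technique under simplifying assumptions (equal-sized blocks, trailing SNPs that do not fill a block are dropped); it is not a reimplementation of threepop or fourpop, and the function name is hypothetical.

```python
import math
from typing import Sequence, Tuple


def block_jackknife(values: Sequence[float], block_size: int = 500) -> Tuple[float, float, float]:
    """Return (estimate, standard error, Z) for the mean of per-SNP values.

    Uses a delete-one-block jackknife with equal-sized blocks of consecutive
    SNPs; SNPs beyond the last full block are ignored (assumption).
    """
    n_blocks = len(values) // block_size
    if n_blocks < 2:
        raise ValueError("need at least two full blocks")
    used = list(values[: n_blocks * block_size])
    total = sum(used)
    estimate = total / len(used)
    # Recompute the mean with each block removed in turn.
    left_out = []
    for b in range(n_blocks):
        block_sum = sum(used[b * block_size:(b + 1) * block_size])
        left_out.append((total - block_sum) / (len(used) - block_size))
    mean_left_out = sum(left_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum((x - mean_left_out) ** 2 for x in left_out)
    se = math.sqrt(variance)
    z = estimate / se if se > 0 else float("nan")
    return estimate, se, z


# Toy data: two blocks of four SNPs each.
print(block_jackknife([1.0] * 4 + [3.0] * 4, block_size=4))
```

Blocking consecutive SNPs means that correlation among linked sites inflates the variance between blocks rather than being ignored, which is the reason for jackknifing over blocks instead of single SNPs.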

The F₃ statistic (Peter, 2016; Reich, Thangaraj, Patterson, Price, & Singh, 2009), represented as F₃(X;A,B), is calculated as E[(p_X − p_A)(p_X − p_B)], where X is the population being tested for admixture, A and B are treated as the source populations, and p_X, p_A, p_B are the allele frequencies. Without introgression, F₃ is positive, and a negative value occurs when this expectation is overwhelmed by a non-tree-like history of the three populations. Importantly, complex histories of mixture in populations A and/or B do not result in significantly negative F₃ values (Patterson et al., 2012; Reich et al., 2009).

F₄ tests assess “treeness” among a quartet of populations (Reich et al., 2009). F₄(P1,P2;P3,P4) is calculated as E[(p₁ − p₂)(p₃ − p₄)], and serves as a powerful test of introgression when P1, P2 and P3, P4 are sisters on an unrooted tree. A significantly positive F₄ value means that P1 and P3 are more similar to one another than expected under a non-reticulate tree, a result consistent with introgression between P1 and P3, or between P2 and P4. Alternatively, a significantly negative F₄ value is consistent with introgression between P1 and P4 or between P2 and P3. Regardless of the sign of this test, F₄ values may reflect introgression events involving sampled, unsampled, or even extinct populations.
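The definitions of F₃ and F₄ given above translate directly into averages over SNPs of products of allele-frequency differences. The sketch below uses these plain moment estimators on hypothetical frequencies; it omits the finite-sample bias corrections that dedicated software applies, so it is meant only to show the sign logic.

```python
from typing import Sequence


def f3(p_x: Sequence[float], p_a: Sequence[float], p_b: Sequence[float]) -> float:
    """Mean over SNPs of (p_X - p_A)(p_X - p_B); uncorrected moment estimator."""
    return sum((x - a) * (x - b) for x, a, b in zip(p_x, p_a, p_b)) / len(p_x)


def f4(p1: Sequence[float], p2: Sequence[float],
       p3: Sequence[float], p4: Sequence[float]) -> float:
    """Mean over SNPs of (p1 - p2)(p3 - p4); uncorrected moment estimator."""
    return sum((a - b) * (c - d) for a, b, c, d in zip(p1, p2, p3, p4)) / len(p1)


# Hypothetical frequencies: X sits midway between A and B, so F3 is negative.
print(f3([0.5, 0.5], [0.0, 1.0], [1.0, 0.0]))
# Hypothetical quartet in which P1 resembles P3 and P2 resembles P4, so F4 is positive.
print(f4([0.2, 0.8], [0.0, 1.0], [0.3, 0.7], [0.1, 0.9]))
```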

2.4.3 TreeMix

We used the treemix program v1.12 (Pickrell & Pritchard, 2012) to visualize migration events after removing recent hybrids. Rather than representing population relationships as a bifurcating tree, treemix models population relationships as a graph in which lineages can be connected by migration edges. We used the same thinned biallelic SNPs as above. We first inferred the maximum-likelihood tree and then successively added single migration events until the proportion of variance explained by the model plateaued (Pickrell & Pritchard, 2012).

2.5 Demographic modelling

We modelled the demographic history of pairs of populations to further elucidate their relationships and migration rates. To conduct demographic modelling, we generated unfolded joint site frequency spectra (2D SFS) from the invariant sites VCF using a custom Python script. Detailed explanation of parameters and a user-friendly guide is available here: https://github.com/TomJKono/CaveFish_Demography/wiki.

White skirt tetra (Gymnocorymbus ternetzi) was used as the estimated ancestral state for all 10 pairwise combinations of Río Choy surface, Molino cave, Pachón cave, Rascón surface and Tinaja cave. Recently admixed individuals were excluded from the comparisons. To reduce the effect of paralogous alignment from the ancient teleost genome duplication, sites at 50% frequency in both populations were masked from the demographic modelling analysis. Sites were excluded if they were heterozygous in the representative outgroup sample or had any missing information in either of the populations being analysed. In addition, sites were excluded due to indels or repetitive regions as described above. Aside from these exclusions, all other sites, including invariant sites were included in all 2D SFS and the length of these sites make up locus length.
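The site-level rules in this paragraph can be sketched as a small function that builds an unfolded 2D SFS. This is not the custom script referred to above: the tuple-based input format, the function name and the exact form of the 50% mask (derived-allele count equal to half the sample in both populations) are assumptions for illustration, and the indel and repeat masks are taken as already applied.

```python
from typing import List, Optional, Sequence, Tuple

Allele = Optional[int]  # 0 or 1; None marks missing data (assumed encoding)
Site = Tuple[Tuple[Allele, Allele], Sequence[Allele], Sequence[Allele]]


def unfolded_2d_sfs(sites: Sequence[Site], n1: int, n2: int) -> List[List[int]]:
    """Count sites by derived-allele count in population 1 (rows) and 2 (columns).

    Each site is (outgroup genotype, pop1 alleles, pop2 alleles), with n1 and n2
    sampled chromosomes. The outgroup allele is taken as ancestral.
    """
    sfs = [[0] * (n2 + 1) for _ in range(n1 + 1)]
    for outgroup, pop1, pop2 in sites:
        if None in outgroup or outgroup[0] != outgroup[1]:
            continue  # outgroup missing or heterozygous
        if None in pop1 or None in pop2 or len(pop1) != n1 or len(pop2) != n2:
            continue  # any missing information in either population
        ancestral = outgroup[0]
        d1 = sum(a != ancestral for a in pop1)
        d2 = sum(a != ancestral for a in pop2)
        if 2 * d1 == n1 and 2 * d2 == n2:
            continue  # 50% frequency in both populations: possible collapsed paralog
        sfs[d1][d2] += 1  # invariant sites land in sfs[0][0]
    return sfs


# Hypothetical sites with 4 chromosomes in population 1 and 2 in population 2.
toy_sites = [
    ((0, 0), [0, 1, 1, 0], [0, 0]),     # counted in cell [2][0]
    ((1, 1), [1, 1, 1, 0], [1, 0]),     # ancestral allele is 1; cell [1][1]
    ((0, 1), [0, 0, 1, 1], [0, 0]),     # heterozygous outgroup: skipped
    ((0, 0), [0, 0, 1, 1], [0, 1]),     # 50% in both populations: masked
    ((0, 0), [0, None, 1, 1], [0, 1]),  # missing data: skipped
    ((0, 0), [0, 0, 0, 0], [0, 0]),     # invariant: cell [0][0]
]
print(unfolded_2d_sfs(toy_sites, 4, 2))
```

The number of sites that pass these rules, invariant sites included, is what the text calls the locus length.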

A total of seven demographic models were fit to each 2D SFS to infer the timing of population divergence, effective migration rates and effective population sizes using ∂a∂i 1.7.0 (Gutenkunst, Hernandez, Williamson, & Bustamante, 2009). The seven models were derived from previously published demographic modelling analysis (Tine et al., 2014). Briefly, the seven models are as follows: SI—strict isolation, SC—secondary contact, IM—isolation with migration, AM—ancestral migration, SC2M—secondary contact with two migration rates, IM2M—isolation with migration with two migration rates and AM2M—ancestral migration with two migration rates (see figure S9 of Tine et al., 2014). Two migration rates within the genome appeared to fit the data better in previous work (Tine et al., 2014), as this approach allows for heterogeneous genomic divergence (M which is most often the largest migration rate within the genome, Mi which is usually the lowest migration rate within the genome). Descriptions of the parameters and parameter starting values are given in Supporting Information Table S5 and Supporting Information Methods. For each replicate, Akaike information criterion (AIC) values for each model were converted into Akaike weights. The model with the highest mean Akaike weight across all 50 replicates was chosen as the best-fitting model (Rougeux, Bernatchez, & Gagnaire, 2017; Supporting Information Table S6; similar to Wagenmakers & Farrell, 2004).
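The model-selection rule can be sketched as follows, using the standard Akaike weight formula w_i = exp(−Δ_i/2) / Σ_j exp(−Δ_j/2), where Δ_i is the difference between model i's AIC and the lowest AIC in that replicate. The AIC values in the example are hypothetical and only two of the seven models are shown; function names are illustrative.

```python
import math
from typing import Dict, List


def akaike_weights(aic: Dict[str, float]) -> Dict[str, float]:
    """Convert AIC values for competing models into Akaike weights."""
    best = min(aic.values())
    rel = {model: math.exp(-0.5 * (value - best)) for model, value in aic.items()}
    total = sum(rel.values())
    return {model: r / total for model, r in rel.items()}


def best_model(replicates: List[Dict[str, float]]) -> str:
    """Model with the highest mean Akaike weight across replicates."""
    weights = [akaike_weights(rep) for rep in replicates]
    mean_weight = {
        model: sum(w[model] for w in weights) / len(weights) for model in weights[0]
    }
    return max(mean_weight, key=mean_weight.get)


# Hypothetical AIC values for two replicates of two models.
reps = [{"SI": 100.0, "SC": 98.0}, {"SI": 101.0, "SC": 100.0}]
print(akaike_weights(reps[0]), best_model(reps))
```

Averaging weights rather than raw AIC values keeps any single replicate with an extreme likelihood from dominating the choice, since each replicate's weights sum to one.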

The best-fitting demographic model for each 2D SFS comparison was used to generate estimates of the population parameters. Scaling from estimated parameters to real-value parameters was done assuming a mutation rate (μ) of 3.5 × 10⁻⁹ per bp per generation (Malinsky et al., 2017) and a locus length (L) of the number of sites used to generate the 2D SFS. For effective population size estimates, we estimated the reference population size, Nref, as the mean “theta” value from all 50 replicates of the best-fitting model multiplied by 1/(4 μL). The effective population sizes of the study populations were then estimated as a scaling of Nref. To estimate the per-generation proportion of migrants, we scaled the mean total migration rates from ∂a∂i by 1/(2Nref). Total times since divergence for pairs of populations were estimated as the sum of the split time (“Ts”) and time since secondary contact commenced (“Tsc”) estimates for each population pair.
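The conversions described in this paragraph can be written out as a short sketch. The input values below are hypothetical, and the variable names are illustrative. The time conversion relies on the ∂a∂i convention that times are expressed in units of 2Nref generations, which is stated here as an assumption about how the summed Ts and Tsc values were scaled.

```python
from typing import Dict


def scale_parameters(theta: float, mu: float, locus_length: float,
                     nu: float, total_migration: float,
                     ts: float, tsc: float) -> Dict[str, float]:
    """Convert scaled model parameters to real-value units.

    theta: mean theta of the best-fitting model; nu: relative population size;
    total_migration: scaled migration rate; ts, tsc: scaled split time and
    time since secondary contact (assumed to be in units of 2 * Nref generations).
    """
    n_ref = theta / (4.0 * mu * locus_length)
    return {
        "n_ref": n_ref,
        "n_e": nu * n_ref,
        "migrant_fraction": total_migration / (2.0 * n_ref),
        "divergence_generations": (ts + tsc) * 2.0 * n_ref,
    }


# Hypothetical inputs; mu = 3.5e-9 per bp per generation as assumed in the text.
print(scale_parameters(theta=1400.0, mu=3.5e-9, locus_length=1e7,
                       nu=0.5, total_migration=2.0, ts=5.0, tsc=3.0))
```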

2.6 Modelling of selection needed for cave alleles to reach high frequency

With demographic estimates provided by our whole-genome sequencing and ∂a∂i, we estimated the selection coefficients needed to bring alleles associated with the cave phenotype to high frequency. We implemented the 12-locus additive alleles model by Cartwright, Schwartz, Merry, and Howell (2017) with the parameters estimated for Molino cave, as this is the population where selection would need to be strongest to overcome the effects of drift. We estimated the one-way mutation rate of loci (μ) to be on the order of 1 × 10⁻⁶ (c.f. 3.5 × 10⁻⁹/bp/generation * 1170 bp, which is the median gene length across the genome). Other parameters were N = 7,335 (Molino Ne), h = 0.5 (additive alleles), k = 12 (number of loci) and Q = 0.1 (surface allele frequency of cave-favoured alleles). The simulation model was adapted from Cartwright et al. (2017) with the addition that the cave population was isolated for a period of time before becoming connected again with the surface. Consistent with our demographic models for Molino and Río Choy, migration rates spanned between 10⁻⁶ and 10⁻⁵ with 91,000 generations of isolation followed by 71,000 generations of connection. We also conducted simulations with the parameters from Tinaja cave as this is the cave population where selection would need to be weakest to overcome the effects of drift. All parameters were the same as above except N = 30,522 (Tinaja Ne), and 170,000 generations of isolation were followed by 20,000 generations of connection.

2.7 Candidate regions introgressed between caves

To identify regions of the genome that were likely transferred between cave populations or between cave and surface populations and were linked to cave-associated phenotypes, we implemented an outlier approach incorporating pairwise sequence differences and F_ST. To determine potential candidate regions, we utilized 5 kb d_XY windows using all biallelic sites, not just fourfold degenerate sites, since so few fourfold degenerate sites would be available per window. We first phased the biallelic variant sites for all samples using beagle (version 4.1; Browning & Browning, 2007). We then determined the number of invariant sites (from the full genome VCF) and variant sites in each 5 kb window to calculate average pairwise diversity. We then identified 5 kb regions in which d_XY was in the lower 5% tail of the genomic distribution of d_XY for all three pairwise combinations of cave populations, but which showed substantial divergence from either surface population as measured by F_ST or d_XY. For F_ST outliers, we required that π per gene for both surface fish populations must be greater than the lowest 500 π values for genes across the genome. This requirement protects, in part, against including regions that are low diversity across all populations due to a feature of the genome.
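The first step of the outlier approach, finding windows that fall in the lower 5% tail of d_XY for every cave pair, can be sketched as below. The sketch covers only that step; the surface-divergence and π requirements are omitted. The simple rank-based cutoff and the toy values are assumptions for illustration.

```python
from typing import Dict, List, Sequence


def lower_tail_cutoff(values: Sequence[float], fraction: float = 0.05) -> float:
    """Largest value still inside the lowest `fraction` of the distribution."""
    ranked = sorted(values)
    k = max(1, int(len(ranked) * fraction))
    return ranked[k - 1]


def shared_low_dxy_windows(dxy_by_pair: Dict[str, Sequence[float]],
                           fraction: float = 0.05) -> List[int]:
    """Indices of windows in the lower tail of d_XY for every population pair.

    Each sequence holds one d_XY value per window, in the same window order.
    """
    tails = []
    for values in dxy_by_pair.values():
        cutoff = lower_tail_cutoff(values, fraction)
        tails.append({i for i, v in enumerate(values) if v <= cutoff})
    return sorted(set.intersection(*tails))


# Hypothetical d_XY values for 20 windows; window 7 is the lowest in all three pairs.
base = [0.010 + 0.001 * i for i in range(20)]
toy = {
    "Pachon-Tinaja": base[:7] + [0.0001] + base[8:],
    "Pachon-Molino": base[::-1][:7] + [0.0002] + base[::-1][8:],
    "Tinaja-Molino": base[:7] + [0.0003] + base[8:],
}
print(shared_low_dxy_windows(toy))
```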

To put the results in context of phenotypes mapped to the genome, we created a database of prior QTL from several key studies (Kowalko, Rohner, Linden et al., 2013; Kowalko, Rohner, Rompani et al., 2013; O'Quin, Yoshizawa, Doshi, & Jeffery, 2013; Protas, Conrad, Gross, Tabin, & Borowsky, 2007; Protas et al., 2008; Yoshizawa, Yamamoto, O'Quin, & Jeffery, 2012; Yoshizawa et al., 2015) and used overlapping markers between studies to position QTL relative to the linkage map in O'Quin et al. (2013). For markers in Kowalko, Rohner, Rompani et al. (2013), we used blastn to place the marker on a genomic scaffold and recorded the corresponding QTL in our database as locating to the entire scaffold. Our QTL database is given in Supporting Information Table S7.

3 RESULTS

3.1 Population phylogenetic reconstruction

We inferred the population phylogeny using half the nuclear genome (invariant and variant sites, but no mitochondrial sites) in the program raxml. Our phylogenetic tree clearly demarcates two lineages and indicates that the Rascón surface population and the Tinaja and Pachón cave populations form a monophyletic clade (Figure 1; often referred to as “old lineage”). Similarly, the Río Choy surface and Molino cave populations form a monophyletic clade (referred to as “new lineage”). Thus, this phylogeny indicates that many cavefish traits may be polyphyletic (i.e., evolved through repeated evolution).

In agreement with a previous study showing that Molino cave population has the lowest diversity of cave populations sampled, the Molino cave population exhibits shorter branches than other populations tested (Bradic et al., 2013), while the surface populations exhibit longer branch lengths. While all populations and the two lineages have high bootstrap support (≥99), bootstrap support with genomic-scale data provides little information about the evolutionary processes or the distribution of alternative topologies across the genome (Yang & Rannala, 2012); thus, we use a series of tests below to explore this further.

3.2 Diversity and divergence

Patterns of diversity within populations, π, and interpopulation divergence, d_XY, provide a broad summary of the coalescent history within and between populations. We present pairwise sequence differences at fourfold degenerate sites between all pairs of individuals (Figure 2). The “striped” individuals in Figure 2b correspond to recent hybrids as inferred by ADMIXTURE (Figure 2a). We removed these putative recent hybrids from our summary of diversity within and divergence between populations presented in Table 2.

Table 2. Inter- and intrapopulation average pairwise nucleotide diversity across the genome (d_XY and π, respectively). Values in italics on the diagonal are the estimates of mean π for that population; values below the diagonal are d_XY for the population pair. Values above the diagonal are estimates, in generations, of the population split times (see Methods). In this table, four recent hybrids were excluded (i.e., Rascón 6, Tinaja 6, Choy 14, Tinaja E), resulting in the following sample sizes in the final calculations: A. aeneus: N = 2; Río Choy: N = 8; Molino: N = 9; Rascón: N = 5; Pachón: N = 9; Tinaja: N = 8. N/A means the divergence time was negative. Divergence time does not take gene flow into account and is therefore likely underestimated

	A. aeneus	Río Choy	Molino	Pachón	Rascón	Tinaja
A. aeneus	0.0035	398,239	372,641	276,735	281,219	264,154
Río Choy	0.00637	0.00387	113,913	96,569	156,597	118,333
Molino	0.00619	0.00438	0.00033	50,338	101,781	67,029
Pachón	0.00552	0.00426	0.00393	0.00080	139,541	N/A
Rascón	0.00555	0.00468	0.00429	0.00335	0.00237	113,321
Tinaja	0.00543	0.00441	0.00405	0.00218	0.00316	0.0092

Genomewide average diversity within cave populations (π_{4fold degen}: Molino = 0.00074, Tinaja = 0.00129, Pachón = 0.00100; Table 2) is substantially lower than diversity within surface populations (π_{4fold degen}: Río Choy = 0.00300, Rascón = 0.00207), reflecting a decrease in effective population size in caves (sensu Avise & Selander, 1972). Molino cave is the most homogeneous. These results are consistent with the short branches in cave populations observed in the phylogenetic tree produced by RAxML (Figure 1). In contrast, the surface population Río Choy is the most diverse in our sample, so much so that two Río Choy fish may be more divergent than fish compared between any of the old lineage populations (e.g., Tinaja cave and Rascón surface).

Genomewide average divergence between the new and old lineages (d_{XY 4fold degen} = 0.00340 − 0.00374, Table 2) exceeds divergence between cave and surface populations within lineages (d_{XY 4fold degen} Molino-Río Choy = 0.00332; Tinaja-Rascón = 0.00285; Pachón-Rascón = 0.00298; Table 2). Thus, old and new lineages diverged prior to (or have experienced less genetic exchange than) any of the cave–surface population pairs within lineages. The two old lineage caves are the least diverged populations in our sample (d_{XY 4fold degen}: Pachón-Tinaja = 0.00225). These results are consistent with both the observation of monophyly of old and new lineages and the observation that old lineage cave populations are sister taxa in the raxml tree (Figure 1).

Divergence between populations suggests that a simple bifurcating tree does not fully capture the history of these populations. For example, divergence between the new lineage cave population (Molino) and the geographically closer old lineage cave population (Pachón) is less than divergence between Molino and the geographically further old lineage populations (Tinaja and Rascón; Table 2). Likewise, all old lineage populations are closer to Molino cave than they are to Río Choy. These results suggest gene flow between Molino cave and the old lineage populations, which is supported by additional analyses.

3.3 Genomewide tests suggest substantial historical and contemporary gene flow

3.3.1 ADMIXTURE

We focus our discussion on K = 6, the case where the number of clusters matches our presumed number of populations (Figure 2a); additional cluster sizes are presented in Supporting Information Figure S1. While individuals largely cluster exclusively with others from their sampled population, we also observe contemporary gene flow between cave and surface populations, as well as between new and old lineage taxa (Figure 2a, Supporting Information Figure S1).

Specifically, there appears to be reciprocal gene exchange between the Tinaja cave and the old lineage surface population (a Tinaja individual shares 22% cluster membership with Rascón surface and a Rascón individual shares 29% membership with Tinaja; Figure 2a, Supporting Information Figure S1 and S2). Another Tinaja sample appears to be a recent hybrid with the new lineage surface population (a Tinaja individual with 14% membership with Río Choy). One new lineage surface sample appears to have recent shared ancestry with samples from Pachón (i.e., Río Choy individual with 12% cluster membership with Pachón).

Although unsupervised clustering algorithms may show signs of admixture even when none has occurred (Falush, van Dorp, & Lawson, 2018), corroboration of ADMIXTURE results with supporting patterns of pairwise sequence divergence (Figure 2, Supporting Information Figure S1 and S2) in putatively admixed samples (Rascón 6, Tinaja 6, Choy 14, Tinaja E) strongly suggests that recent hybrids between cave and surface populations and old and new lineages are present in current-day populations.

While the existence of recent hybrids is consistent with long-term genetic exchange, such hybrids do not provide evidence for historical introgression. We, therefore, rigorously characterize the history of gene flow in this group, while removing putative recent hybrids identified from ADMIXTURE and pairwise sequence divergence to ensure that our historical claims are not driven by a few recent events.

3.3.2 F₃ and F₄ statistics

To test for historic genetic exchange, we calculated F₃ and F₄ “tree imbalance tests,” after removing recent hybrids identified by ADMIXTURE. Throughout the section below, gene flow is supported by F₄ statistics between the two underlined taxa and/or the two non-underlined taxa in the four-taxa tree. Further, we interpret the F₄ statistics in relation to the ranking of the F₄ scores. Extreme F₄ scores may be the result of both pairs (e.g., A–C and B–D) exchanging genes, amplifying the F₄ score beyond what the score would be if gene flow occurred only between a single pair within the quartet.
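For readers unfamiliar with the statistic, F₄ can be sketched from per-SNP allele frequencies as below. This is a minimal illustration using the standard definition of Patterson et al. (2012), not the software used here; the function name is hypothetical, and significance testing (which requires a standard error, typically from a block jackknife) is not shown.

```python
def f4(freq_a, freq_b, freq_c, freq_d):
    """F4(A, B; C, D): mean over SNPs of (a - b) * (c - d).

    Each argument is a list of allele frequencies for one population,
    with SNPs in the same order in all four lists.
    """
    products = [(a - b) * (c - d)
                for a, b, c, d in zip(freq_a, freq_b, freq_c, freq_d)]
    if not products:
        raise ValueError("at least one SNP is required")
    return sum(products) / len(products)


# Toy frequencies (invented for illustration): A tracks C and B tracks D,
# so the allele frequency differences are positively correlated.
a = [0.9, 0.1]
b = [0.1, 0.9]
c = [0.8, 0.2]
d = [0.2, 0.8]
print(f4(a, b, c, d))  # positive: excess sharing for A-C and/or B-D
print(f4(a, b, d, c))  # negative: swapping C and D flips the sign
```

Under a tree in which (A, B) and (C, D) are sister pairs and no gene flow has occurred, the frequency differences a − b and c − d are uncorrelated and F₄ is expected to be zero. A positive value points to excess allele sharing between A and C and/or B and D, and a negative value to excess sharing between A and D and/or B and C.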

F₃ and F₄ tests convincingly show that Rascón, the old lineage surface population, has experienced gene flow with a lineage more closely related to A. aeneus than to the other individuals in our sample (Table 3). This claim is supported by numerous observations including the observations that, while all F₃ tests including Rascón are significantly negative (i.e., supporting admixture), the most extreme F₃ statistic is from a test with Rascón as the target population and A. aeneus and Tinaja as admixture sources, reflecting that both A. aeneus and Tinaja have likely hybridized extensively with Rascón. Additionally, all F₄ tests of the form (A. aeneus, new lineage; old lineage cave, Rascón) are significantly negative, meaning that Rascón is genetically closer to A. aeneus than expected given a simple tree (Supporting Information Figure S3). Therefore, A. aeneus is an imperfect outgroup to A. mexicanus. Our interpretation is that A. aeneus and Rascón hybridized too deep in the past to be detected by ADMIXTURE analysis, but sufficiently recently to be detected by F₃ and F₄ tests. This is further supported by an analysis included in the supplemental materials that did not find long blocks of sequence similarity to A. aeneus in the Rascón genome.

Table 3. F₃ statistics for significant configurations out of all possible three population configurations (X; A, B) where X is the population tested for admixture. A significantly negative F₃ statistic indicates admixture from populations related to A and B. Notably, we conducted the F₃ statistics without individuals showing recent evidence of admixture. Sample sizes were as follows: A. aeneus: N = 2; Río Choy: N = 8; Molino: N = 9; Rascón: N = 5; Pachón: N = 9; Tinaja: N = 8. Any z-score below −1.645 passes the critical value for significance at α = 0.05. Tests are ordered from most to least extreme z-score. No other configurations were significant for introgression

(X; A, B)	F₃-statistic	SE	z-score
Rascón; A. aeneus, Tinaja	−9.25e-04	5.63e-05	−16.42
Rascón; Río Choy, Tinaja	−7.34e-04	4.97e-05	−14.78
Rascón; Molino, Tinaja	−7.21e-04	5.17e-05	−13.94
Rascón; A. aeneus, Pachón	−5.52e-04	6.05e-05	−9.13
Rascón; A. aeneus, Molino	−3.73e-04	6.53e-05	−5.71
Rascón; Río Choy, Pachón	−2.04e-04	6.03e-05	−3.38
Rascón; Molino, Pachón	−1.46e-04	6.18e-05	−2.36
Río Choy; A. aeneus, Molino	−1.02e-04	5.51e-05	−1.84
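The significance rule in the caption of Table 3 can be illustrated with a short sketch. It assumes that each z-score is simply the F₃ statistic divided by its standard error, which reproduces the tabulated values to within rounding of the displayed inputs; the function names are hypothetical.

```python
CRITICAL_Z = -1.645  # one-sided critical value at alpha = 0.05 (table caption)


def z_score(statistic, standard_error):
    """z-score of a statistic given its standard error."""
    return statistic / standard_error


def is_significantly_negative(statistic, standard_error, critical=CRITICAL_Z):
    """True when the z-score falls below the one-sided critical value."""
    return z_score(statistic, standard_error) < critical


# First and last rows of Table 3: (configuration, F3, SE).
rows = [
    ("Rascón; A. aeneus, Tinaja", -9.25e-04, 5.63e-05),
    ("Río Choy; A. aeneus, Molino", -1.02e-04, 5.51e-05),
]
for label, f3, se in rows:
    z = z_score(f3, se)
    print(f"{label}: z = {z:.2f}, significant = {is_significantly_negative(f3, se)}")
```

Note that the Río Choy configuration clears the threshold only narrowly, whereas the Rascón configurations are far into the tail.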

Our results also show that the new lineage surface population, Río Choy, experienced gene flow with an unsampled outgroup. A significantly negative F₃ value demonstrates that Río Choy (the new lineage surface population) is an admixture target with Molino (new lineage cave) and A. aeneus as admixture sources (Table 3). This claim is bolstered by significantly positive F₄ values in all comparisons of the form (A. aeneus, old lineage; Río Choy, Molino) in Supporting Information Figure S3. However, because d_XY between A. aeneus and each of the two new lineage populations is nearly equivalent, we suggest this is an example of “the outgroup case” for F₃ (Patterson et al., 2012) in which admixture is attributed to an uninvolved outgroup (in this case, A. aeneus) rather than the unsampled admixture source.

We uncover evidence for admixture between the old lineage cave population, Pachón, and both new lineage populations (Molino cave and Río Choy surface), as observed in recent hybrids in ADMIXTURE. Specifically, the significantly negative F₄ values for (A. aeneus, new lineage; Pachón, Tinaja) (Supporting Information Figure S3) are consistent with gene flow between both new lineage populations and the old lineage Pachón cave, and/or gene flow between Tinaja and A. aeneus. Because no other evidence suggests gene flow between Tinaja and A. aeneus, we argue that this result reflects gene flow between Pachón and the new lineage populations. This claim is further supported by the observation that d_XY between new lineage populations and Pachón is consistently lower than d_XY between new lineage populations and the other old lineage populations (Table 2).

Additionally, our analyses are consistent with gene flow between the old lineage Tinaja cave and the old lineage surface population, Rascón. Again, this result complements the discovery of two recent hybrids between these populations in our ADMIXTURE analysis. This claim is supported by the significantly positive F₄ value for (A. aeneus, Rascón; Pachón, Tinaja). Since we lack evidence of Pachón – A. aeneus admixture, this significant F₄ statistic is likely driven by Rascón – Tinaja admixture.

Despite gene flow among most population pairs, there is one case in which tree imbalance tests failed to reject the null hypothesis of a bifurcating tree: F₄ tests of the form (new lineage, new lineage; old lineage, old lineage) (Supporting Information Figure S3). This result is counter to both our observation of recent hybrids (between the new and old lineages observed in ADMIXTURE) and the gene flow observed in other F₄ statistics, as well as in treemix (Figure 3, Supporting Information Figure S4), d_XY nearest neighbour proportions (Supporting Information Figures S5–S7) and phylonet (Supporting Information Figure S8). In these cases, it is also plausible that F₄ scores that do not differ from zero may reflect opposing admixture events that cancel out a genomewide signal, rather than an absence of introgression (see Reich et al., 2009).

3.3.3 TreeMix

treemix allows us to visualize population relationships as a graph depicting directional admixture given a number of migration events, and uses the covariance structure of allele frequencies among populations rather than allele frequency difference correlations (e.g., F₃ and F₄). Examining multiple runs with different numbers of migration events, we found that the treemix run with five migration events best explains the sample covariance (Supporting Information Figure S4).

treemix illustrates gene flow from the ancestral new lineage into both Rascón surface and Pachón cave (Figure 3; also suggested by F₃ and F₄ statistics), while providing evidence for gene flow from the lineage leading to A. aeneus samples to the Río Choy and Rascón lineages (also suggested by F₃ and F₄ statistics). Migration from Pachón cave into Tinaja cave is indicated (also supported by ∂a∂i modeling). However, the topology recovered by TreeMix places Pachón as the outgroup to (Rascón, Tinaja); thus, Pachón to Tinaja gene flow likely (partially) reflects this incorrect tree structure.

Care should be taken in interpreting the migration arrows drawn on the treemix result. First, we limited the number of migration edges to five, so further (possibly real) migration events are not represented. Second, any particular inferred admixture event is necessarily unidirectional and the source population is designated as the population with a migration weight ≤50%. Thus, while directionality is hard to pin down with these analyses, we can confidently report admixture among and between lineages and habitat types.

3.4 Divergence times are younger than expected without accounting for migration

The levels of diversity and divergence allow us to calculate a simple estimate of population split times. This approach results in remarkably recent divergence time estimates. For example, we estimate the oldest split between old and new lineages at approximately 156,597 generations (Table 2; using π_A. aeneus as a proxy for ancestral diversity, and representing the old-new lineage split by d_{XY Choy,Rascón}: ((4.68 − 3.58) × 10⁻³)/(7 × 10⁻⁹)). Assuming a generation interval of 1 year, the oldest estimate for the old and new split is more than an order of magnitude less than that estimated from cytochrome b, which places the divergence time between lineages between 5.7 and 7.5 mya (Ornelas-García et al., 2008). Similarly, we estimate that the split between new lineage cave (Molino) and new lineage surface (Río Choy) populations was approximately 113,913 generations ago, ((4.38 − 3.58) × 10⁻³)/(7 × 10⁻⁹), similar to a recent analysis (Fumey et al., 2018). However, other estimates using this method were unstable, suggesting that A. aeneus is not a good proxy for ancestral diversity in the old lineage and/or introgression obscures any realistic estimate of divergence time. Our results that suggest gene flow (even with the exclusion of recently admixed individuals) indicate that true divergence times exceed those estimated from comparisons of d_XY and π. We, therefore, pursue a model-based estimate of population divergence time while accounting for introgression.
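The arithmetic behind these simple split-time estimates, T = (d_XY − π_ancestral)/(2μ) with μ = 3.5 × 10⁻⁹ per bp per generation, can be sketched as follows. The function name is hypothetical, and the inputs are the rounded values shown in Table 2, so the results differ slightly from the reported 156,597 and 113,913 generations (presumably because the reported values were computed from unrounded estimates, which is an assumption).

```python
MU = 3.5e-9  # mutation rate per bp per generation (from the Methods)


def split_time(d_xy, pi_ancestral, mu=MU):
    """Simple split-time estimate in generations, ignoring gene flow.

    A negative result means divergence is lower than the assumed
    ancestral diversity; such cases are reported as N/A in Table 2.
    """
    return (d_xy - pi_ancestral) / (2 * mu)


PI_AENEUS = 3.58e-3  # proxy for ancestral diversity

old_new = split_time(4.68e-3, PI_AENEUS)      # Río Choy vs. Rascón
molino_choy = split_time(4.38e-3, PI_AENEUS)  # Molino vs. Río Choy
print(round(old_new), round(molino_choy))
```

Because gene flow lowers d_XY, the numerator shrinks under admixture, which is why this estimator tends to underestimate split times.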

3.5 Demographic modelling indicates cave populations are younger than expected when accounting for migration

Demographic modelling using ∂a∂i revealed extensive interdependence and contact for all populations studied. In all cases, models with multiple migration rates fit the population comparisons best, suggesting that some genomic regions are more recalcitrant to gene flow than others. For most pairwise population comparisons, the best-fitting models supported a period of isolation followed by a period of secondary contact (SC2M; Supporting Information Table S6). The only exceptions were the comparisons between Molino-Rascón and Pachón-Rascón that supported isolation with migration (IM2M) slightly more than SC2M (Supporting Information Table S6). The distributions of divergence times for IM2M were multimodal (Molino-Rascón) or nearly flat (Pachón-Rascón) (Supporting Information Figure S9); thus, we present results from SC2M models that exhibit much tighter distributions for divergence estimates (Table 4; Supporting Information Figure S9).

Table 4. ∂a∂i demographic modelling of divergence times between pairwise populations for 50 replicates per model. For population pairs that favoured the IM2M model (i.e., isolation with migration and heterogeneity of migration across the genome), we also report results of SC2M (i.e., a period of isolation followed by secondary contact and heterogeneity of migration across the genome), which was the favoured model for most pairwise population comparisons. Ts is the number of generations from population split to the start of secondary contact. Tsc is the number of generations from the start of secondary contact to the present. Ts + Tsc is the total divergence time in generations. Generation interval is assumed to be 1 year

Pop1	Pop2	Model	MedianTs	MinTs	MaxTs	MedianTsc	MinTsc	MaxTsc	MedianTs+Tsc
Choy	Molino	SC2M	91,270	0	1,031,510	71,268	0	2,061,383	162,537
Choy	Pachón	SC2M	174,483	0	5,894,387	6,690	0	1,944,141	181,173
Choy	Rascón	SC2M	241,747	7,018	263,584	14,928	0	274,878	256,675
Choy	Tinaja	SC2M	182,107	0	497,367	24,859	0	519,170	206,966
Molino	Pachón	SC2M	195,724	60	641,127	30,431	0	603,939	226,155
Molino	Rascón	IM2M	1,242,575	228,667	1,799,441	NA	NA	NA	NA
Molino	Rascón	SC2M	210,396	0	1,593,512	27,023	0	1,044,418	237,418
Molino	Tinaja	SC2M	119,177	0	563,918	34,819	0	290,502	153,996
Pachón	Rascón	IM2M	1,239,971	140,387	11,358,780	NA	NA	NA	NA
Pachón	Rascón	SC2M	148,702	27,575	655,675	12,560	0	644,435	161,262
Pachón	Tinaja	SC2M	102,245	0	261,637	13,277	4,358	137,507	115,522
Rascón	Tinaja	SC2M	170,279	0	405,939	20,370	0	399,350	190,650

Effective population sizes for the surface populations were an order of magnitude higher than for the cave populations, and Molino exhibited the lowest effective population size of all five populations (Supporting Information Table S8). Notably, even for cave populations, estimates of effective population size were an order of magnitude larger than previous estimates (Avise & Selander, 1972; Bradic et al., 2012) and on par with mark-recapture census estimates (~8,500 fish in Pachón cave with a wide 95% confidence interval of 1,279–18,283; Mitchell et al., 1977).

∂a∂i demographic modelling estimated deeper split times for populations than the divergence estimates calculated as T = (d_XY − π_ancestral)/(2μ) (above) because the models account for introgression, which can artificially deflate the simpler divergence estimates. The oldest divergence time between old and new lineages was estimated at 256,675 generations before present (Río Choy-Rascón; Table 4, Supporting Information Tables S9 and S10; Figure S9; using SC2M models for all comparisons). Importantly, both the model-independent method and the demographic estimates presented here reveal relatively similar divergence times between the old and new lineages (~157 vs. ~257 k generations ago), and these estimates differ substantially from the several million year divergence time obtained through mtDNA (Ornelas-García et al., 2008), unless we have dramatically underestimated the generation interval in the wild.

Cave–surface splits for both the new and old lineage comparisons were remarkably similar (Pachón – Rascón: 161,262 generations; Tinaja – Rascón: 190,650 generations; Molino – Río Choy: 162,537 generations). The two old lineage cave populations (Pachón – Tinaja) are estimated to have split slightly more recently (115,522 generations before present), suggesting that colonization of one of the two old lineage caves was potentially from subterranean gene flow (as suggested by Espinasa & Espinasa, 2015). Distributions and estimates across 50 replicates are given in Supporting Information Figure S9, Tables S9 and S10.

Subterranean gene flow may be higher than surface to cave or between-lineage surface gene flow. The highest rates of gene flow are estimated between caves, and surprisingly, Molino cavefish exhibited a migration rate into Tinaja cave that is higher than Tinaja gene flow into Pachón cave. Surface fish gene flow rates into cave populations are also among the highest rates (especially Río Choy into all sampled caves). Notably, several caves also have relatively high rates of gene flow into surface populations (Molino into Río Choy and Tinaja into Rascón).

Heterogeneity in gene flow across the genome is also different across population pairs. Generally, more of the genome of cave–surface pairs (48%) seems to follow the lower migration rate than in cave–cave pairs (40%) and surface–surface pair (18%), suggesting the possibility that strong selection for habitat-specific phenotypes slows gene flow across the genome.

3.6 Modelling of selection needed for cave alleles to reach high frequency

Similar to the estimates of Cartwright et al. (2017), selection coefficients of ~0.01 are needed for cave phenotype alleles to reach high frequencies in either Molino or Tinaja cave populations with a 12-locus additive model (Figure 4). Animations of allele frequencies at different levels of selection across the historical demography are provided in Supplementary materials. There is an increase in noise at the secondary contact period because of the influx of new alleles. This may be because, prior to secondary contact, the loci are typically fixed for either the cave or surface allele, whereas after contact the loci may become more polymorphic. The influx has little effect overall because the migration rate is of the same magnitude as the mutation rate. The average selection coefficients appear to approximately match the pre-secondary contact average.

In sum, our data support that cavefish population sizes are sufficiently large for selection to play a role shaping traits, gene flow is sufficiently common to impact repeated evolution, and cave–surface divergence is more recent than expected from mitochondrial gene trees.

3.7 Candidate gene regions introgressed between caves

In light of the suspected gene flow among populations, we identified genomic regions that showed molecular evolution signatures consistent with gene flow between caves and genomic regions that are likely affected by cave–surface gene flow. There were 997 5 kb regions (out of 208,354 total) that exhibited the lowest 5% of d_XY windows of the genome for each of the cave–cave comparisons (Pachón-Molino, Pachón-Tinaja, Molino-Tinaja). These windows were spread across 471 scaffolds and overlapped 500 genes. We highlight some of these 500 genes in the context of Ensembl phenotype data in Supplementary text (see also Supporting Information Table S11).

Several of these regions cluster with known QTLs (O'Quin & McGaugh, 2015). We examined co-occurring QTLs for many traits on Linkage Group 2 and Linkage Group 17 (LG2, LG17, Supporting Information Table S12), which were two main regions highlighted in recent work on cave phenotypes (Yoshizawa, Yamamoto et al., 2012). We found that the co-occurring QTL on Linkage Group 2 harboured about 1.5-fold (odds ratio = 1.54, 95% CI = 1.139–2.089) the number of regions with low divergence among all three cave populations (44 windows with all three caves in the lowest 5% of d_XY out of 6,070 total windows that had data across these same scaffolds) relative to the total across the entire genome (997 out of 208,354). This suggests that this region with many co-occurring QTL was potentially transferred among caves. Notably, we did not find a similar pattern for Linkage Group 17 (proportion of windows across all three caves in the 5% lowest d_XY = 0.00482 for LG17 and 0.00479 across the entire genome). The region on Linkage Group 2 is not simply an area of high sequence conservation (which could account for low divergence among all three cave populations) because the mean divergence between surface-surface comparisons for this region is very similar to the genomewide average (genomewide d_XY = 0.00481, LG2 QTL region d_XY = 0.00482).
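The text does not state how the odds ratio and its interval were computed. One reading that reproduces the reported values (odds ratio = 1.54, 95% CI = 1.139–2.089) is a 2 × 2 comparison of the LG2 QTL windows against the remaining windows of the genome, with a Wald interval on the log odds ratio. Both choices are assumptions, and the function name below is hypothetical.

```python
import math


def odds_ratio_wald_ci(hits_region, total_region, hits_genome, total_genome,
                       z=1.96):
    """Odds ratio for a region versus the rest of the genome.

    hits_*: number of windows in the lowest 5% of d_XY for all cave pairs.
    total_*: number of windows with data.
    Returns (odds_ratio, ci_lower, ci_upper) using a Wald interval on
    the log odds ratio.
    """
    a = hits_region                        # low windows inside the region
    b = total_region - hits_region         # other windows inside the region
    c = hits_genome - hits_region          # low windows outside the region
    d = (total_genome - total_region) - c  # other windows outside the region
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    return (odds_ratio,
            math.exp(log_or - z * se_log),
            math.exp(log_or + z * se_log))


# Window counts reported for the LG2 QTL region and the whole genome.
or_lg2, ci_low, ci_high = odds_ratio_wald_ci(44, 6070, 997, 208354)
print(f"OR = {or_lg2:.2f}, 95% CI = {ci_low:.3f}-{ci_high:.3f}")
```

Because the interval excludes 1, the enrichment of low-divergence windows in the LG2 QTL region is unlikely to be explained by sampling variation in window counts alone.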

We find genes that appear to be affected by gene flow between caves. We found 1,004 genes that exhibited d_XY = 0 and another 370 genes with d_XY in the lowest 5% across the genome for the three pairwise comparisons of the caves, many with substantial divergence from at least one surface population (Supporting Information Table S13). We associated these genes with previously known QTL, with linkage groups following O'Quin et al. (2013); the QTL database is provided in Supporting Information Table S7. One example is fam136a (family with sequence similarity 136, member A), which is under a QTL for body condition on Linkage Group 10. In the rat, this gene is expressed in the hair cells of the crista ampullaris, an organ in the semicircular canals of the inner ear that is important for detecting rotation and acceleration (Requena et al., 2014). All caves are identical except for a SNP that may have come from the Rascón population in one of the admixed Tinaja individuals. fam136a is in the top 10% most divergent genes by d_XY for Rascón surface – Pachón cave and Rascón surface – Tinaja cave populations (Molino – Rascón surface comparisons fall in the top 13% most divergent genes via d_XY). Notably, Rascón and A. aeneus exhibit an E (glutamic acid) at amino acid position 15 (suggesting this is the ancestral state), whereas Río Choy and all caves exhibit a G (glycine). This substitution replaces a polar residue with a hydrophobic one and is thus not functionally conservative. Such a pattern could be produced by gene flow among caves or by gene flow of Río Choy with the cave populations.

4 DISCUSSION

The role of admixture between diverging populations is increasingly apparent as the application of genomic analyses has become standard, and such gene flow shapes the study of repeated evolution (Colosimo et al., 2005; Cresko et al., 2004; Roesti et al., 2014; Van Belleghem et al., 2018; Welch & Jiggins, 2014). Further, gene flow may create a signature that resembles parallel genetic divergence as effective migration is reduced in the same genomic regions for multiple ecotypic pairs upon secondary contact (Bierne, Gagnaire, & David, 2013; Rougemont et al., 2017). Thus, to understand the repeated origins of traits, an accurate understanding of the demography is required (Dasmahapatra et al., 2012; Rosenblum et al., 2014).

Astyanax mexicanus have been increasingly studied as a model of repeated evolution of wide-ranging traits (Elmer & Meyer, 2011; Krishnan & Rohner, 2017). Our data provide insight into some of the most pressing demographic questions needed to understand repeated evolution: (a) whether there are independent origins of Astyanax cavefish, (b) the age of cave invasions, (c) the amount of gene flow between populations, and (d) the strength of selection needed to shape traits. Genomic resequencing allowed an understanding not afforded by mitochondrial or reduced representation nuclear sequencing, and we were able to identify genes and a region of the genome potentially affected by gene flow between populations. Together, these findings provide a framework for understanding the repeated evolution of many complex, cave-derived traits.

4.1 Cave populations are younger than previously estimated

Both our d_XY-based divergence estimates and our demographic modelling fit well with previous suggestions that the caves were colonized in the late Pleistocene (Avise & Selander, 1972; Fumey et al., 2018; Porter et al., 2007; Strecker et al., 2004). As our work documents extensive gene flow, we favour the divergence time estimates provided by demographic models that incorporated estimates of migration. Demographic models estimate split time between the two lineages at approximately 257 k generations ago, and estimates of cave–surface population splits are ~161–191 k generations ago. Notably, Astyanax is a species of fish with a limited distribution in northern latitudes, with its current most northern locality in the Edwards Plateau in Texas (Page & Burr, 2011). These split times are consistent with cooler temperatures in Northern Mexico associated with glaciation playing some role in the colonization of the caves by Astyanax mexicanus, potentially as thermally stable refugia (Cussac, Fernández, Gómez, & López, 2009). Our nuclear genomic data suggest that the “old” and “new” lineage split is more recent than the main volcanic activity of the Trans-Mexican Volcanic Belt (3–12 Mya) which was thought to have separated these two lineages, as well as other lineages in other vertebrates (Ornelas-García et al., 2008). This geographic barrier likely has been breached by Astyanax multiple times and likely led to cave invasions with each migration (Gross, 2012; Hausdorf et al., 2011; Strecker et al., 2012).

4.2 Historic and contemporary gene flow between most populations

Several cave populations exhibit intermediate phenotypes and experience flooding during the rainy season (Espinasa, unpublished, Strecker et al., 2012), suggesting that intermediate phenotypes are the result of admixture between cavefish and surface fish swept into caves during flooding (Avise & Selander, 1972; Bradic et al., 2012). However, it has been suggested that surface fish and hybrids are too maladapted to survive and spawn in caves (Coghill et al., 2014; Hausdorf et al., 2011; Strecker et al., 2012), and that cavefish populations with intermediate troglomorphic phenotypes represent more recent cave invasions, rather than hybrids (Hausdorf et al., 2011; Strecker et al., 2012). Despite evidence from microsatellites (Bradic et al., 2012; Panaram & Borowsky, 2005), mitochondrial capture (Dowling et al., 2002; Ornelas-García & Pedraza-Lara, 2015; Yoshizawa, Ashida, & Jeffery, 2012), and haplotype sharing of candidate loci (but see Espinasa, Centone, & Gross, 2014; Gross & Wilkens, 2013), there was still uncertainty in the literature regarding the frequency of gene flow between Astyanax cavefish and surface fish and the role gene flow may play in adaptation to the cave environment (Coghill et al., 2014; Espinasa & Borowsky, 2001; Hausdorf et al., 2011; Strecker et al., 2012).

Our work demonstrates recent and historical gene flow between cave and surface populations both within and between lineages (Figures 2 and 3). This result is also suggested by past work, which showed that most genetic variance is within individuals (Table 2 in Bradic et al., 2012) and that very few private alleles (i.e., alleles specific to a population) were present among cave populations (Figure 3b in Bradic et al., 2012). Notably, many of the methods we employed take into account incomplete lineage sorting, are robust to nonequilibrium demographic scenarios, and detect recent admixture (<500 generations ago) (Patterson et al., 2012). In all our analyses, the hybridization detected may not have occurred directly between the sampled populations, but rather between unsampled populations or lineages related to them (Patterson et al., 2012).

One of the most intensely studied cave populations is Pachón. Our evidence that Pachón cavefish have hybridized with the new lineage is consistent with past field observations and molecular data. Only extremely troglomorphic fish were found in early (1940s–1970s; Avise & Selander, 1972; Mitchell et al., 1977) and late (1996–2000; Dowling et al., 2002) surveys, although phenotypically intermediate fish were observed in 1986–1988, as well as in 2008 (Borowsky, unpublished). Thus, subterranean introgression from a nearby cave population may cause transient complementation of phenotypes, or hybridization with surface fish (from flooding or human introduction) may contribute to the presence of intermediate-phenotype fish in the cave (Langecker, Wilkens, & Junge, 1991; Wilkens & Strecker, 2017). Indeed, past studies suggested gene flow between Pachón and the new lineage populations, as Pachón mtDNA clusters with new lineage populations (Dowling et al., 2002; Ornelas-García & Pedraza-Lara, 2015; Ornelas-García et al., 2008; Strecker et al., 2003, 2012). Interestingly, and in contrast to past studies, Pachón mitochondria group with old lineage populations in the most recent analysis (Coghill et al., 2014). Continued intense scrutiny of other caves may reveal similar fluctuations in phenotypes and genotypes.

The most surprising signal of gene flow in our data is seen in the ∂a∂i analyses, in which the signal of Molino–Tinaja exchange is similar to that of Pachón–Tinaja, suggesting subterranean gene flow between all three caves, despite substantial geographic distances (the entrances are separated by >100 km; Mitchell et al., 1977). While our genomewide ancestry proportion data suggest some exchange of Molino with Tinaja (Supporting Information Figures S5–S7), we have no other evidence to corroborate this signal. Thus, the demographic analyses could be picking up a signal of Molino–Tinaja exchange that is actually driven by hybridization between Pachón and the new lineage followed by Tinaja–Pachón or Tinaja–Río Choy hybridization. Also, surprisingly, ∂a∂i analyses suggest that some of the highest migration occurs from Tinaja cave into Rascón surface and from Molino cave into Río Choy surface.

Notably, many species of troglobites other than Astyanax inhabit caves spanning from southern El Abra to Sierra de Guatemala, and some were able to migrate among these areas within the last 12,000 years (Espinasa, Bartolo, & Newkirk, 2014). Due to the extensive connectivity and the span of geological changes throughout the hydrogeological history, it is likely that there have been ample opportunities for troglobites, including Astyanax, to migrate across much of the El Abra region (Espinasa & Espinasa, 2015).

Across many systems, genomic data have revealed that reticulate evolution is much more common than previously thought (Abbott, Barton, & Good, 2016; Arnold & Kunte, 2017; Brandvain et al., 2014; Dasmahapatra et al., 2012; Geneva, Muirhead, Kingan, & Garrigan, 2015; Malinsky et al., 2017; McGaugh & Noor, 2012; Rougemont et al., 2017), and the ability of populations to maintain phenotypic differences despite secondary contact and hybridization is increasingly appreciated (Arnold & Kunte, 2017; Fitzpatrick, Gerberich, Kronenberger, Angeloni, & Funk, 2015; Malinsky et al., 2017; Payseur & Rieseberg, 2016). Thus, despite gene flow from the surface into caves, it is entirely possible that cave-like phenotypes can be maintained. Indeed, gene flow may help to buffer cave populations against the effects of inbreeding (Åkesson et al., 2016; Ellstrand & Rieseberg, 2016; Fitzpatrick et al., 2016; Frankham, 2015; Kronenberger et al., 2017; Whiteley, Fitzpatrick, Funk, & Tallmon, 2015) or catalyze adaptation to the cave environment (sensu Clarkson et al., 2014; Meier et al., 2017; Richards & Martin, 2017). Though the rates of gene flow between Astyanax populations are considerably lower than those documented in other contemporary diverging fish lineages (coastal and marine anchovies, Le Moan, Gagnaire, & Bonhomme, 2016; parasitic river and nonparasitic brook lampreys, Rougemont et al., 2017; benthic and dwarf limnetic lake whitefish, Rougeux et al., 2017; Atlantic and Mediterranean sea bass, Tine et al., 2014), they are similar to the historic gene flow from polar bears into brown bears (Liu et al., 2014). Our data indicate hybridization between sampled lineages and suggest that cavefish are poised to be a strong contributor to understanding the role gene flow may play in repeated evolutionary adaptation.

4.3 Candidate genes shaped by gene flow

In some cases, we have evidence that particular alleles were transferred between caves but are highly diverged from surface populations (Supporting Information Tables S11 and S13). This suggests that gene flow between caves may hasten adaptation to the cave environment and that repeated evolution in this system may, in part, rely on standing genetic variation. Notably, scaffolds under the co-localizing QTL on Linkage Group 2 for many traits exhibit 1.5-fold enrichment for 5 kb regions with the lowest divergence across caves relative to the genomewide level, suggesting that parts of this QTL have spread throughout caves.

For genes with high similarity among caves and substantial divergence between cave and surface populations, many appear to be involved in classic cavefish traits such as pigmentation and eye development/morphology (mdkb, atp6v0ca, ube2d2l, usp3 and cln8) and circadian functioning (usp2a) (Supporting Information Table S13). Additional common annotations suggest future areas of trait investigation, namely, cardiac-related phenotypes (mmd2, cyp26a1, tbx3a, tnfsf10, alx1 and ptgr1) as well as inner ear phenotypes (ncs1a, dlx6a, fam136a) (Supporting Information Table S13). Importantly, the sensory hair cells of the inner ear are homologous with sensory hair cells of the neuromasts of the lateral line (Fay & Popper, 2000); therefore, these genes may affect mechanoreception or sound reception.

4.4 Effective population sizes are much larger than expected and weak selection can drive cave phenotypes

Previous estimates of very small effective population sizes in cave populations of A. mexicanus suggested that drift and relaxed selection shaped cave-derived phenotypes (e.g., Lahti et al., 2009). The estimates of nucleotide diversity and N_e provided here (Tables 1 and 2 and Supporting Information Table S8) indicate that positive selection coefficients need not be extreme to drive cave-derived traits (Akashi, Osada, & Ohta, 2012; Charlesworth, 2009). To put these values in perspective, the cave populations have an average genetic diversity (which is proportional to N_e) somewhat similar to that of humans, and the surface populations exhibit a genetic diversity similar to that of zebrafish (Leffler et al., 2012).

Our observations support theoretical and empirical results that selection likely shaped cavefish phenotypes (Borowsky, 2015; Cartwright et al., 2017; Moran, Softley, & Warrant, 2014). Simulations using demographic parameters from Molino and Tinaja cave populations suggest that selection coefficients across 12 additive loci (patterned after the number of eye-related QTL; O'Quin & McGaugh, 2015) need to be above 0.01 to bring cave alleles to very high frequencies (Figure 4). Such selection coefficients are often found driving selective sweeps in natural systems (Nair et al., 2003; Rieseberg & Burke, 2001; Schlenke & Begun, 2004; Wootton et al., 2002).
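The interplay between selection in the cave and gene flow from the surface can be illustrated with a deliberately simple toy model. This is not the 12-locus simulation summarized above. It is a deterministic, single-locus, haploid recursion in which surface migrants are assumed to carry no cave alleles, and drift, dominance and linkage are ignored. The function names and all parameter values are hypothetical.

```python
def cave_allele_trajectory(p0, s, m, generations):
    """Deterministic frequency of a cave-favoured allele at one haploid locus.

    Each generation: selection with advantage s inside the cave, then a
    fraction m of the cave population is replaced by surface migrants that
    are assumed to lack the cave allele. Hypothetical toy model.
    """
    p = p0
    trajectory = [p]
    for _ in range(generations):
        p = p * (1.0 + s) / (1.0 + p * s)  # selection within the cave
        p = (1.0 - m) * p                  # immigration from the surface
        trajectory.append(p)
    return trajectory


def equilibrium_frequency(s, m):
    """Analytical equilibrium of the recursion above (0 if the allele is swamped)."""
    return max(0.0, 1.0 - m * (1.0 + s) / s)


# Illustrative parameters only (not estimates from this study).
strong = cave_allele_trajectory(p0=0.01, s=0.05, m=0.001, generations=5000)
weak = cave_allele_trajectory(p0=0.5, s=0.0005, m=0.005, generations=5000)
print(f"s=0.05, m=0.001: final frequency {strong[-1]:.3f}")
print(f"s=0.0005, m=0.005: final frequency {weak[-1]:.3g}")
```

In this caricature the cave allele is retained at high frequency only when s is comfortably larger than m. That is one simple way to see why the selection coefficient needed to maintain cave alleles depends on the estimated migration rates.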

4.5 Diverging lineages should be examined with reticulation methods and absolute divergence metrics

Our genomewide data allowed a unique perspective that was not available in past studies. First, our data represent an empirical demonstration of how well-supported phylogenetic reconstructions can be misleading. When using genome-scale data with maximum likelihood, bootstrap values are not a measure of the number of sites that support a particular phylogeny, although this is often how they are interpreted (Yang & Rannala, 2012). Rather, if a genome-scale data set is slightly more supportive of a particular topology, maximum likelihood will find that topology consistently and exhibit high levels of bootstrap support (Yang & Rannala, 2012). With the cavefish, a previous RADseq study suggested relatively strongly supported branches (Coghill et al., 2014); however, a much more complex evolutionary history with reticulation among lineages was revealed here. Examination of recently diverged taxa with reticulation methods (e.g., Phylonet, F₄, F₃, TreeMix) may ensure a more comprehensive view of their evolutionary history than the successive bifurcations represented in a tree.
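As a rough illustration of what the F₃ and F₄ statistics measure, the sketch below computes their basic point estimates from per-SNP allele frequencies. It is a simplified sketch under stated assumptions. It omits the finite-sample bias correction and the block-jackknife standard errors that real analyses rely on. The function names and toy frequencies are hypothetical.

```python
def f4(freq_a, freq_b, freq_c, freq_d):
    """Mean of (a - b) * (c - d) across SNPs.

    Expected to be ~0 when ((A, B), (C, D)) is a tree without gene flow
    between the two pairs.
    """
    products = [(a - b) * (c - d)
                for a, b, c, d in zip(freq_a, freq_b, freq_c, freq_d)]
    return sum(products) / len(products)


def f3(freq_target, freq_a, freq_b):
    """Mean of (c - a) * (c - b) across SNPs.

    Clearly negative values suggest the target population is admixed
    between lineages related to A and B.
    """
    products = [(c - a) * (c - b)
                for c, a, b in zip(freq_target, freq_a, freq_b)]
    return sum(products) / len(products)


# Toy allele frequencies at four SNPs (made up for illustration).
pop_a = [0.0, 1.0, 0.2, 0.8]
pop_b = [1.0, 0.0, 0.8, 0.2]
mixed = [0.5, 0.5, 0.5, 0.5]  # intermediate between pop_a and pop_b
print(f"f3(mixed; A, B) = {f3(mixed, pop_a, pop_b):.2f}")
```

A bifurcating tree cannot represent the negative F₃ produced by the intermediate population in this toy example. This is why such statistics complement tree-based reconstructions.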

Second, our data set is yet another empirical demonstration of poor performance of pairwise F_ST in estimating population relationships when diversity is highly heterogeneous among populations (Figure 5, Table 2, Supporting Information Table S14; Fumey et al., 2018; Jakobsson, Edge, & Rosenberg, 2013). Highly heterogeneous diversity can give the false impression that low-diversity populations are highly divergent, when in reality, the lower diversity drives greater pairwise F_ST, not higher absolute divergence (Charlesworth, 1998). These limitations of F_ST are especially important to appreciate in systems like the cavefish where diversity is highly heterogeneous across populations. Further, it is often suggested that high F_ST translates to low gene flow, but violations of the assumptions are common (Cruickshank & Hahn, 2014). In the case of Molino (a very low diversity population), high F_ST values were taken to indicate relative isolation from other populations with little gene flow (Bradic et al., 2012), and we show here that this is not the case. Indeed, pairwise F_ST among surface fish populations is the lowest and pairwise F_ST among caves is the highest, yet absolute divergence between surface populations is slightly higher than absolute divergence among caves (Figure 5, Table 2 and Supporting Information Table S14). Future molecular ecology work should assay diversity and interpret pairwise F_ST accordingly or use absolute measures of divergence in conjunction with pairwise F_ST (Charlesworth, 1998; Cruickshank & Hahn, 2014; Noor & Bennett, 2009; Ritz & Noor, 2016).
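The dependence of F_ST on within-population diversity can be made concrete with a small numerical sketch. It assumes one common definition of relative divergence, F_ST = 1 − π_within/d_XY, which is not necessarily the estimator used for the values reported here. The function name and all input values are hypothetical.

```python
def fst_from_diversity(pi_1, pi_2, d_xy):
    """Relative divergence: 1 - (mean within-population diversity) / d_XY."""
    if d_xy <= 0:
        raise ValueError("d_xy must be positive")
    pi_within = (pi_1 + pi_2) / 2.0
    return 1.0 - pi_within / d_xy


# Illustrative values only: both pairs share the same absolute divergence.
d_xy = 0.002
high_diversity_pair = fst_from_diversity(pi_1=0.0018, pi_2=0.0018, d_xy=d_xy)
low_diversity_pair = fst_from_diversity(pi_1=0.0006, pi_2=0.0006, d_xy=d_xy)
print(f"high-diversity pair: F_ST = {high_diversity_pair:.2f}")
print(f"low-diversity pair:  F_ST = {low_diversity_pair:.2f}")
```

With identical d_XY, the low-diversity pair has an F_ST of about 0.7 and the high-diversity pair about 0.1. Reduced diversity alone is therefore enough to make populations look strongly differentiated.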

5 CONCLUSION

In conclusion, our results suggest that investigations of repeated evolution of cave-derived traits should take into account hybridization between lineages, between cave and surface fish, and between caves. Due to gene flow, best practices to assign the putative ancestral character state of cave-derived traits include comparing cavefish phenotypes to multiple surface populations and a distantly related outgroup with no evidence of admixture (e.g., Astyanax bimaculatus, Ornelas-García et al., 2008). Complementation crosses and molecular studies remain essential for understanding evolutionary origins for each cave-derived trait (Borowsky, 2008b; Gross, Borowsky, & Tabin, 2009; O'Quin et al., 2015; Protas et al., 2006; Wilkens, 1971; Wilkens & Strecker, 2003).

We expect future work with greatly expanded sampling (Beerli, 2004; Hellenthal et al., 2014; Pease & Hahn, 2015; Slatkin, 2005) and ecological parameterization of the caves to provide a more comprehensive view of the ultimate drivers of demography of Astyanax mexicanus cavefish. Interestingly, many caves are nutrient-limited, which is suggested to be one of the largest impediments to surface fish survival in the cave environment (Espinasa, Bibliowicz, Jeffery, & Rétaux, 2014; Moran et al., 2014). One of the emerging hypotheses is that high food availability allows surface-like fish to persist longer in the cave environment and enhances the probability of cave–surface hybridization (Mitchell et al., 1977; Strecker et al., 2012). Future studies evaluating hybridization levels in relation to food availability or food predictability are an important next step to examine drivers of cave phenotypes. We look forward to these future studies, and their potential to elucidate how demography and environmental factors impact repeated adaptive evolution (sensu Rosenblum et al., 2014).

ACKNOWLEDGEMENTS

Fish were collected under CONAPESCA permit PPF/DGOPA - 106 / 2013 to Claudia Patricia Ornelas García and SEMARNAT permit 02241 to Ernesto Maldonado. We thank the Mexican government for providing the collecting permit to R.B. in 2008 (DGOPA.00570.288108-0291). For 2002, the collection permit to R.B. was from fisheries department #01.01.02.613.03.1799. Molino samples were obtained under Mexican permit 040396-213-05. Animal care protocol numbers include #05-1235 by the New York University Animal Welfare Committee (UAWC) to R.B., UMD R-17-77 to WRJ, and UNAM animal care protocol to POG NOM-062-ZOO-1999. This work was supported by NIH grant 2R24OD011198-04A1 to WCW, The Genome Institute at Washington University School of Medicine, Cave Research Foundation Graduate Student Research Grant to BMC, a grant from the Eppley Foundation for Research to SEM, 5R01EY014619-08 to WRJ, and 1R01GM127872-01 to SEM and ACK. Raw sequence data were submitted to the SRA. Project Accession Number: SRP046999, Bioproject: PRJNA260715. We appreciate the resources provided by the Minnesota Supercomputing Institute, without which this work would not be possible.

DATA ACCESSIBILITY

All reads are available in NCBI short read archive under accession numbers given in Supporting Information Table S3. Supporting Information Methods, figures and tables are provided online. Within the supplementary tables are five separate spreadsheet tabs that detail genomic regions and genes that may be spread between caves, complete results from all demography replicates, and enrichment analyses of two QTL regions. Two simulations of selection, given our demography estimates, are included in online supplementary material. Scripts to perform demography analysis are available in a GitHub repository at https://github.com/TomJKono/CaveFish_Demography.

AUTHOR CONTRIBUTION

A.H. conducted portions of the analysis and helped write the paper. Y.B. conducted some analyses and edited the paper. J.W. processed DNA sequences. A.C.K. helped write the paper. T.J.Y.K. conducted the ∂a∂i analysis and helped write the paper. W.R.J., K.O., C.P.O-G., M.Y., B.C., E.M., J.B.G., R.H., H.B., R.B. and L.E. helped with sampling and writing the paper. W.C.W. and S.E.M. planned the project. R.A.C. conducted demographic simulations and helped write the paper. S.E.M. carried out portions of the analyses and wrote the paper.

REFERENCES

Abbott, R. J., Barton, N. H., & Good, J. M. (2016). Genomics of hybridization and its evolutionary consequences. Molecular Ecology, 25, 2325–2332. https://doi.org/10.1111/mec.13685
Agrawal, A. A. (2017). Toward a predictive framework for convergent evolution: Integrating natural history, genetic mechanisms, and consequences for the diversity of life. The American Naturalist, 190, S1–S12. https://doi.org/10.1086/692111
Akashi, H., Osada, N., & Ohta, T. (2012). Weak selection and protein evolution. Genetics, 192, 15–31. https://doi.org/10.1534/genetics.112.140178
Åkesson, M., Liberg, O., Sand, H., Wabakken, P., Bensch, S., & Flagstad, Ø. (2016). Genetic rescue in a severely inbred wolf population. Molecular Ecology, 25, 4745–4756. https://doi.org/10.1111/mec.13797
Alexander, D. H., & Lange, K. (2011). Enhancements to the ADMIXTURE algorithm for individual ancestry estimation. BMC Bioinformatics, 12, 246. https://doi.org/10.1186/1471-2105-12-246
Alexander, D. H., Novembre, J., & Lange, K. (2009). Fast model-based estimation of ancestry in unrelated individuals. Genome Research, 19, 1655–1664. https://doi.org/10.1101/gr.094052.109
Arnold, M. L., & Kunte, K. (2017). Adaptive genetic exchange: A tangled history of admixture and evolutionary innovation. Trends in Ecology & Evolution, 32, 601–611. https://doi.org/10.1016/j.tree.2017.05.007
Aspiras, A. C., Rohner, N., Martineau, B., Borowsky, R. L., & Tabin, C. J. (2015). Melanocortin 4 receptor mutations contribute to the adaptation of cavefish to nutrient-poor conditions. Proceedings of the National Academy of Sciences of the United States of America, 112, 9668–9673. https://doi.org/10.1073/pnas.1510802112
Auwera, G. A., Carneiro, M. O., Hartl, C., Poplin, R., del Angel, G., Levy-Moonshine, A., … Thibault, J. (2013). From FastQ data to high-confidence variant calls: The genome analysis toolkit best practices pipeline. Current Protocols in Bioinformatics, 43, 1–33. https://doi.org/10.1002/0471250953.bi1110s43
Avise, J. C., & Selander, R. K. (1972). Evolutionary genetics of cave-dwelling fishes of the genus Astyanax. Evolution, 26, 1–19. https://doi.org/10.1111/j.1558-5646.1972.tb00170.x
Beerli, P. (2004). Effect of unsampled populations on the estimation of population sizes and migration rates between sampled populations. Molecular Ecology, 13, 827–836. https://doi.org/10.1111/j.1365-294X.2004.02101.x
Bibliowicz, J., Alié, A., Espinasa, L., Yoshizawa, M., Blin, M., Hinaux, H., … Rétaux, S. (2013). Differences in chemosensory response between eyed and eyeless Astyanax mexicanus of the Rio Subterráneo cave. EvoDevo, 4, 25. https://doi.org/10.1186/2041-9139-4-25
Bierne, N., Gagnaire, P.-A., & David, P. (2013). The geography of introgression in a patchy environment and the thorn in the side of ecological speciation. Current Zoology, 59, 72–86. https://doi.org/10.1093/czoolo/59.1.72
Borowsky, R. (2008a). Astyanax mexicanus, the blind Mexican cave fish: A model for studies in development and morphology. Cold Spring Harbor Protocols, 2008, R23–R24. https://doi.org/10.1101/pdb.emo107
Borowsky, R. (2008b). Restoring sight in blind cavefish. Current Biology, 18, R23–R24. https://doi.org/10.1016/j.cub.2007.11.023
Borowsky, R. (2015). Regressive evolution: Testing hypotheses of selection and drift. In A. Keene, M. Yoshizawa & S. E. McGaugh (Eds.), Biology and evolution of the Mexican cavefish (p. 93). San Diego, CA: Elsevier.
Bradic, M., Beerli, P., León, F. G., Esquivel-Bobadilla, S., & Borowsky, R. (2012). Gene flow and population structure in the Mexican blind cavefish complex (Astyanax mexicanus). BMC Evolutionary Biology, 12, 9. https://doi.org/10.1186/1471-2148-12-9
Bradic, M., Teotónio, H., & Borowsky, R. L. (2013). The population genomics of repeated evolution in the blind cavefish Astyanax mexicanus. Molecular Biology and Evolution, 30, 2383–2400. https://doi.org/10.1093/molbev/mst136
Brandvain, Y., Kenney, A. M., Flagel, L., Coop, G., & Sweigart, A. L. (2014). Speciation and introgression between Mimulus nasutus and Mimulus guttatus. PLoS Genetics, 10, e1004410. https://doi.org/10.1371/journal.pgen.1004410
Browning, S. R., & Browning, B. L. (2007). Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. The American Journal of Human Genetics, 81, 1084–1097. https://doi.org/10.1086/521987
Capella-Gutiérrez, S., Silla-Martínez, J. M., & Gabaldón, T. (2009). trimAl: A tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics, 25, 1972–1973. https://doi.org/10.1093/bioinformatics/btp348
Cartwright, R. A., Schwartz, R. S., Merry, A. L., & Howell, M. M. (2017). The importance of selection in the evolution of blindness in cavefish. BMC Evolutionary Biology, 17, 45. https://doi.org/10.1186/s12862-017-0876-4
Charlesworth, B. (1998). Measures of divergence between populations and the effect of forces that reduce variability. Molecular Biology and Evolution, 15, 538–543. https://doi.org/10.1093/oxfordjournals.molbev.a025953
Charlesworth, B. (2009). Effective population size and patterns of molecular evolution and variation. Nature Reviews Genetics, 10, 195. https://doi.org/10.1038/nrg2526
Clarkson, C. S., Weetman, D., Essandoh, J., Yawson, A. E., Maslen, G., Manske, M., … Donnelly, M. J. (2014). Adaptive introgression between Anopheles sibling species eliminates a major genomic island but not reproductive isolation. Nature Communications, 5, 4248. https://doi.org/10.1038/ncomms5248
Coghill, L. M., Hulsey, C. D., Chaves-Campos, J., García de Leon, F. J., & Johnson, S. G. (2014). Next generation phylogeography of cave and surface Astyanax mexicanus. Molecular Phylogenetics and Evolution, 79, 368–374. https://doi.org/10.1016/j.ympev.2014.06.029
Colosimo, P. F., Hosemann, K. E., Balabhadra, S., Villarreal, G. Jr, Dickson, M., Grimwood, J., … Kingsley, D. M. (2005). Widespread parallel evolution in sticklebacks by repeated fixation of ectodysplasin alleles. Science, 307, 1928–1933. https://doi.org/10.1126/science.1107239
Cresko, W. A., Amores, A., Wilson, C., Murphy, J., Currey, M., Phillips, P., … Postlethwait, J. H. (2004). Parallel genetic basis for repeated evolution of armor loss in Alaskan threespine stickleback populations. Proceedings of the National Academy of Sciences of the United States of America, 101, 6050–6055. https://doi.org/10.1073/pnas.0308479101
Cruickshank, T. E., & Hahn, M. W. (2014). Reanalysis suggests that genomic islands of speciation are due to reduced diversity, not reduced gene flow. Molecular Ecology, 23, 3133–3157. https://doi.org/10.1111/mec.12796
Cussac, V., Fernández, D. A., Gómez, S. E., & López, H. L. (2009). Fishes of southern South America: A story driven by temperature. Fish Physiology and Biochemistry, 35, 29–42. https://doi.org/10.1007/s10695-008-9217-2
Dasmahapatra, K. K., Walters, J. R., Briscoe, A. D., Davey, J. W., Whibley, A., Nadeau, N. J., … Martin, S. H. (2012). Butterfly genome reveals promiscuous exchange of mimicry adaptations among species. Nature, 487, 94. https://doi.org/10.1038/nature11041
DePristo, M. A., Banks, E., Poplin, R., Garimella, K. V., Maguire, J. R., Hartl, C., … Daly, M. J. (2011). A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nature Genetics, 43, 491–498. https://doi.org/10.1038/ng.806
Dowling, T. E., Martasian, D. P., & Jeffery, W. R. (2002). Evidence for multiple genetic forms with similar eyeless phenotypes in the blind cavefish, Astyanax mexicanus. Molecular Biology and Evolution, 19, 446–455. https://doi.org/10.1093/oxfordjournals.molbev.a004100
Duboué, E. R., Keene, A. C., & Borowsky, R. L. (2011). Evolutionary convergence on sleep loss in cavefish populations. Current Biology, 21, 671–676. https://doi.org/10.1016/j.cub.2011.03.020
Ellegren, H. (2014). Genome sequencing and population genomics in non-model organisms. Trends in Ecology & Evolution, 29, 51–63. https://doi.org/10.1016/j.tree.2013.09.008
Ellstrand, N. C., & Rieseberg, L. H. (2016). When gene flow really matters: Gene flow in applied evolutionary biology. Evolutionary Applications, 9, 833–836. https://doi.org/10.1111/eva.12402
Elmer, K. R., & Meyer, A. (2011). Adaptation in the age of ecological genomics: Insights from parallelism and convergence. Trends in Ecology and Evolution, 26, 298–306. https://doi.org/10.1016/j.tree.2011.02.008
Espinasa, L., Bartolo, N. D., & Newkirk, C. E. (2014). DNA sequences of troglobitic nicoletiid insects support Sierra de El Abra and the Sierra de Guatemala as a single biogeographical area: Implications for Astyanax. Subterranean Biology, 13, 35. https://doi.org/10.3897/subtbiol.13.7256
Espinasa, L., Bibliowicz, J., Jeffery, W. R., & Rétaux, S. (2014). Enhanced prey capture skills in Astyanax cavefish larvae are independent from eye loss. EvoDevo, 5, 35. https://doi.org/10.1186/2041-9139-5-35
Espinasa, L., & Borowsky, R. B. (2001). Origins and relationship of cave populations of the blind Mexican tetra, Astyanax fasciatus, in the Sierra de El Abra. Environmental Biology of Fishes, 62, 233–237. https://doi.org/10.1023/A:1011881921023
Espinasa, L., Centone, D. M., & Gross, J. B. (2014). A contemporary analysis of a loss-of-function of the oculocutaneous albinism type II (Oca2) allele within the Micos Astyanax cave fish population. Speleobiology Notes, 6, 48–54.
Espinasa, L., & Espinasa, M. (2015). Hydrogeology of caves in the Sierra de El Abra region. In A. Keene, M. Yoshizawa & S. McGaugh (Eds.), Biology and evolution of the Mexican cavefish (pp. 41–58). San Diego, CA: Elsevier.
Espinasa, L., Rivas-Manzano, P., & Pérez, H. E. (2001). A new blind cave fish population of genus Astyanax: Geography, morphology and behavior. Environmental Biology of Fishes, 62, 339–344. https://doi.org/10.1023/A:1011852603162
Fay, R. R., & Popper, A. N. (2000). Evolution of hearing in vertebrates: The inner ears and processing. Hearing Research, 149, 1–10. https://doi.org/10.1016/S0378-5955(00)00168-4
Fenner, J. N. (2005). Cross-cultural estimation of the human generation interval for use in genetics-based population divergence studies. American Journal of Physical Anthropology: The Official Publication of the American Association of Physical Anthropologists, 128, 415–423. https://doi.org/10.1002/ajpa.20188
Fitzpatrick, S. W., Gerberich, J. C., Angeloni, L. M., Bailey, L. L., Broder, E. D., Torres-Dowdall, J., … Chris Funk, W. (2016). Gene flow from an adaptively divergent source causes rescue through genetic and demographic factors in two wild populations of Trinidadian guppies. Evolutionary Applications, 9, 879–891. https://doi.org/10.1111/eva.12356
Fitzpatrick, S., Gerberich, J., Kronenberger, J., Angeloni, L., & Funk, W. (2015). Locally adapted traits maintained in the face of high gene flow. Ecology Letters, 18, 37–47. https://doi.org/10.1111/ele.12388
Frankham, R. (2015). Genetic rescue of small inbred populations: Meta-analysis reveals large and consistent benefits of gene flow. Molecular Ecology, 24, 2610–2618. https://doi.org/10.1111/mec.13139
Fumey, J., Hinaux, H., Noirot, C., Thermes, C., Rétaux, S., & Casane, D. (2018). Evidence for late Pleistocene origin of Astyanax mexicanus cavefish. BMC Evolutionary Biology, 18, 43. https://doi.org/10.1186/s12862-018-1156-7
Geneva, A. J., Muirhead, C. A., Kingan, S. B., & Garrigan, D. (2015). A new method to scan genomes for introgression in a secondary contact model. PLoS One, 10, e0118621. https://doi.org/10.1371/journal.pone.0118621
Gompel, N., & Prud'homme, B. (2009). The causes of repeated genetic evolution. Developmental Biology, 332, 36–47. https://doi.org/10.1016/j.ydbio.2009.04.040
Gross, J. B. (2012). The complex origin of Astyanax cavefish. BMC Evolutionary Biology, 12, 105. https://doi.org/10.1186/1471-2148-12-105
Gross, J. B., Borowsky, R., & Tabin, C. J. (2009). A novel role for Mc1r in the parallel evolution of depigmentation in independent populations of the cavefish Astyanax mexicanus. PLoS Genetics, 5, e1000326. https://doi.org/10.1371/journal.pgen.1000326
Gross, J., & Wilkens, H. (2013). Albinism in phylogenetically and geographically distinct populations of Astyanax cavefish arises through the same loss-of-function Oca2 allele. Heredity, 111, 122–130. https://doi.org/10.1038/hdy.2013.26
Gutenkunst, R. N., Hernandez, R. D., Williamson, S. H., & Bustamante, C. D. (2009). Inferring the joint demographic history of multiple populations from multidimensional SNP frequency data. PLoS Genetics, 5, e1000695. https://doi.org/10.1371/journal.pgen.1000695
Hausdorf, B., Wilkens, H., & Strecker, U. (2011). Population genetic patterns revealed by microsatellite data challenge the mitochondrial DNA based taxonomy of Astyanax in Mexico (Characidae, Teleostei). Molecular Phylogenetics and Evolution, 60, 89–97. https://doi.org/10.1016/j.ympev.2011.03.009
Hellenthal, G., Busby, G. B. J., Band, G., Wilson, J. F., Capelli, C., Falush, D., & Myers, S. (2014). A genetic atlas of human admixture history. Science, 343, 747–751. https://doi.org/10.1126/science.1243518
Hoegg, S., Brinkmann, H., Taylor, J. S., & Meyer, A. (2004). Phylogenetic timing of the fish-specific genome duplication correlates with the diversification of teleost fish. Journal of Molecular Evolution, 59, 190–203. https://doi.org/10.1007/s00239-004-2613-z
Hudson, R. R. (1990). Gene genealogies and the coalescent process. Oxford Surveys in Evolutionary Biology, 7, 44.
Hudson, R. R., Kreitman, M., & Aguadé, M. (1987). A test of neutral molecular evolution based on nucleotide data. Genetics, 116, 153. https://doi.org/10.1093/genetics/116.1.153
Jaggard, J., Robinson, B. G., Stahl, B. A., Oh, I., Masek, P., Yoshizawa, M., & Keene, A. C. (2017). The lateral line confers evolutionarily derived sleep loss in the Mexican cavefish. Journal of Experimental Biology, 220, 284–293. https://doi.org/10.1242/jeb.145128
Jaggard, J. B., Stahl, B. A., Lloyd, E., Prober, D. A., Duboue, E. R., & Keene, A. C. (2018). Hypocretin underlies the evolution of sleep loss in the Mexican cavefish. eLife, 7, e32637. https://doi.org/10.7554/eLife.32637
Jakobsson, M., Edge, M. D., & Rosenberg, N. A. (2013). The relationship between FST and the frequency of the most frequent allele. Genetics, 193, 515–528. https://doi.org/10.1534/genetics.112.144758
Jeffery, W. R. (2001). Cavefish as a model system in evolutionary developmental biology. Developmental Biology, 231, 1–12. https://doi.org/10.1006/dbio.2000.0121
Jeffery, W. R. (2009). Chapter 8: Evolution and development in the cavefish Astyanax. Current Topics in Developmental Biology, 86, 191–221. https://doi.org/10.1016/S0070-2153(09)01008-4
Keene, A., Yoshizawa, M., & McGaugh, S. E. (2015). Biology and evolution of the Mexican cavefish. Boston, MA: Elsevier Academic Press.
Kowalko, J. E., Rohner, N., Linden, T. A., Rompani, S. B., Warren, W. C., Borowsky, R., … Yoshizawa, M. (2013). Convergence in feeding posture occurs through different genetic loci in independently evolved cave populations of Astyanax mexicanus. Proceedings of the National Academy of Sciences of the United States of America, 110, 16933–16938. https://doi.org/10.1073/pnas.1317192110
Kowalko, J. E., Rohner, N., Rompani, S. B., Peterson, B. K., Linden, T. A., Yoshizawa, M., … Tabin, C. J. (2013). Loss of schooling behavior in cavefish through sight-dependent and sight-independent mechanisms. Current Biology, 23, 1874–1883. https://doi.org/10.1016/j.cub.2013.07.056
Krishnan, J., & Rohner, N. (2017). Cavefish and the basis for eye loss. Philosophical Transactions of the Royal Society B: Biological Sciences, 372, 20150487. https://doi.org/10.1098/rstb.2015.0487
Kronenberger, J. A., Funk, W. C., Smith, J. W., Fitzpatrick, S. W., Angeloni, L. M., Broder, E. D., & Ruell, E. W. (2017). Testing the demographic effects of divergent immigrants on small populations of Trinidadian guppies. Animal Conservation, 20, 3–11. https://doi.org/10.1111/acv.12286
Lahti, D. C., Johnson, N. A., Ajie, B. C., Otto, S. P., Hendry, A. P., Blumstein, D. T., … Foster, S. A. (2009). Relaxed selection in the wild. Trends in Ecology & Evolution, 24, 487–496. https://doi.org/10.1016/j.tree.2009.03.010
Langecker, T. G., Wilkens, H., & Junge, P. (1991). Introgressive hybridization in the Pachon Cave population of Astyanax fasciatus (Teleostei: Characidae). Ichthyological Exploration of Freshwaters, 2(3), 209–212.
Lawson, D. J., van Dorp, L., & Falush, D. (2018). A tutorial on how not to over-interpret STRUCTURE and ADMIXTURE bar plots. Nature Communications, 9, 3258. https://doi.org/10.1038/s41467-018-05257-7
Leffler, E., Bullaughey, K., Matute, D., Meyer, W., & Segurel, L. (2012). Revisiting an old riddle: What determines genetic diversity levels within species? PLoS Biology, 10, e1001388. https://doi.org/10.1371/journal.pbio.1001388
Le Moan, A., Gagnaire, P. A., & Bonhomme, F. (2016). Parallel genetic divergence among coastal–marine ecotype pairs of European anchovy explained by differential introgression after secondary contact. Molecular Ecology, 25, 3187–3202. https://doi.org/10.1111/mec.13627
Li, H. (2013). Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv preprint. arXiv:1303.3997.
Li, H., & Durbin, R. (2009). Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 25, 1754. https://doi.org/10.1093/bioinformatics/btp324
Li, H., & Durbin, R. (2010). Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics, 26, 589. https://doi.org/10.1093/bioinformatics/btp698
Liu, S., Lorenzen, E. D., Fumagalli, M., Li, B., Harris, K., Xiong, Z., … Wang, J. (2014). Population genomics reveal recent speciation and rapid evolutionary adaptation in polar bears. Cell, 157, 785–794. https://doi.org/10.1016/j.cell.2014.03.054
Lohse, M., Bolger, A. M., Nagel, A., Fernie, A. R., Lunn, J. E., Stitt, M., & Usadel, B. (2012). RobiNA: A user-friendly, integrated software solution for RNA-Seq-based transcriptomics. Nucleic Acids Research, 40, W622–W627. https://doi.org/10.1093/nar/gks540
Losos, J. B. (2011). Convergence, adaptation, and constraint. Evolution, 65, 1827–1840. https://doi.org/10.1111/j.1558-5646.2011.01289.x
Lowry, D. B., Hoban, S., Kelley, J. L., Lotterhos, K. E., Reed, L. K., Antolin, M. F., & Storfer, A. (2016). Breaking RAD: An evaluation of the utility of restriction site associated DNA sequencing for genome scans of adaptation. Molecular Ecology Resources, 17, 142–152. https://doi.org/10.1111/1755-0998.12635
Malinsky, M., Svardal, H., Tyers, A. M., Miska, E. A., Genner, M. J., Turner, G. F., & Durbin, R. (2017). Whole genome sequences of Malawi cichlids reveal multiple radiations interconnected by gene flow. BioRxiv, 143859. https://doi.org/10.1101/143859
Mallet, J., Besansky, N., & Hahn, M. W. (2016). How reticulated are species? BioEssays, 38, 140–149. https://doi.org/10.1002/bies.201500149
Martin, M. (2011). Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal, 17, 10. https://doi.org/10.14806/ej.17.1.200
McGaugh, S. E., Gross, J. B., Aken, B., Blin, M., Borowsky, R., Chalopin, D., … Warren, W. C. (2014). The cavefish genome reveals candidate genes for eye loss. Nature Communications, 5, 5307. https://doi.org/10.1038/ncomms6307
McGaugh, S. E., & Noor, M. A. F. (2012). Genomic impacts of chromosomal inversions in parapatric Drosophila species. Philosophical Transactions of the Royal Society B: Biological Sciences, 367, 422–429. https://doi.org/10.1098/rstb.2011.0250
McKenna, A., Hanna, M., Banks, E., Sivachenko, A., Cibulskis, K., Kernytsky, A., … DePristo, M. A. (2010). The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Research, 20, 1297–1303. https://doi.org/10.1101/gr.107524.110
Meier, J. I., Marques, D. A., Mwaiko, S., Wagner, C. E., Excoffier, L., & Seehausen, O. (2017). Ancient hybridization fuels rapid cichlid fish adaptive radiations. Nature Communications, 8, 14363. https://doi.org/10.1038/ncomms14363
Meyer, A., & Van de Peer, Y. (2005). From 2R to 3R: Evidence for a fish-specific genome duplication (FSGD). BioEssays, 27, 937–945. https://doi.org/10.1002/bies.20293
Mitchell, R. W., Russell, W. H., & Elliott, W. R. (1977). Mexican eyeless characin fishes, genus Astyanax: Environment, distribution, and evolution. Lubbock, TX: Texas Tech Press.
Moran, D., Softley, R., & Warrant, E. J. (2014). Eyeless Mexican cavefish save energy by eliminating the circadian rhythm in metabolism. PLoS One, 9, e107877. https://doi.org/10.1371/journal.pone.0107877
Nair, S., Williams, J. T., Brockman, A., Paiphun, L., Mayxay, M., Newton, P. N., … Anderson, T. J. (2003). A selective sweep driven by pyrimethamine treatment in Southeast Asian malaria parasites. Molecular Biology and Evolution, 20, 1526–1536. https://doi.org/10.1093/molbev/msg162
Nei, M. (1987). Molecular evolutionary genetics. New York, NY: Columbia University Press.
Noor, M. A. F., & Bennett, S. M. (2009). Islands of speciation or mirages in the desert? Examining the role of restricted recombination in maintaining species. Heredity, 103, 439–444. https://doi.org/10.1038/hdy.2009.151
O'Quin, K. E., Doshi, P., Lyon, A., Hoenemeyer, E., Yoshizawa, M., & Jeffery, W. R. (2015). Complex evolutionary and genetic patterns characterize the loss of scleral ossification in the blind cavefish Astyanax mexicanus. PLoS One, 10, e0142208. https://doi.org/10.1371/journal.pone.0142208
O'Quin, K., & McGaugh, S. E. (2015). The genetic bases of troglomorphy in Astyanax: How far we have come and where do we go from here? In A. Keene, M. Yoshizawa & S. E. McGaugh (Eds.), Biology and evolution of the Mexican cavefish (pp. 111–136). Boston, MA: Elsevier.
O'Quin, K. E., Yoshizawa, M., Doshi, P., & Jeffery, W. R. (2013). Quantitative genetic analysis of retinal degeneration in the blind cavefish Astyanax mexicanus. PLoS One, 8, e57281. https://doi.org/10.1371/journal.pone.0057281
Ornelas-García, C. P., Domínguez-Domínguez, O., & Doadrio, I. (2008). Evolutionary history of the fish genus Astyanax Baird & Girard (1854) (Actinopterygii, Characidae) in Mesoamerica reveals multiple morphological homoplasies. BMC Evolutionary Biology, 8, 340. https://doi.org/10.1186/1471-2148-8-340
Ornelas-García, C. P., & Pedraza-Lara, C. (2015). Phylogeny and evolutionary history of A. mexicanus. In A. C. Keene, M. Yoshizawa & S. E. McGaugh (Eds.), Biology and evolution of the Mexican cavefish (pp. 77–92). San Diego, CA: Elsevier.
Page, L. M., & Burr, B. M. (2011). Peterson field guide to freshwater fishes of North America north of Mexico. Boston, MA: Houghton Mifflin Harcourt.
Panaram, K., & Borowsky, R. (2005). Gene flow and genetic variability in cave and surface populations of the Mexican tetra, Astyanax mexicanus (Teleostei: Characidae). Copeia, 2005, 409–416. https://doi.org/10.1643/CG-04-068R1
Paradis, E., Claude, J., & Strimmer, K. (2004). APE: Analyses of phylogenetics and evolution in R language. Bioinformatics, 20, 289–290. https://doi.org/10.1093/bioinformatics/btg412
Patterson, N., Moorjani, P., Luo, Y., Mallick, S., Rohland, N., Zhan, Y., … Reich, D. (2012). Ancient admixture in human history. Genetics, 192, 1065–1093. https://doi.org/10.1534/genetics.112.145037
Payseur, B. A., & Rieseberg, L. H. (2016). A genomic perspective on hybridization and speciation. Molecular Ecology, 25, 2337–2360. https://doi.org/10.1111/mec.13557
Pease, J. B., & Hahn, M. W. (2015). Detection and polarization of introgression in a five-taxon phylogeny. Systematic Biology, 64, 651–662. https://doi.org/10.1093/sysbio/syv023
Pease, J., & Rosenzweig, B. (2015). Encoding data using biological principles: The Multisample Variant Format for phylogenomics and population genomics. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 15(4), 1231–1238 (2018).
Peter, B. M. (2016). Admixture, population structure, and F-statistics. Genetics, 202, 1485–1501. https://doi.org/10.1534/genetics.115.183913
Pickrell, J. K., & Pritchard, J. K. (2012). Inference of population splits and mixtures from genome-wide allele frequency data. PLoS Genetics, 8, e1002967. https://doi.org/10.1371/journal.pgen.1002967
Porter, M. L., Dittmar, K., & Pérez-Losada, M. (2007). How long does evolution of the troglomorphic form take? Estimating divergence times in Astyanax mexicanus. Acta Carsologica, 36, 173–182. https://doi.org/10.3986/ac.v36i1.219
Protas, M., Conrad, M., Gross, J. B., Tabin, C., & Borowsky, R. (2007). Regressive evolution in the Mexican cave tetra, Astyanax mexicanus. Current Biology, 17, 452–454. https://doi.org/10.1016/j.cub.2007.01.051
Protas, M. E., Hersey, C., Kochanek, D., Zhou, Y., Wilkens, H., Jeffery, W. R., … Tabin, C. J. (2006). Genetic analysis of cavefish reveals molecular convergence in the evolution of albinism. Nature Genetics, 38, 107–111. https://doi.org/10.1038/ng1700
Protas, M., & Jeffery, W. R. (2012). Evolution and development in cave animals: From fish to crustaceans. WIREs Developmental Biology, 1, 823–845. https://doi.org/10.1002/wdev.61
Protas, M., Tabansky, I., Conrad, M., Gross, J. B., Vidal, O., Tabin, C. J., & Borowsky, R. (2008). Multi-trait evolution in a cave fish, Astyanax mexicanus. Evolution and Development, 10, 196–209. https://doi.org/10.1111/j.1525-142X.2008.00227.x
R Core Team (2018). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. http://www.R-project.org/
Reich, D., Thangaraj, K., Patterson, N., Price, A. L., & Singh, L. (2009). Reconstructing Indian population history. Nature, 461, 489–494. https://doi.org/10.1038/nature08365
Requena, T., Cabrera, S., Martin-Sierra, C., Price, S. D., Lysakowski, A., & Lopez-Escamez, J. A. (2014). Identification of two novel mutations in FAM136A and DTNA genes in autosomal-dominant familial Meniere's disease. Human Molecular Genetics, 24, 1119–1126. https://doi.org/10.1093/hmg/ddu524
Richards, E. J., & Martin, C. H. (2017). Adaptive introgression from distant Caribbean islands contributed to the diversification of a microendemic adaptive radiation of trophic specialist pupfishes. PLoS Genetics, 13, e1006919. https://doi.org/10.1371/journal.pgen.1006919
Riddle, M. R., Aspiras, A. C., Gaudenz, K., Peuß, R., Sung, J. Y., Martineau, B., … Rohner, N. (2018). Insulin resistance in cavefish as an adaptation to a nutrient-limited environment. Nature, 555, 647. https://doi.org/10.1038/nature26136
Rieseberg, L. H., & Burke, J. M. (2001). The biological reality of species: Gene flow, selection, and collective evolution. Taxon, 50, 47–67. https://doi.org/10.2307/1224511
Ritz, K. R., & Noor, M. A. (2016). Mistaken identity: Another bias in the use of relative genetic divergence measures for detecting interspecies introgression. PLoS One, 11, e0165032. https://doi.org/10.1371/journal.pone.0165032
Roesti, M., Gavrilets, S., Hendry, A. P., Salzburger, W., & Berner, D. (2014). The genomic signature of parallel adaptation from shared genetic variation. Molecular Ecology, 23, 3944–3956. https://doi.org/10.1111/mec.12720
Rosenblum, E. B., Parent, C. E., & Brandt, E. E. (2014). The molecular basis of phenotypic convergence. Annual Review of Ecology, Evolution, and Systematics, 45, 203–226. https://doi.org/10.1146/annurev-ecolsys-120213-091851
Rougemont, Q., Gagnaire, P.-A., Perrier, C., Genthon, C., Besnard, A.-L., Launey, S., & Evanno, G. (2017). Inferring the demographic history underlying parallel genomic divergence among pairs of parasitic and nonparasitic lamprey ecotypes. Molecular Ecology, 26, 142–162. https://doi.org/10.1111/mec.13664
Rougeux, C., Bernatchez, L., & Gagnaire, P.-A. (2017). Modeling the multiple facets of speciation-with-gene-flow toward inferring the divergence history of lake whitefish species pairs (Coregonus clupeaformis). Genome Biology and Evolution, 9, 2057–2074. https://doi.org/10.1093/gbe/evx150
Salin, K., Voituron, Y., Mourin, J., & Hervant, F. (2010). Cave colonization without fasting capacities: An example with the fish Astyanax fasciatus mexicanus. Comparative Biochemistry and Physiology Part A: Molecular & Integrative Physiology, 156, 451–457. https://doi.org/10.1016/j.cbpa.2010.03.030
Schlenke, T. A., & Begun, D. J. (2004). Strong selective sweep associated with a transposon insertion in Drosophila simulans. Proceedings of the National Academy of Sciences of the United States of America, 101, 1626–1631. https://doi.org/10.1073/pnas.0303793101
Slatkin, M. (2005). Seeing ghosts: The effect of unsampled populations on migration rates estimated for sampled populations. Molecular Ecology, 14, 67–73. https://doi.org/10.1111/j.1365-294X.2004.02393.x
Stamatakis, A. (2014). RAxML version 8: A tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics, 30, 1312–1313. https://doi.org/10.1093/bioinformatics/btu033
Stern, D. L. (2013). The genetic causes of convergent evolution. Nature Reviews Genetics, 14, 751. https://doi.org/10.1038/nrg3483
Stern, D. L., & Orgogozo, V. (2009). Is genetic evolution predictable? Science, 323, 746–751. https://doi.org/10.1126/science.1158997
Strecker, U., Bernatchez, L., & Wilkens, H. (2003). Genetic divergence between cave and surface populations of Astyanax in Mexico (Characidae, Teleostei). Molecular Ecology, 12, 699–710. https://doi.org/10.1046/j.1365-294X.2003.01753.x
Strecker, U., Faúndez, V. H., & Wilkens, H. (2004). Phylogeography of surface and cave Astyanax (Teleostei) from Central and North America based on cytochrome b sequence data. Molecular Phylogenetics and Evolution, 33, 469–481. https://doi.org/10.1016/j.ympev.2004.07.001
Strecker, U., Hausdorf, B., & Wilkens, H. (2012). Parallel speciation in Astyanax cave fish (Teleostei) in Northern Mexico. Molecular Phylogenetics and Evolution, 62, 62–70. https://doi.org/10.1016/j.ympev.2011.09.005
Tine, M., Kuhl, H., Gagnaire, P.-A., Louro, B., Desmarais, E., Martins, R. S. T., … Reinhardt, R. (2014). European sea bass genome and its variation provide insights into adaptation to euryhalinity and speciation. Nature Communications, 5, 5770. https://doi.org/10.1038/ncomms6770
Van Belleghem, S., Vangestel, C., De Wolf, K., De Corte, Z., Moest, M., Rastas, P., … Hendrickx, F. (2018). Evolution at two time frames: Polymorphisms from an ancient singular divergence event fuel contemporary parallel evolution. BioRxiv, 255554. https://doi.org/10.1101/255554
Varatharasan, N., Croll, R. P., & Franz-Odendaal, T. (2009). Taste bud development and patterning in sighted and blind morphs of Astyanax mexicanus. Developmental Dynamics, 238, 3056–3064. https://doi.org/10.1002/dvdy.22144
Wagenmakers, E.-J., & Farrell, S. (2004). AIC model selection using Akaike weights. Psychonomic Bulletin & Review, 11, 192–196. https://doi.org/10.3758/BF03206482
Welch, J. J., & Jiggins, C. D. (2014). Standing and flowing: The complex origins of adaptive variation. Molecular Ecology, 23, 3935–3937. https://doi.org/10.1111/mec.12859
Whiteley, A. R., Fitzpatrick, S. W., Funk, W. C., & Tallmon, D. A. (2015). Genetic rescue to the rescue. Trends in Ecology & Evolution, 30, 42–49. https://doi.org/10.1016/j.tree.2014.10.009
Wilkens, H. (1971). Genetic interpretation of regressive evolutionary processes: Studies on hybrid eyes of two Astyanax cave populations (Characidae, Pisces). Evolution, 25, 530–544. https://doi.org/10.2307/2407352
Wilkens, H., & Strecker, U. (2003). Convergent evolution of the cavefish Astyanax (Characidae, Teleostei): Genetic evidence from reduced eye-size and pigmentation. Biological Journal of the Linnean Society, 80, 545–554. https://doi.org/10.1111/j.1095-8312.2003.00230.x
Wilkens, H., & Strecker, U. (2017). Evolution in the dark. Berlin, Germany: Springer. https://doi.org/10.1007/978-3-662-54512-6
Wootton, J. C., Feng, X., Ferdig, M. T., Cooper, R. A., Mu, J., Baruch, D. I., … Su, X. (2002). Genetic diversity and chloroquine selective sweeps in Plasmodium falciparum. Nature, 418, 320–323. https://doi.org/10.1038/nature00813
Yamamoto, Y., Byerly, M. S., Jackman, W. R., & Jeffery, W. R. (2009). Pleiotropic functions of embryonic sonic hedgehog expression link jaw and taste bud amplification with eye loss during cavefish evolution. Developmental Biology, 330, 200–211. https://doi.org/10.1016/j.ydbio.2009.03.003
Yang, Z., & Rannala, B. (2012). Molecular phylogenetics: Principles and practice. Nature Reviews Genetics, 13, 303–314. https://doi.org/10.1038/nrg3186
Yoshizawa, M., Ashida, G., & Jeffery, W. R. (2012). Parental genetic effects in a cavefish adaptive behavior explain disparity between nuclear and mitochondrial DNA. Evolution, 66, 2975–2982. https://doi.org/10.1111/j.1558-5646.2012.01651.x
Yoshizawa, M., Gorički, Š., Soares, D., & Jeffery, W. R. (2010). Evolution of a behavioral shift mediated by superficial neuromasts helps cavefish find food in darkness. Current Biology, 20, 1631–1636. https://doi.org/10.1016/j.cub.2010.07.017
Yoshizawa, M., Robinson, B. G., Duboué, E. R., Masek, P., Jaggard, J. B., O'Quin, K. E., … Keene, A. C. (2015). Distinct genetic architecture underlies the emergence of sleep loss and prey-seeking behavior in the Mexican cavefish. BMC Biology, 13, 1. https://doi.org/10.1186/s12915-015-0119-3
Yoshizawa, M., Yamamoto, Y., O'Quin, K. E., & Jeffery, W. R. (2012). Evolution of an adaptive behavior and its sensory receptors promotes eye regression in blind cavefish. BMC Biology, 10, 108. https://doi.org/10.1186/1741-7007-10-108

Volume 27, Issue 22, November 2018, Pages 4397–4416

Filename	Description
mec14877-sup-0001-Supinfo.docx	Word document, 597 KB
mec14877-sup-0002-FigS6.pdf	PDF document, 98.3 KB
mec14877-sup-0003-FigS7.pdf	PDF document, 51.3 KB
mec14877-sup-0004-FigS8.pdf	PDF document, 490.9 KB
mec14877-sup-0005-Table.xlsx	MS Excel, 12.1 MB
mec14877-sup-0006-VideoS1.mp4	MPEG-4 video, 420 KB
mec14877-sup-0007-VideoS2.mp4	MPEG-4 video, 489.6 KB

The role of gene flow in rapid and repeated evolution of cave-related traits in Mexican tetra, Astyanax mexicanus

Abstract