Volume 67, Issue 2 pp. 388-402
PHYLOGENETIC INFERENCE OF NUPTIAL TRAIT EVOLUTION IN THE CONTEXT OF ASYMMETRICAL INTROGRESSION IN NORTH AMERICAN DARTERS (TELEOSTEI)

Richard C. Harrington

Department of Ecology and Evolutionary Biology, Yale University, New Haven, Connecticut 06520

E-mail: [email protected]

Edgar Benavides

Department of Ecology and Evolutionary Biology, Yale University, New Haven, Connecticut 06520

Thomas J. Near

Department of Ecology and Evolutionary Biology and Peabody Museum of Natural History, Yale University, New Haven, Connecticut 06520

First published: 08 September 2012
Citations: 17

Abstract

Introgressive hybridization and incomplete lineage sorting complicate the inference of phylogeny, and available species-tree methods do not simultaneously account for these processes. Both hybridization and ancestral polymorphism have been invoked to explain divergent phylogenies inferred from different datasets for Stigmacerca, a clade of 11 North American darter species. Species of Stigmacerca are characterized by a mating system involving parental care with males guarding nesting territories and fertilized eggs. Males of four species of Stigmacerca develop egg-mimic nuptial structures on their second dorsal fins during the breeding season. Previous phylogenies suggest contrasting scenarios for the evolution of this nuptial trait. Using a combination of coalescent-based methods, we analyzed a dataset comprising a mitochondrial gene and 15 nuclear loci to estimate relationships and simultaneously test for introgressive hybridization. Our analyses identified several instances of interspecific gene flow involving both cytoplasmic haplotypes and nuclear alleles. The new phylogeny was used to infer a single origin and recent loss of egg-mimic structures in Stigmacerca and led to the discovery of a phylogenetically distinct species. Our results highlight the limited strategies available to account for introgressive hybridization in the inference of species relationships and the likely effects of this process on reconstructing trait evolution.

Constructing a phylogenetic inference that most closely matches the true evolutionary relationships for a clade of species is a necessary first step toward analyses aimed at deciphering life-history evolution, morphological and ecological diversification, and the origin of current geographic distributions (e.g., Felsenstein 1985; Losos 1995). The development of analytical methods that incorporate multilocus genetic data in a coalescent framework has greatly improved our ability to rapidly add more characters for inferring the phylogenetic relationships of closely related species. As valuable as multilocus species-tree methods have become in phylogenetics, remaining weaknesses are that the most commonly used species-tree methods assume correct species delimitation, and all methods assume no gene flow between species subsequent to speciation (Kubatko et al. 2009; Heled and Drummond 2010; Yang and Rannala 2010; Harrington and Near 2012). In reality, these assumptions are often violated for recent evolutionary radiations where morphology is ambiguous and difficult to interpret, or where closely related species hybridize. In spite of the advancement of these multilocus methods, a robust analytical framework that simultaneously disentangles species-level relationships while accounting for incomplete lineage sorting and genetic introgression is lacking.

SPOTTAIL DARTERS AND MALE NUPTIAL TRAITS

The spottail darters (Stigmacerca), a subclade of 11 freshwater fish species within the species-rich Etheostoma (Percidae) radiation in North America, have long presented taxonomic challenges. Of particular interest among species of Stigmacerca is the development of nuptial structures, or egg-mimic structures on the second dorsal fins, in males of some species during breeding season (see Etheostoma neopterum in Fig. 1). These egg-mimic structures are hypothesized to enhance mate attraction during courtship displays (Page and Bart 1989). Separate phylogenetic hypotheses inferred from morphological characters and mitochondrial DNA sequence data have resulted in very different phylogenies for the clade, which result in different reconstructed patterns of nuptial trait evolution (Page et al. 1992; Porterfield et al. 1999). Mitochondrial DNA introgression and incomplete lineage sorting have been invoked to account for the differences in phylogenies inferred from mtDNA and external morphological characters (Porterfield et al. 1999), but these explanations remain untested.

Figure 1. Geographic distributions of species of Stigmacerca, with an inset map of the United States showing the area covered in the maps. Examples of nuptial-condition males are shown for Etheostoma crossopterum, E. forbesi, and E. neopterum.

Species of Stigmacerca live in headwater streams and are distributed primarily in the Tennessee and Cumberland River systems, but three species are distributed in tributaries of the Ohio and Mississippi rivers (Fig. 1). In contrast to most species of Etheostoma, species of Stigmacerca lack bright coloration, and morphological identification of females, nonnuptial males, and juveniles is difficult. Taxonomy and species identification in Stigmacerca are based largely on subtle differences in yellow (or white) and brown banding patterns on the second dorsal and caudal fins of nuptial-condition males (Page et al. 1992). The low magnitude of morphological disparity among species of Stigmacerca has resulted in long-standing unrecognized species diversity and confusion regarding species delimitation. Until the late 1970s only one species, E. squamiceps, was recognized, and nine species were described between 1978 and 1992 (Howell and Dingerkus 1978; Braasch and Mayden 1985; Page et al. 1992).

Darters (Percidae: Etheostomatinae) are a clade of 248 freshwater percomorph fishes endemic to eastern North America (Near et al. 2011), and exhibit variation in reproductive strategies (Kelly et al. 2012). These mating systems include relatively simple behaviors such as egg scattering and clumping and more complex patterns of paternal nest and egg guarding (Page 1985; Kelly et al. 2012). Species of Stigmacerca exhibit a derived behavior of male nest-guarding and paternal care, where nuptial males guard nesting territories in cavities under rocks or other debris in the stream. Males of four species of Stigmacerca (E. chienense, E. neopterum, E. oophylax, and E. pseudovulatum) develop extended fin rays on the second dorsal fin during the breeding season in March through May. With the exception of E. chienense, these species also produce prominent yellowish knobs at the end of these dorsal fin ray extensions (Page et al. 1992). The distal tips of the second dorsal fin of E. squamiceps males can become fleshy and swollen during the breeding season, but this species lacks development of stalks or knobs (Mayden 1985; Page and Knouft 2000). No empirical studies have been conducted to demonstrate the function of male nuptial second dorsal fin protuberances in species of Stigmacerca, but it is hypothesized that they are egg-mimics, which are used by the males to court females and take advantage of female preference for nests with large numbers of eggs (Knapp and Sargent 1989; Page and Bart 1989). During courtship, the putative egg-mimicking structures, which are approximately the same size and color as fertilized eggs, are displayed in the general proximity of the nest's egg deposition site (Page 1974; Page and Knouft 2000). For the purposes of this study, we refer to the extended second dorsal fin rays and distal knob development in nuptial-condition males of Stigmacerca species as egg-mimics, although we concede that the functions of these structures have not been tested experimentally.

Discordant phylogenetic relationships among species of Stigmacerca have been inferred from morphological and molecular data. Page et al. (1992) used nine external morphological characters, including those relating to the egg-mimic structures on the second dorsal fin, and hypothesized a single origin of egg-mimics, with no reversals and an “intermediate” egg-mimic morphology in E. squamiceps and E. chienense. Phylogenies inferred from mtDNA sequences of the cytb gene indicated either multiple origins or multiple losses of egg-mimic structures; however, Porterfield et al. (1999) were hesitant to suggest that the cytb gene tree reflected the actual species relationships, noting that hybridization and incomplete lineage sorting may complicate the inference of Stigmacerca phylogeny.

HYBRIDIZATION, INTROGRESSION, AND PHYLOGENY INFERENCE

There is a growing awareness of the widespread role that hybridization has played in the evolutionary history of many lineages across the Tree of Life (Arnold 1996). Knowledge of an organism's hybrid status has relevance to any biological investigation. Most fundamentally, the discovery of hybridization can provide critical insights into the biology and nature of species’ boundaries (e.g., Mallet et al. 2007). Studies of museum specimens suggest that teleost fishes, and especially freshwater species, exhibit a high frequency of hybridization (Hubbs 1955). For instance, of the approximately 250 species of darters (Percidae: Etheostomatinae), Keck and Near (2009) documented hybrid specimens that involve 63 darter species. Experimental laboratory crosses between darter species have demonstrated the ability of distantly related species to produce viable hybrid offspring (Hubbs and Strawn 1957).

One of the difficulties presented by hybridization is that introgression could potentially lead to the inference of gene trees that do not reflect the actual phylogenetic relationships. Interestingly, mitochondrial genomes are more likely to cross species boundaries and become fixed for populations, or species, than nuclear-encoded loci (Funk and Omland 2003; Chan and Levin 2005). Instances of mitochondrial replacement between darter species have complicated phylogenetic inference at both shallow and deep evolutionary time scales (Bossu and Near 2009; Keck and Near 2010; Near et al. 2011). A general challenge in detecting and deciphering patterns of introgression is that many recently diverged species exhibit widespread allele sharing and relatively low genetic diversity at many nuclear loci, while the well-resolved mitochondrial phylogeny will more likely exhibit the degree of resolution required to identify the patterns resulting from introgression or mitochondrial capture, but not necessarily indicate the introgression of nuclear genomes.

Empirical and theoretical studies have shown the necessity of incorporating gene trees obtained from multiple unlinked loci to increase the probability of inferring the actual species relationships due to the stochastic nature of gene-lineage sorting through populations (Hudson 1992; Edwards 2009). The idiosyncratic lineage sorting of unlinked genetic loci is a factor that has gained substantial attention in phylogenetics, and an extensive analytical framework has developed to reconstruct phylogenies in the face of incomplete lineage sorting (Degnan and Rosenberg 2009; Edwards 2009; Kubatko et al. 2009; Heled and Drummond 2010). Although multilocus species-tree analyses are a more robust strategy to infer the correct phylogeny than traditional data concatenation (Eckert and Carstens 2008), there is a dearth of analytical methods that can simultaneously infer a phylogeny and identify introgression, especially in instances where parent species’ alleles are not sampled or no longer exist (Kubatko 2009). Currently, the most common phylogenetic approach for identifying instances of mitochondrial introgression is to compare relationships observed on a mitochondrial gene tree to those observed on nuclear gene trees (e.g., Bossu and Near 2009). Although introgression between species can complicate phylogeny inference, documenting instances of mitochondrial replacement or nuclear gene introgression can provide other important insights about the evolutionary history of the species involved, such as the geographic context of hybridization, the direction of gene flow, and the evolution of reproductive isolation.
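The comparison of mitochondrial and nuclear gene trees described above can be illustrated with a minimal Python sketch that asks whether a set of tips forms a clade in a rooted tree. The nested-tuple tree encoding, the function names, and both toy topologies are assumptions made for illustration; they are not trees or tools from this study.

```python
def tips(tree):
    """Return the set of tip labels under a rooted nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    labels = set()
    for child in tree:
        labels |= tips(child)
    return labels


def is_monophyletic(tree, group):
    """True if some clade of the rooted tree contains exactly the tips in group."""
    group = set(group)
    if tips(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)


# Toy topologies (hypothetical): species X is monophyletic on the
# nuclear tree, but on the mitochondrial tree one X lineage groups
# with species Y, the kind of discordance that prompts a closer
# look at introgression versus incomplete lineage sorting.
nuclear_toy = (("X_pop1", "X_pop2"), ("Y", "Z"))
mito_toy = ((("X_pop1", "Y"), "X_pop2"), "Z")

if __name__ == "__main__":
    species_x = {"X_pop1", "X_pop2"}
    print("nuclear:", is_monophyletic(nuclear_toy, species_x))
    print("mtDNA:  ", is_monophyletic(mito_toy, species_x))
```

Discordance of this kind is only a starting point: on its own it cannot distinguish introgression from incomplete lineage sorting.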

Our objective in this study is to better understand the evolution of nuptial traits by testing the monophyly of the Stigmacerca species that exhibit egg-mimic structures. We confront the long-standing uncertainty in relationships among species of Stigmacerca through phylogenetic analyses that use broad sampling of all species in the clade using 15 nuclear genes and a Bayesian coalescent framework to infer a species tree. The species relationships inferred from nuclear and mitochondrial genomes are compared and used as a starting point to investigate instances of historical hybridization and incomplete lineage sorting among species of Stigmacerca. This approach avoids the pitfalls associated with relying entirely on inferences based on single-gene trees, and allows the identification of gene flow between species that may misspecify species-tree analyses, and provides an example for the phylogenetic study of trait evolution in the context of discordant gene tree and species-tree topologies.

Materials and Methods

SPECIMEN SAMPLING, DNA SEQUENCING, AND ALIGNMENT

Specimens of Stigmacerca species were collected from localities throughout their ranges in the Cumberland, Tennessee, Ohio, and Mississippi River systems. Locality information for all specimens used in this study is provided in Table S1. Specimens were anesthetized using MS-222, and tissue samples for DNA extraction were taken from right pectoral fins and catalogued in the Yale Fish Tissue Collection (YFTC). Voucher specimens were fixed in formalin, washed in water, transferred to 70% ethanol, and deposited in the fish collection at the Yale Peabody Museum of Natural History (YPM). Tissue samples of E. forbesi were obtained from Hayden Mattingly, Tennessee Technological University, and E. chienense specimens were obtained from a captive population at Conservation Fisheries, Inc., Knoxville, TN.

The Qiagen DNeasy Tissue kit was used to isolate DNA from tissue biopsies following the manufacturer's protocol. The following genes were amplified using PCR primers and reaction conditions described in previous studies: mitochondrial cytochrome b (cytb) (Near et al. 2000), recombination activating gene 1 exon 3 (RAG1) (Lopez et al. 2004), S7 ribosomal protein intron 1 (S7) (Chow and Hazama 1998), KELCH (Bossu and Near 2009), and 12 other nuclear protein-coding genes published in Li et al. (2007, 2010). Sequences were obtained for the mitochondrial cytb gene from all sampled specimens, and the 15 nuclear genes were sequenced for a subset of approximately four to five individuals of each species (Table S1). The program GARD (available at the website http://www.datamonkey.org) was used to test for the presence of recombination at each locus (Kosakovsky Pond and Frost 2006; Kosakovsky Pond et al. 2006).

Amplified PCR products were purified using a polyethylene glycol precipitation protocol. Cleaned PCR products were used as template for Big Dye (Applied Biosystems, San Francisco, CA) cycle sequencing on an ABI 3100 automated sequencer at the Molecular Systematics and Conservation Genetics Laboratory (Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT). Mitochondrial cytb was sequenced in three segments using primers Glu2, Enig1F (5′-TGCATCCTTTTTCTTTATCTGC-3′), Enig1R (5′-GCAGATAAAGAAAAAGGATGCA-3′), ESQ2F (5′-CGCYACTCTCACTCGATTTTTC-3′), and ESQ1R (5′-AGGCGGTCATTATTACTAATAG-3′). Contiguous sequences were created and edited using Sequencher (GeneCodes, Ann Arbor, MI), and aligned manually using Se-Al version 2.0 (http://tree.bio.ed.ac.uk/software/seal/). Sequences are available at GenBank (http://www.ncbi.nlm.nih.gov/genbank/) under accession numbers JX547016-JX548098. Phylogenetic trees generated in this study are available at DRYAD (http://www.datadryad.org).
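The Enig1F and Enig1R sequences listed above are exact reverse complements of one another, so they prime the same site in opposite directions. A minimal Python sketch of that check follows; the dictionary and function names are illustrative assumptions, and the translation table covers IUPAC ambiguity codes because ESQ2F contains the degenerate base Y (C or T).

```python
# IUPAC complement table: A<->T, C<->G, R<->Y, K<->M, B<->V, D<->H;
# S, W, and N are their own complements.
COMPLEMENT = str.maketrans("ACGTRYKMSWBDHVN", "TGCAYRMKSWVHDBN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (IUPAC-aware)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "Enig1F": "TGCATCCTTTTTCTTTATCTGC",
    "Enig1R": "GCAGATAAAGAAAAAGGATGCA",
    "ESQ2F": "CGCYACTCTCACTCGATTTTTC",
    "ESQ1R": "AGGCGGTCATTATTACTAATAG",
}

if __name__ == "__main__":
    same_site = reverse_complement(PRIMERS["Enig1F"]) == PRIMERS["Enig1R"]
    print("Enig1F/Enig1R are reverse complements:", same_site)
```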

BAYESIAN INFERENCE OF GENE TREES

Phylogenies were estimated from each sampled locus using a partitioned Bayesian strategy executed in the computer program MrBayes 3.1.2 (Ronquist and Huelsenbeck 2003). A single data partition was used for each of the nuclear protein coding genes, and three partitions were used for the mitochondrial cytb gene based on codon position. Models of molecular evolution were chosen that best fit each of the data partitions using the Akaike information criterion (AIC) as implemented in the computer program MrModeltest version 2.3 (Nylander 2004) (Table 1). The optimal partitioning scheme for mitochondrial cytb was selected from the comparison of Bayes factors that were calculated from the log of the harmonic mean of the likelihood values sampled from the posterior distributions of the two compared MrBayes runs (Newton and Raftery 1994; Nylander 2004; Brandley et al. 2005). In addition to individual gene trees for each of the nuclear genes, an analysis was run using MrBayes in which all 15 of the nuclear genes were concatenated and each gene was treated as a separate data partition. Posterior trees and model parameters were sampled from MrBayes runs of 2.0 × 10^7 generations with four simultaneous chains in the analysis of each gene region. Burn-in was set at 2 × 10^6 generations, discarding all trees and parameter values sampled before the burn-in. Stationarity of the chains and convergence of the trees and parameter values were determined by plotting the likelihood score and all other model parameter values against the generation number using the computer program Tracer version 1.5 (Rambaut and Drummond 2003). Convergence of the MrBayes runs was also assessed by monitoring the average standard deviation of the split frequencies between the two independent runs, assuming that stationarity of the chains was achieved when this value was less than 0.005, and by monitoring the potential scale reduction factors between independent runs. Effective sample size (ESS) values for model parameters were assessed to check for adequate mixing of the Markov chain Monte Carlo (MCMC) (ESS exceeded 200 for all model parameters in our analysis).
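The Bayes factor comparison described above rests on the harmonic mean estimator of the marginal likelihood. The sketch below shows one way to compute it in log space from sampled log-likelihoods; the function names are illustrative assumptions, and this is not the code used by MrBayes or by the authors.

```python
import math


def log_harmonic_mean(log_likelihoods):
    """Log of the harmonic mean of likelihoods, computed stably.

    ln HM = ln(n) - logsumexp(-lnL_i), using only post-burn-in samples.
    """
    if not log_likelihoods:
        raise ValueError("need at least one sampled log-likelihood")
    negated = [-value for value in log_likelihoods]
    peak = max(negated)
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in negated))
    return math.log(len(log_likelihoods)) - log_sum


def two_ln_bayes_factor(lnl_scheme_a, lnl_scheme_b):
    """2 ln(BF) favoring scheme A over scheme B when positive."""
    return 2.0 * (log_harmonic_mean(lnl_scheme_a) - log_harmonic_mean(lnl_scheme_b))
```

The harmonic mean estimator is known to be dominated by the lowest sampled likelihoods, so values computed this way are best treated as rough guides.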

Table 1. Summary information for sampled loci.

Gene          Base pairs   Molecular evolutionary model
zic1          743          HKY+I+G
sh3-px3       763          HKY
ube3A         604          HKY+I
rag1          1353         HKY+I
znf503        1232         HKY+I
plag12        759          GTR+I
enc1          742          GTR+I+G
kelch         777          HKY+I
myh6          629          HKY+I
sidkey        1108         GTR+I
glyt          787          HKY+I
ube3A-like    591          GTR+I
tbr1          557          HKY
ptr           769          HKY+I
s7            594          HKY+I+G
cytb          1140         1st codon position: HKY+I
                           2nd codon position: HKY+G
                           3rd codon position: GTR+I+G
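As a rough illustration of how AIC-based model choice of the kind summarized in Table 1 works, the following Python sketch ranks candidate models by AIC = 2k − 2 ln L. The log-likelihood values are hypothetical placeholders, not results from this study, and the parameter counts cover only the substitution model; branch lengths are omitted because they are shared by all models fitted to the same tree and so do not change the ranking.

```python
# Free substitution-model parameters (branch lengths excluded):
# HKY = 1 transition/transversion ratio + 3 free base frequencies;
# GTR = 5 free exchangeabilities + 3 free base frequencies.
BASE_PARAMS = {"HKY": 4, "GTR": 8}


def n_params(model: str) -> int:
    """Count parameters for names such as 'HKY', 'HKY+I', or 'GTR+I+G'."""
    parts = model.split("+")
    extras = sum(1 for part in parts[1:] if part in ("I", "G"))
    return BASE_PARAMS[parts[0]] + extras


def aic(log_likelihood: float, k: int) -> float:
    """Akaike information criterion; lower is better."""
    return 2.0 * k - 2.0 * log_likelihood


def best_by_aic(log_likelihoods: dict) -> str:
    """Return the model name with the smallest AIC."""
    return min(log_likelihoods, key=lambda m: aic(log_likelihoods[m], n_params(m)))


# Hypothetical log-likelihoods for a single locus (placeholders only).
example_lnl = {"HKY": -1520.4, "HKY+I": -1512.9, "GTR+I": -1511.8, "GTR+I+G": -1511.5}

if __name__ == "__main__":
    print(best_by_aic(example_lnl))
```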

INFERENCE OF A TIME-CALIBRATED SPECIES TREE

We inferred a species tree of Stigmacerca species using *BEAST (Heled and Drummond 2010). *BEAST requires the a priori designation of individuals to species categories, and because our mitochondrial cytb gene tree showed significant intraspecific phylogeographic structure in E. nigripinne and nonmonophyly of E. oophylax, we treated each of those groups as separate taxonomic units in this analysis. We used sequence data from all nuclear loci sampled from 56 individuals that comprised four to five individuals per species, including the geographic subclades of E. nigripinne and E. oophylax identified in the mtDNA gene tree. Each gene was treated as a single partition, and we incorporated AIC-identified optimal molecular evolutionary models for each gene (Table 1). The species tree was calibrated using a strategy similar to that used in several investigations of divergence times in darter clades (Near and Benard 2004; Near and Keck 2005; Hollingsworth and Near 2009; Near et al. 2011), and the calibration of the species tree in *BEAST followed the methods outlined in McCormack et al. (2011). Divergence times were estimated using an uncorrelated lognormal (UCLN) model of molecular evolutionary rate heterogeneity (Drummond et al. 2006; Drummond and Rambaut 2007). A birth–death speciation prior was used for the branching rates in the phylogeny. The calibration priors consisted of five centrarchid fossil ages that were identified as producing internally consistent age estimates using a fossil cross-validation analysis (Near et al. 2005a). The fossil age estimates were treated as probability distribution based calibrations, using a lognormal distribution with a zero-point minimal bound reflecting the estimated geological age of the fossil (Ho 2007). 
Previous fossil cross-validation analyses and the temporal bounds of geological chrons and Land Mammal Age intervals associated with the formations bearing the fossils were used to estimate the uncertainty of the zero-point lower bound in the calibration age priors (Woodburne 2004; Near et al. 2005a). Information on the age, taxonomic identity, and specifics of lower and upper bound ages used in the calibration priors is given in Near et al. (2005a) and Hollingsworth and Near (2009). The phylogenetic placement of the taxa represented as fossils in the context of extant species of Centrarchidae reflected that discussed and used in Near et al. (2005a).

We conducted four *BEAST analyses that were each run for 2.0 × 10^8 generations. The first 1.0 × 10^8 generations were discarded as burn-in for each run, and after burn-in, every 8000th tree was saved. Clock models were unlinked across loci, and a birth–death branching model was used. Individual chains were combined using the computer program LogCombiner version 1.53 (http://beast.bio.ed.ac.uk/LogCombiner). We assessed whether parameter values for individual *BEAST runs had reached stationarity and convergence by visually assessing their trace plots in Tracer version 1.5, and whether ESS values for parameters exceeded 200 (Rambaut and Drummond 2003).
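The sampling scheme above implies a fixed number of retained trees, which the following Python sketch works out. It assumes the run length is read as 2.0 × 10^8 generations with the first 1.0 × 10^8 discarded, and that "every 8000th tree" means one tree saved per 8000 post-burn-in generations; the function name is hypothetical.

```python
def retained_samples(total_generations: int, burnin_generations: int, sample_every: int) -> int:
    """Number of post-burn-in samples kept from a single MCMC run."""
    if sample_every <= 0:
        raise ValueError("sample_every must be positive")
    if burnin_generations > total_generations:
        raise ValueError("burn-in cannot exceed the run length")
    return (total_generations - burnin_generations) // sample_every


# Values as read from the text, under the assumptions stated above.
per_run = retained_samples(200_000_000, 100_000_000, 8_000)
combined = 4 * per_run  # four independent runs pooled after burn-in

if __name__ == "__main__":
    print(per_run, combined)
```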

ESTIMATION OF A TIME-CALIBRATED mtDNA GENE TREE

The mitochondrial cytb gene tree was time calibrated using the same centrarchid fossils used in the *BEAST species-tree analysis of the nuclear loci. Divergence times were estimated using a UCLN model of molecular evolutionary rate heterogeneity using the Bayesian method in the computer program BEAST version 1.62 (Drummond et al. 2006; Drummond and Rambaut 2007). The BEAST analysis was run three times, and each run consisted of 8.0 × 10^7 generations for dating of the cytb gene tree. The resulting trees and parameter values from each run were combined using the computer program LogCombiner version 1.61 (http://beast.bio.ed.ac.uk/LogCombiner). Convergence of model parameter values and estimated ages of nodes to optimal posterior distributions was assessed by plotting the marginal posterior probabilities using Tracer version 1.5 (http://beast.bio.ed.ac.uk/Tracer). The posterior probability density of the combined tree files was summarized using the computer program TreeAnnotator version 1.5.3 (http://beast.bio.ed.ac.uk/TreeAnnotator). The mean and 95% highest posterior density (HPD) estimates of divergence times were visualized as a chronogram using the computer program FigTree version 1.2.3 (http://beast.bio.ed.ac.uk/FigTree). In an effort to assess the influence of the calibration priors on the posterior divergence time estimates, BEAST was also run using an empty alignment.
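For readers unfamiliar with the 95% HPD summaries mentioned above, the sketch below computes the shortest interval containing a given fraction of posterior samples. It is a generic illustration with a hypothetical function name, not the algorithm as implemented in TreeAnnotator or Tracer.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Shortest interval containing at least `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one sample")
    # Number of samples the interval must contain; the small offset
    # guards against floating-point noise in mass * n.
    k = max(1, math.ceil(mass * n - 1e-9))
    best = None
    for start in range(n - k + 1):
        low, high = xs[start], xs[start + k - 1]
        if best is None or (high - low) < (best[1] - best[0]):
            best = (low, high)
    return best
```

For a skewed posterior the HPD interval can differ noticeably from an equal-tailed interval, which is one reason node ages are usually reported this way.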

MEASURING GENE FLOW BETWEEN SPECIES

During the course of this study, we encountered what we interpret to be multiple instances of mitochondrial introgression among species of Stigmacerca. There is a dearth of analytical methods available to simultaneously measure hybridization and infer phylogenetic relationships of species, especially for instances where one of the parent species is no longer extant, or where mitochondrial introgression is complete and haplotypes of the recipient species are unsampled (Kubatko 2009). We investigated interspecific gene flow by calculating relative migration rates among species using the computer program IMa2 (Hey and Nielsen 2004, 2007; Hey 2010a). Sequences with multiple heterozygous single-nucleotide polymorphisms were resolved using the program PHASE version 2.1.1 (Stephens et al. 2001). SeqPHASE (Flot 2010) was used to generate input files for PHASE. IMa2 requires a user-specified phylogeny, and we used the topology inferred from the *BEAST species-tree analysis and only included species that were implicated as being involved in mitochondrial introgression: E. neopterum, E. nigripinne, and a population of E. nigripinne with introgressed mtDNA that we treat as a distinct taxon for this analysis. Etheostoma crossopterum was included in the IMa2 analyses to serve as a control, as it is sympatric in some localities with both E. neopterum and E. nigripinne, but analyses of gene trees do not indicate any pattern of hybridization between these species. Gene flow was also tested among E. crossopterum, E. olivaceum, and E. squamiceps, due to observed discordance between the mitochondrial gene tree and the nuclear-inferred species tree.

We performed preliminary IMa2 simulations to assess convergence of the MCMC chains on the stationary distribution. Convergence of the MCMC was assessed by monitoring trend plots for splitting terms; checking mixing properties of the MCMC chains through ESS values; monitoring swapping rates between cold and heated chains; and comparing parameter estimates using genealogies sampled in the first and second halves of each run. For each analysis, burn-in was set to 120,000 steps, and a total of 100,000 genealogies were sampled to estimate joint posterior probability distributions of the migration parameters. Mutation rates for three genes (S7, Rag1, and mitochondrial cytb) were taken from Near et al. (2011) and used in IMa2 to estimate a geometric mean of the mutation rate based on the entire dataset. We applied an exponential prior distribution for migration rates to reflect our prior expectation of lower probabilities for high values of migration (Hey 2010a). We applied the test developed by Nielsen and Wakeley (2001) and implemented in IMa2 to assess the significance of population migration rates between populations of interest. Because models that differ in the number of sampled populations also differ in the numbers of estimated parameters and the levels of correlation among them (Hey 2010b), we repeated analyses for single species pairs that indicated positive but nonsignificant gene flow over the same range of migration rate priors. For all cases, we ran three MCMC analyses with different starting seeds for further verification of convergence and parameter estimates.
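The geometric mean of per-locus mutation rates mentioned above can be sketched as follows. The rate values are arbitrary placeholders chosen only to make the example run; they are not the rates from Near et al. (2011), and the function is an illustration rather than IMa2's internal calculation.

```python
import math


def geometric_mean(rates):
    """Geometric mean of positive values, computed on the log scale."""
    rates = list(rates)
    if not rates or any(rate <= 0 for rate in rates):
        raise ValueError("rates must be a non-empty collection of positive numbers")
    return math.exp(sum(math.log(rate) for rate in rates) / len(rates))


# Placeholder per-locus rates (hypothetical, arbitrary units).
placeholder_rates = [1.0e-9, 4.0e-9, 2.0e-9]

if __name__ == "__main__":
    print(geometric_mean(placeholder_rates))
```

The geometric mean is the natural choice here because rates that differ by orders of magnitude, such as nuclear and mitochondrial rates, are averaged on the log scale rather than letting the fastest locus dominate.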

BAYESIAN ANCESTRAL STATE RECONSTRUCTION OF EGG-MIMICS

Patterns of egg-mimic evolution were reconstructed on both the time-calibrated mitochondrial cytb gene tree and the nuclear multilocus species tree using the MCMC method in the BayesMultistate module of the computer program BayesTraits (Pagel et al. 2004). Species were coded as either having egg-mimics or not. Four species were coded as exhibiting the egg-mimic character: E. oophylax, E. cf. oophylax (Clarks River population), E. neopterum, and E. pseudovulatum. Etheostoma chienense was coded as not exhibiting the egg-mimic phenotype. Rate deviation priors were adjusted to achieve an acceptance rate of approximately 0.20. Analyses were run to allow for adequate mixing and achievement of stationarity, in this case 5,000,000 generations, sampled every 100 generations, with a burn-in of 50,000.
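The binary coding described above can be written down as a small lookup table. In the Python sketch below, the taxon labels and the two-column tab-separated layout are assumptions made for illustration; the exact input format expected by BayesTraits should be checked against its manual. Only the five taxa whose coding is stated explicitly in the text are included.

```python
# 1 = egg-mimics present, 0 = absent, following the coding in the text.
# Taxon labels are hypothetical and would need to match the tree file.
EGG_MIMIC_STATE = {
    "E_oophylax": 1,
    "E_cf_oophylax_Clarks": 1,
    "E_neopterum": 1,
    "E_pseudovulatum": 1,
    "E_chienense": 0,
}


def to_two_column_text(states: dict) -> str:
    """Render taxon/state pairs as tab-separated lines, sorted by taxon."""
    return "\n".join(f"{taxon}\t{state}" for taxon, state in sorted(states.items()))


if __name__ == "__main__":
    print(to_two_column_text(EGG_MIMIC_STATE))
```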

Results

NUCLEAR GENE TREES

Unique patterns of coalescence were observed for each of the 15 nuclear loci examined (Fig. S1). Nuclear protein coding genes exhibited far lower maximum sequence divergence among species of Stigmacerca than observed for the S7 intron 1. The protein-coding genes exhibited between 0.8% and 2.4% maximum sequence divergence, whereas the S7 intron 1 was maximally 4.8% divergent between species. GARD analysis found evidence of recombination only in the S7 intron 1 dataset at position 392. Analyses using S7 intron 1 data were performed only using positions 1–392. There was widespread allele sharing among species, and Glyt was the only locus whose inferred gene tree resolved a monophyletic egg-mimic clade and a reciprocally monophyletic Etheostoma nigripinne. A MrBayes analysis of the concatenated nuclear genes resolved, with strong Bayesian posterior node support, all species as monophyletic, except for E. oophylax and E. nigripinne. A clade containing the egg-mimic species and E. chienense was resolved with strong Bayesian posterior support (Fig. 2A).
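A maximum percent sequence divergence of the kind reported above can be computed from an alignment as the largest pairwise distance. The sketch below uses the uncorrected p-distance, which is an assumption: the text does not state whether divergences were corrected with a substitution model. Function names and the example sequences are hypothetical.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites, skipping gaps and ambiguous bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # ignore gaps, N, and other ambiguity codes
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def max_percent_divergence(alignment: dict) -> float:
    """Largest pairwise p-distance in an alignment, as a percentage."""
    seqs = list(alignment.values())
    return 100.0 * max(
        p_distance(seqs[i], seqs[j])
        for i in range(len(seqs))
        for j in range(i + 1, len(seqs))
    )
```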

Figure 2. Bayesian phylogenies of Stigmacerca inferred from (A) 15 concatenated nuclear genes and (B) mitochondrial cytb. Posterior probability values are indicated next to each node, with values greater than or equal to 0.95 marked with an asterisk. Egg-mimic species are enclosed in a box.

MITOCHONDRIAL GENE TREE

We collected mitochondrial cytb gene sequences for 271 Stigmacerca specimens sampled from all species in the clade. Bayesian phylogenetic analysis of these sequences resulted in a posterior set of trees very similar to the phylogenetic relationships presented in Porterfield et al. (1999), albeit with notable exceptions (Fig. 2B). First, with expanded geographic sampling across the range of E. nigripinne, we discovered significant phylogeographic structuring within this species that appears to comprise four monophyletic groups of populations that correspond to nonoverlapping geographic areas within the Tennessee and Duck River systems (Fig. 2). Three of these populations are: “upper Duck” (including Bigby Creek); “lower Duck and lower Tennessee,” which comprises populations from the lower portion of the Duck River and Tennessee River Drainage upstream to Shoal Creek; and “middle Tennessee,” which comprises populations in the Elk River and upstream tributaries of the Tennessee River Drainage (Fig. 1B). A fourth clade of E. nigripinne in the mtDNA gene tree was more closely related to a clade containing all sampled haplotypes from E. oophylax, E. cf. oophylax (Clarks River), and E. chienense (Fig. 2B). This clade of E. nigripinne comprised populations sampled from eight direct tributaries of the Tennessee River in northern Alabama and southern Tennessee (Fig. 1). Phylogenies inferred using the nuclear genes strongly supported the monophyly of all sampled populations of E. nigripinne, but these populations of E. nigripinne were monophyletic and sister to a clade containing all other sampled E. nigripinne and E. forbesi (Fig. 2A). This inferred phylogenetic relationship supports a hypothesis that these populations of E. nigripinne in northern Alabama and southern Tennessee may represent a distinct species (Boschung and Mayden 2004, p. 546). We interpret these populations of E. nigripinne that have mitochondrial haplotypes that are more closely related to an E. oophylax and E. chienense clade to represent an instance of mitochondrial introgression. Geographic ranges of the populations of E. nigripinne containing the introgressed mtDNA haplotypes and the E. oophylax–E. chienense clade are allopatric and geographically disjunct (Fig. 1).

Etheostoma neopterum, a species with egg-mimic structures, was resolved as the sister of E. nigripinne in the mtDNA-inferred phylogenies presented in Porterfield et al. (1999). Our gene tree with broader taxonomic sampling resulted in a resolution of E. neopterum within E. nigripinne (Fig. 2B). As discussed above, our inferences of phylogeny using the 15 nuclear genes resulted in the resolution of a clade containing the species with egg-mimic structures and E. chienense (Fig. 2A), and the strong Bayesian posterior support in the mitochondrial gene tree for the placement of E. neopterum within E. nigripinne suggests a second instance of mitochondrial introgression among species of Stigmacerca. The geographic range of E. neopterum is relatively small and entirely within the Shoal Creek system, a tributary of the Tennessee River (Fig. 1). In the mtDNA gene tree, E. neopterum mitochondrial haplotypes are sister to a clade containing haplotypes of E. nigripinne sampled from the lower Tennessee and lower Duck rivers (Fig. 2), which is geographically adjacent to Shoal Creek; however, the geographic ranges of both species are currently allopatric (Fig. 1).

SPECIES-TREE INFERENCE

Species-tree analyses performed using *BEAST with all sampled nuclear genes showed support, with lower posterior probability values than seen in the concatenated analysis, for monophyly of the egg-mimic clade that includes E. chienense, and strong support for monophyly of all sampled lineages of E. nigripinne (Fig. 3A). One difference between the *BEAST species tree and the Bayesian phylogeny inferred from the concatenated nuclear gene dataset is the placement of E. olivaceum. A majority of the *BEAST runs for the 15-gene dataset generated a set of posterior trees that showed weak posterior support for E. olivaceum as the sister lineage of all other species of Stigmacerca. Some individual *BEAST runs resulted in a weakly supported clade containing E. olivaceum, E. squamiceps, and E. crossopterum, a result that is strongly supported in the Bayesian phylogeny inferred from the concatenated nuclear genes and in the mitochondrial gene tree.

Figure 3. Chronograms for (A) the species tree inferred using 15 nuclear loci in *BEAST and (B) the mitochondrial cytb gene tree. Posterior ancestral states inferred using BayesTraits are indicated next to each node with pie diagrams. Black represents posterior support for the egg-mimic character state; white indicates posterior support for the non-egg-mimic character state.

DISCOVERY OF A NEW SPECIES, THE CLARKS DARTER, E. cf. OOPHYLAX

An unexpected result from our phylogenetic analyses of mtDNA and nuclear genes involved the phylogenetic relationships of E. chienense and E. oophylax. These species are allopatric, with E. oophylax distributed in the lower Tennessee, from near the confluence of the Duck and Tennessee rivers and extending nearly to the confluence of the Tennessee and Ohio rivers. Etheostoma chienense is distributed in the Bayou de Chien, a small tributary of the Mississippi River (Fig. 1). Clarks River populations of E. oophylax differ meristically from other populations of E. oophylax, and differ from E. chienense in having full development of egg-mimic structures (Page et al. 1992). In the concatenated nuclear gene phylogeny and the mtDNA gene tree, all E. oophylax specimens from the Clarks River, the last major tributary of the Tennessee before its confluence with the Ohio River, were more closely related to E. chienense than to other E. oophylax populations (Fig. 2A, B). Mitochondrial cytb sequence divergence between Clarks River E. oophylax populations and E. chienense was less than 1%, whereas both were approximately 3.5% divergent from the rest of the sampled E. oophylax populations. The species tree inferred from the 15 nuclear genes in *BEAST strongly supported the paraphyly of E. oophylax with high Bayesian posterior probability (Fig. 3A). The morphological differences and the strongly supported concordance in phylogenies inferred from mitochondrial and nuclear genes, which resolve E. oophylax from the Clarks River System and E. chienense as monophyletic, support the recognition of E. cf. oophylax, the Clarks Darter, as a distinct and undescribed species under the phylogenetic species concept (Nixon and Wheeler 1990; Baum and Donoghue 1995). Future reanalysis of this group should consider the taxonomic importance of egg-mimic structures as well as other meristic and genetic data in making species delimitation decisions.
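The divergence percentages quoted above can be illustrated with a minimal sketch of an uncorrected pairwise distance (p-distance) between two aligned sequences. This assumes an uncorrected distance, which the text does not specify; the function name and the toy sequences are hypothetical and are not real cytb data.

```python
def p_distance(seq_a, seq_b, skip_chars="-?N"):
    """Proportion of differing sites among sites where both sequences have a base.

    Assumption: uncorrected distance; the article may have used a different measure.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Skip alignment gaps and ambiguous or missing bases.
        if a in skip_chars or b in skip_chars:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Hypothetical 20-bp aligned fragments, for illustration only.
toy_a = "ATGGCCAGCCTACGAAAAAC"
toy_b = "ATGGCCAGCCTACGAAAAAT"
print(f"{100 * p_distance(toy_a, toy_b):.1f}% divergent")
```

In practice the distance would be computed over the full cytb alignment and summarized across all specimen pairs in the two groups being compared.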

DIVERGENCE TIME ESTIMATES

Our relaxed-clock molecular divergence time analyses using external fossil calibrations in BEAST for the mitochondrial cytb sequence data resulted in posterior age estimates for the most recent common ancestor of all centrarchids that were similar to previously published age estimates for this clade (Near et al. 2005a,b; Hollingsworth and Near 2009). The mean posterior age estimate for the most recent common ancestor (MRCA) of Stigmacerca was 13.1 million years ago, 95% HPD: [9.4, 17.3] (Fig. 3B). The same node was slightly younger in the species-tree dating analysis, 9.7 million years ago, with an HPD [7.8, 11.3] that overlaps the interval from the cytb dating analysis. The estimated ages inferred from the mtDNA dataset suggest that the two mitochondrial introgression events observed in Stigmacerca occurred at different times. Mitochondrial haplotypes sampled from E. neopterum and the lower Tennessee River clade of E. nigripinne share common ancestry that dates to approximately 1.0 million years ago, 95% HPD [0.5, 1.3]. The node that resolved populations of E. nigripinne and E. oophylax–E. chienense as a clade had a mean age estimate of 4.4 million years ago, 95% HPD [3.1, 7.1].

Using the mitochondrial sequence data, the mean estimated divergence time between E. chienense and the Clarks Darter, E. cf. oophylax, was 0.17 million years ago, 95% HPD [0.1, 0.5], whereas the most recent common ancestor of both of these populations and other E. oophylax populations was dated to 1.7 million years ago, 95% HPD [1.1, 3.3].
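As a rough illustration of how the highest posterior density (HPD) intervals reported above are typically obtained, the sketch below finds the narrowest interval containing a given fraction of posterior samples and checks whether two intervals overlap. The function names are hypothetical; the two intervals in the usage lines are the Stigmacerca MRCA values reported above, and no posterior samples from the study are reproduced here.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Return the narrowest interval that contains `mass` of the samples."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    n = len(ordered)
    # Number of samples the interval must contain; the small tolerance
    # guards against floating-point error in mass * n.
    window = max(1, math.ceil(mass * n - 1e-9))
    best_start = 0
    best_width = ordered[window - 1] - ordered[0]
    for start in range(1, n - window + 1):
        width = ordered[start + window - 1] - ordered[start]
        if width < best_width:
            best_start, best_width = start, width
    return ordered[best_start], ordered[best_start + window - 1]


def intervals_overlap(a, b):
    """True if closed intervals a = (lo, hi) and b = (lo, hi) share any point."""
    return a[0] <= b[1] and b[0] <= a[1]


# MRCA of Stigmacerca, in millions of years, as reported in the text.
cytb_hpd = (9.4, 17.3)
species_tree_hpd = (7.8, 11.3)
print(intervals_overlap(cytb_hpd, species_tree_hpd))  # True
```

Overlapping intervals indicate only that the two estimates are not clearly incompatible; they do not by themselves show that the two analyses date the same event.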

ANCESTRAL STATE RECONSTRUCTIONS

The patterns of egg-mimic evolution inferred from Bayesian ancestral state reconstructions were markedly different when performed on the mitochondrial gene tree and on the species tree inferred using the 15 nuclear genes. In the reconstructions using the time-calibrated species tree, the deepest nodes in the phylogeny were reconstructed with near-equal probability for the two states, but the node subtending the common ancestor of the clade containing E. oophylax, E. cf. oophylax, E. neopterum, and E. chienense was reconstructed as an egg-mimic with high posterior probability (Fig. 3A). Across the entire phylogeny, there is a single origin of egg-mimics and a single loss of this trait on the branch leading to E. chienense. The pattern of trait evolution reconstructed on the cytb gene tree is more complicated and reflects the history of mtDNA introgression, with multiple transitions between egg-mimic and non-egg-mimic nodes, as expected given the nonmonophyly of egg-mimic species of Stigmacerca in the mtDNA gene tree (Fig. 3B).
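To make the logic of a binary ancestral state reconstruction concrete, the following sketch computes transition probabilities under a two-state continuous-time Markov model and the posterior state of the ancestor of two tips. It is an illustration only and not the BayesTraits analysis reported here, which samples rates and trees by MCMC. The rate values are hypothetical, the flat prior on the ancestral state is an assumption, and the 0.17 branch length is borrowed from the cytb divergence estimate purely as an example.

```python
import math


def transition_probs(q01, q10, t):
    """2x2 matrix P[i][j] = Pr(state j after time t | state i at time 0).

    q01 is the gain rate (0 -> 1) and q10 is the loss rate (1 -> 0).
    """
    total = q01 + q10
    decay = math.exp(-total * t)
    p01 = q01 / total * (1.0 - decay)
    p10 = q10 / total * (1.0 - decay)
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]


def ancestor_posterior(tip_states, branch_lengths, q01, q10, prior=(0.5, 0.5)):
    """Posterior probability of each state at the common ancestor of the tips."""
    likelihoods = []
    for ancestor in (0, 1):
        like = prior[ancestor]
        for state, t in zip(tip_states, branch_lengths):
            like *= transition_probs(q01, q10, t)[ancestor][state]
        likelihoods.append(like)
    norm = sum(likelihoods)
    return [like / norm for like in likelihoods]


# State 1 = egg-mimic present, state 0 = absent.
# Hypothetical rates; one tip lacks the trait and its sister has it.
posterior = ancestor_posterior(
    tip_states=(0, 1), branch_lengths=(0.17, 0.17), q01=0.1, q10=0.5
)
print(posterior)
```

On a full tree the same calculation is applied recursively from the tips toward the root, which is why the mitochondrial gene tree and the species tree can imply different histories for the same tip states.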

PATTERNS OF GENE FLOW

The IMa2 coalescent analysis detected significant gene flow between populations of E. nigripinne and E. neopterum. Etheostoma nigripinne was treated as two taxonomic units in this analysis: one group that has mtDNA more closely related to E. oophylax, and all other samples of E. nigripinne. Significant levels of gene flow were detected from E. nigripinne to E. neopterum, from E. nigripinne into the population of E. nigripinne that has heterospecific mtDNA, and from E. neopterum into the population of E. nigripinne with heterospecific mtDNA (Table 2 and Fig. 4). The signal of gene flow from E. nigripinne into E. neopterum is driven entirely by the mitochondrial DNA. However, the signal of gene flow from E. neopterum to the E. nigripinne population with heterospecific mtDNA is not driven by the mitochondrial locus, but rather by alleles shared between these two species at six of the 15 sampled nuclear loci. No gene flow was detected between E. crossopterum and either E. nigripinne population or E. neopterum (Table 2). In analyses among E. crossopterum, E. olivaceum, and E. squamiceps, no significant levels of gene flow were detected (results not shown). IMa2 estimates of divergence times among taxa were similar to those estimated by BEAST analyses (Table S2).

Table 2. Population migration rates (2NM) between species or populations estimated using IMa2 analyses. The IMa2 analysis was run for the entire dataset, including all sampled nuclear and mitochondrial sequences. Values that are significant at the P < 0.05 level are denoted by an asterisk.
Population | E. nigripinne | E. nigripinne (northern Alabama–southern Tennessee) | E. neopterum | E. crossopterum
E. nigripinne | - | 0.19* | 0.04* | 0.00
E. nigripinne (northern Alabama–southern Tennessee) | 0.08 | - | 0.04 | 0.01
E. neopterum | 0.00 | 0.08* | - | 0.00
E. crossopterum | 0.00 | 0.00 | 0.00 | -

Figure 4. Estimated marginal posterior density distributions of population migration rates (2NM) with statistically significant likelihood ratio test scores, for populations of Etheostoma nigripinne and E. neopterum. Etheostoma nigripinne was treated as two populations in this analysis: those with heterospecific mitochondrial DNA in southern Tennessee–northern Alabama, and those without heterospecific mtDNA. Curves were generated using 15 nuclear loci and one mitochondrial locus, and exponential migration priors.
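The directional migration estimates in Table 2 can be encoded and queried as in the sketch below. It assumes that rows are source populations and columns are recipients, a reading that matches the directions described in the text but is not stated in the table itself. The variable and function names are hypothetical.

```python
# Table 2 values as (rate, significant at P < 0.05).
# Assumption: key = (source population, recipient population).
NIG = "E. nigripinne"
NIG_AL_TN = "E. nigripinne (northern Alabama-southern Tennessee)"
NEO = "E. neopterum"
CRO = "E. crossopterum"

migration = {
    (NIG, NIG_AL_TN): (0.19, True),
    (NIG, NEO): (0.04, True),
    (NIG, CRO): (0.00, False),
    (NIG_AL_TN, NIG): (0.08, False),
    (NIG_AL_TN, NEO): (0.04, False),
    (NIG_AL_TN, CRO): (0.01, False),
    (NEO, NIG): (0.00, False),
    (NEO, NIG_AL_TN): (0.08, True),
    (NEO, CRO): (0.00, False),
    (CRO, NIG): (0.00, False),
    (CRO, NIG_AL_TN): (0.00, False),
    (CRO, NEO): (0.00, False),
}


def significant_flows(table):
    """Return (source, recipient, rate) for significant entries, largest first."""
    rows = [
        (src, dst, rate)
        for (src, dst), (rate, significant) in table.items()
        if significant
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


for src, dst, rate in significant_flows(migration):
    print(f"{src} -> {dst}: 2NM = {rate:.2f}")
```

Listing the entries this way makes the asymmetry explicit: each significant estimate has a reverse direction that is either zero or not significant.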

Discussion

NUPTIAL TRAIT EVOLUTION

Phylogenetic analyses of nuclear genes consistently supported the monophyly of the egg-mimic bearing Stigmacerca species and E. chienense (Figs. 2A, 3A). Although our conclusion of a single origin of egg-mimic structure is similar to that of previous analyses (Page et al. 1992; Porterfield et al. 1999), the phylogenetic perspective offered by this new molecular dataset is the basis of a revised hypothesis on the evolutionary history of egg-mimic structures in this clade. Previous phylogenetic analyses resolved E. squamiceps and E. chienense as successive sister lineages to a clade containing the other egg-mimic species, suggesting that the swollen tips of the second dorsal fin rays in these two species represented intermediate trait values in an evolutionary progression toward larger and more pronounced egg knobs (Page et al. 1992). In the phylogenetic analysis of the concatenated nuclear gene dataset and the nuclear gene inferred species tree, E. squamiceps is strongly supported as the sister lineage of E. crossopterum, and not closely related to the egg-mimic bearing clade of Stigmacerca species (Figs. 2A, 3A).

The phylogenetic placement of E. chienense nested in the egg-mimic bearing clade as the sister species of E. cf. oophylax, an egg-mimic species, suggests that its fin ray morphology is not an intermediate condition between non-egg-mimic and full egg-mimic, but rather that the loss of egg-mimic structures in E. chienense is a derived character state (Fig. 3A). Previous phylogenetic inferences based on morphology that resolved E. chienense as the sister lineage to all other egg-mimic Stigmacerca species were biased through the ordering of the egg-mimic character state transitions in maximum parsimony analyses (e.g., Page et al. 1992). Character ordering requires sufficient a priori knowledge about the character's evolution, for example, from ontogeny or out-group comparison, and not from the relative complexity of the character (Schuh 2000). When character state transitions were unordered, E. chienense was nested within a clade containing the egg-mimic Stigmacerca species (Page et al. 1992). Our BayesTraits analysis of nuptial trait evolution on the posterior distribution of Stigmacerca species trees strongly supported a single origin and single loss of the second dorsal fin egg-mimics in this clade (Fig. 3A).

SPECIES DISCOVERY

The discovery of the undescribed Clarks Darter, E. cf. oophylax, provides useful insight into the role of meristic and nuptial traits in alpha taxonomy of teleost fishes. Although Page et al. (1992) noted that Clarks River E. oophylax exhibited a higher modal second dorsal fin ray count than all other populations of E. oophylax, they concluded it should remain classified as E. oophylax based on the similarity of strongly developed second dorsal fin egg-mimics in nuptial males and the number of white spots or stripes on the membrane between the dorsal fin rays in nuptial males, called windows. In fact, Clarks River E. oophylax shares the same modal number of second dorsal fin rays as E. chienense, its sister lineage in our molecular phylogenies (Page et al. 1992). Nuptial traits, which are likely under strong sexual selection, may be maintained for long periods or lost rapidly depending on the context of selection pressures (Hansen 1997). Species of Stigmacerca typically use flat rocks in the stream to form a nest cavity; however, a large fraction of E. chienense nests are built on roots and other types of woody debris along undercut banks (Piller and Burr 1999). It is possible that the selection pressure on E. chienense to maintain prominent egg-mimics rapidly diminished upon colonization of the Mississippi alluvial plain, which may provide a different ecological context for the visual signal of egg-mimics than is found in the ranges of E. oophylax and all other egg-mimic Stigmacerca species. This pattern of rapid loss of a prominent nuptial character calls attention to the need for further experimental studies on the relationship between mate choice and male nuptial traits in this clade.

INTROGRESSION AND PHYLOGENY

Introgression of mitochondrial genomes is much more common in darters than previously thought, with heterospecific mitochondrial genomes occurring in up to 12.5% of darter species (Near et al. 2011). Instances of hybridization involving E. crossopterum, E. neopterum, and E. nigripinne have been reported based on assessment of nuptial male morphology and allozyme data, often sampled where two species of Stigmacerca are sympatric (Page et al. 1992; Strange 2000). The phylogenetic trees resulting from analyses of the mtDNA cytb and the nuclear genes strongly suggest two separate instances of historical introgression that both involve E. neopterum and two different lineages of E. nigripinne. The result of this introgression is that at least two species of Stigmacerca, E. neopterum and populations of E. nigripinne from the eastern portion of Shoal Creek and other small direct tributaries of the Tennessee River in northern Alabama and southern Tennessee (Fig. 1), carry mitochondrial genomes of heterospecific origin. Etheostoma neopterum is distributed in portions of the Shoal Creek system in Tennessee and Alabama, and carries a mitochondrial haplotype that appears to have been recently derived from geographically proximal populations of lower Tennessee River E. nigripinne. There are two different lineages of E. nigripinne in the Shoal Creek System that are allopatric, and neither is syntopic with E. neopterum (Fig. 1).

The differences in topology between the mitochondrial gene tree and nuclear-inferred species tree for E. crossopterum, E. olivaceum, and E. squamiceps present another possible example of mitochondrial introgression. Etheostoma olivaceum, which is geographically adjacent to E. crossopterum (Fig. 1A), is either sister to an E. crossopterum–E. squamiceps clade or sister to the entire Stigmacerca clade in the species tree (neither topology is strongly supported in the species-tree analysis), but is strongly supported as sister to E. crossopterum in the mitochondrial gene tree (Figs. 2, 3). Our analyses did not find any statistically significant signal for gene flow among any combination of these three species. The IMa2 analysis would not be expected to detect gene flow at the mitochondrial locus because the sampled mitochondrial genomes of these species are reciprocally monophyletic. With no signal of gene flow from nuclear loci, the differences in topology between mitochondrial and nuclear genes represent either another example of complete mitochondrial replacement with no nuclear introgression or an example of gene tree discordance.

The lineage of E. nigripinne found in eastern Shoal Creek and small Tennessee River tributaries in northern Alabama and southern Tennessee carries mtDNA haplotypes that are phylogenetically related to those observed in egg-mimic species (Fig. 2B). Based on the phylogeny of Stigmacerca inferred in both the nuclear gene concatenated and species trees (Figs. 2, 3), the origin of this introgressed set of mtDNA haplotypes may be E. neopterum, which itself is now fixed for a heterospecific mtDNA haplotype that originated from the lower Tennessee E. nigripinne lineage (Fig. 2B). Isolation with migration analyses detected gene flow of nuclear loci between E. neopterum and the lineage of E. nigripinne with E. neopterum mtDNA (significant in the direction of E. neopterum into these populations of E. nigripinne). The introgression of nuclear genes between these two species was also evident in examination of the individual nuclear gene trees. This is the first observation of historical introgression of nuclear alleles in darters.

These instances of introgression among species of Stigmacerca provide insight into the potential importance of egg-mimic structures in mate recognition. The E. nigripinne lineage in northern Alabama and southern Tennessee that carries mtDNA haplotypes originating from E. neopterum is geographically disjunct from any species of the egg-mimic clade (Fig. 1). In relation to the egg-mimics, the mtDNA introgression has been bidirectional between egg-mimic-bearing species and lineages lacking egg-mimics. This suggests that the presence of egg-mimics does not necessarily enhance prezygotic reproductive isolation between species of Stigmacerca. Males of Stigmacerca species undergo a dramatic transformation during the breeding season, with the development of a swollen head, darkened pigment patterns on the fins and body, and, for several species, egg-mimic protuberances on the second dorsal fin (Page 1974). Although hypothesized to serve as a false signal of fitness to females during courtship displays, the exact role of the egg-mimic structures is unknown, and they are perhaps only one signal in a larger suite of nuptial characteristics, including body pigment, courtship behavior, acoustic communication (Johnston and Johnson 2000), and choice of nesting territory (Bandoli 1997), that likely play a role in female mate choice. These species-specific nuptial characteristics likely diverge among species while in allopatry, and the role they play in maintaining species boundaries after secondary contact is unknown.

Current analytical methods do not allow for statistical identification of the timing of introgression events between species (Sousa et al. 2011; Strasburg and Rieseberg 2011). However, because all populations of Stigmacerca that exhibit introgression are currently allopatric (Fig. 1), it is clear that the observed gene flow between populations was historical and does not represent ongoing contemporaneous gene flow between species. Consistent with other observations of cytoplasmic genome introgression among a diverse array of plant and animal lineages (e.g., Chan and Levin 2005), as well as other instances of mtDNA introgression in darters (e.g., Bossu and Near 2009; Keck and Near 2010), there was no detectable gene flow of nuclear alleles from E. nigripinne into E. neopterum (Table 2). The relaxed-clock estimates of divergence times using mtDNA data indicated that the introgression of the mtDNA genome from the lower Tennessee River E. nigripinne lineage into E. neopterum occurred no more than approximately 1.1 million years ago, the date of the node subtending E. neopterum and the lower Tennessee and upper Duck River clades (Fig. 3B). Darter specimens of apparent hybrid origin are not uncommon in ichthyological collections (Keck and Near 2009), which is reflected in a high rate of mitochondrial introgression and replacement in darters (Bossu and Near 2009; Keck and Near 2010; Near et al. 2011). Despite the frequency with which hybrid specimens and introgressed mitochondrial genomes are encountered in darters, the historical introgression of nuclear alleles from E. neopterum into populations of E. nigripinne in northern Alabama and southern Tennessee represents the first example of nonmitochondrial introgression observed in species that are currently allopatric.

Conclusions

We find strong support for a single evolutionary origin of egg-mimic nuptial structures in Stigmacerca with a single loss, or reduction, in a clade of egg-mimic bearing species. By using a coalescent-based species-tree method, we were able to reconcile the differences between incongruent, yet strongly supported relationships in mitochondrial and nuclear gene trees. Although mitochondrial introgression has impeded previous efforts to determine the evolutionary relationships of Stigmacerca species and the evolutionary origins of egg-mimic structures, we believe that the history of hybridization that is recorded by the patterns of mtDNA haplotype and nuclear allele introgression provides evidence of gene flow between species of Stigmacerca and the potential role that nuptial structures play in courtship and prezygotic isolating barriers between species. Moreover, the phenomena of mitochondrial and nuclear introgression are decoupled, as introgression of nuclear alleles was detected in only one of the two mitochondrial introgression events. The discovery of unrecognized species diversity within Stigmacerca, for example, the Clarks Darter and appreciable phylogeographic structuring among lineages of E. nigripinne, highlights the need to continue efforts for species delimitation using combinations of molecular and phenotypic data (e.g., Harrington and Near 2012). Although our analyses do provide clarity to the pattern of nuptial trait evolution in spottail darters, these results draw attention to the need for more detailed study of the mechanistic role that egg-mimic structures play in the courtship of these species, particularly in regard to sexual selection and the potential role of environmental context of female mate choice in maintaining these signals.


Associate Editor: L. Kubatko

ACKNOWLEDGMENTS

We thank C. M. Bossu, G. S. Bradburd, J. G. Colosi, J. G. Garner, J. Glass, R. M. Harrington, P. R. Hollingsworth, B. P. Keck, and G. Watkins-Colwell for assistance with specimen collection. G. Watkins-Colwell provided assistance with museum collections. This work was supported by the National Science Foundation [grant numbers DEB-0716155 and DEB-1011328] and the Peabody Museum of Natural History, Yale University.
