Volume 26, Issue 1 pp. 277-290
Special Issue: The Molecular Mechanisms of Adaptation and Speciation: Integrating Genomic and Molecular Approaches

Molecular mechanisms of adaptation and speciation: why do we need an integrative approach?

Kelsey J. R. P. Byers

Department of Systematic and Evolutionary Botany, University of Zurich, Zollikerstrasse 107, CH-8008 Zurich, Switzerland

Shuqing Xu

Corresponding Author

Max Planck Institute for Chemical Ecology, Hans-Knöll-Straße 8, D-07745 Jena, Germany

Correspondence: Philipp Schlüter, Fax: +41 446348403; E-mail: [email protected], or Shuqing Xu, Fax: +49 3641571102; E-mail: [email protected]

Philipp M. Schlüter

Corresponding Author

Department of Systematic and Evolutionary Botany, University of Zurich, Zollikerstrasse 107, CH-8008 Zurich, Switzerland

First published: 27 May 2016

Abstract

Understanding divergent adaptation and ecological speciation requires the synthesis of multiple approaches, including phenotypic characterization, genetics and genomics, realistic assessment of fitness and population genetic modelling. Current research in this field often approaches this problem from one of two directions: either a mechanistic approach—seeking to link phenotype, genotype and fitness, or a genomic approach—searching for signatures of divergence or selection across the genome. In most cases, these two approaches are not synthesized, and as a result, our understanding is incomplete. We argue that research in adaptation and evolutionary genetics needs to integrate these approaches for multiple reasons, including progress towards understanding the architecture and evolutionary history of adaptation and speciation loci, the ability to untangle linkage and pleiotropy, increased knowledge of mechanisms of genomic evolution and insights into parallel evolutionary events. Identifying the genetic underpinnings of adaptation and ecological speciation is not necessarily the end goal of research, but it is an integral part of understanding the evolutionary process. As a result, it is critical to utilize both genetic and genomic approaches. Challenges remain, particularly in nonmodel organisms and in our ability to synthesize results from multiple experimental systems. Nonetheless, advances in genetic and genomic techniques are increasingly available in a diverse array of systems, and the time is ripe to exploit the synthesis of these two approaches to increase our understanding of evolution.

Introduction

The concept of ecological speciation intimately links divergent selection and adaptation to the origin of different species. Divergent adaptation and ecological speciation (hereafter A&S) are the consequences of divergent selection imposed by the local environment, population demography and genetic architecture of traits that affect organismal fitness (Schluter 2001; Nosil 2012). Although much progress has been made in each of these areas, particularly due to recent sequencing advances, fewer attempts have been made to integrate these areas in one study system. This can be seen from the often divided A&S research programmes: (i) study of the molecular mechanisms of A&S, which focuses on identifying genetic and biochemical mechanisms underlying adaptation; and (ii) ecological speciation/adaptation genomics, which focuses on how natural selection drives divergence of specific genomic loci. While both molecular adaptation and speciation genomics approaches have provided valuable insight into the genetics underlying A&S, these two approaches are rarely combined: molecular adaptation studies of targeted loci are rarely expanded to a genome-wide and population scale, and genome scans for signatures of selection and divergence are rarely followed up by in-depth molecular work. Below, we argue that these two research programmes have their own strengths and limitations, but are complementary to each other in studying A&S. Therefore, we propose that integration of these two research areas is needed for a more complete and unbiased understanding of the processes of A&S. In addition, we identify the key challenges and opportunities that will arise for integrating approaches to the study of molecular and genomic A&S (Fig. 1).

Current research foci and challenges in studying adaptation and speciation using molecular (magenta arrows) and genomic (green arrows) approaches. Light red boxes highlight the challenges in these research areas, as follows: Challenge 1, selection on traits might be dynamic and dependent on time; Challenge 2, the fitness effects of individual genes might be dependent on genomic background, such as epistatic interactions among genes; Challenge 3, different genes associated with traits that affect fitness might have different fitness effects, and these effects might additionally vary with environmental conditions; Challenge 4, candidate loci identified using adaptation/speciation genomics might be spurious, with significant outliers resulting from demographic changes, false positives or selective sweeps, rather than selection on the locus itself; and Challenge 5, even in model organisms (e.g. Arabidopsis thaliana, Drosophila melanogaster or Mus musculus), the function of a large proportion of genes remains unknown. Two remaining challenges of an integrative approach—Challenge 6, synthesizing results from multiple systems into a conceptual whole; and Challenge 7, combining evidence from molecular and genomic approaches—must also be addressed. When addressing these challenges, it will be important to keep several points in mind: first, the importance of the environment is paramount, and in natura approaches should be used where possible; second, in some cases, it may be possible to harness natural variation, particularly in less tractable systems; and third, it will remain critically important to functionally validate any candidate genes resulting from these approaches. The bottom circle depicts the integration of molecular and genomic approaches to studying adaptation and speciation, with some suggested future integrative approaches (blue dashed arrows). 
By linking natural selection, genome evolution, functional traits and gene and pathway function, we can achieve an integrated research programme.

Molecular adaptation and speciation genomics

The modern synthesis, in uniting genetics and evolution, suggested that evolutionary processes could have a known genetic basis and provided an impetus to identify the molecular basis of A&S (Kutschera & Niklas 2004). Today, the molecular study of adaptation ranges more broadly, including diverse molecular mechanisms of evolution at multiple levels. We consider this molecular study to include not just genes, but also transcriptional regulation of gene networks, biochemical pathways, the role of small RNAs and mechanisms by which environmental input is processed. Importantly, mechanistic study does not stop once ‘a gene’ has been found: we would like to know the effects, origins and interactions of different alleles and mutations (Barrett & Hoekstra 2011). In seeking to understand molecular adaptation, we strive to understand the interaction of genetics and evolution in both directions: first, what is the genetic basis underlying an organism's adaptation to its environment, and second, how do ecological and evolutionary forces act upon the organism's genetic information to result in adaptation?

Our approach to understanding molecular adaptation often consists of the following steps: (i) identifying adaptive traits which increase organismic fitness in various environments; (ii) identifying the genetic bases of these adaptive traits, typically using genetic mapping or a candidate gene approach; and (iii) measuring the fitness effects of variation in the identified candidate genes, alleles or mutations. This forward genetics approach, linking genotype–phenotype–fitness, represents the gold standard for studying molecular mechanisms of A&S, and tremendous progress is being made in applying this approach in both model and nonmodel systems (Barrett & Hoekstra 2011; Watt 2013).

Despite clear advantages of this approach, there are several limitations in each individual step and their integration. First, identifying truly adaptive traits remains challenging. As human beings, we are biased towards studying traits accessible to human senses (e.g. flower or coat colour), while neglecting those that are invisible or inaccessible to us, but may be more appealing to selective agents (e.g. UV patterning in Petunia flowers, Sheehan et al. 2016; or surface hydrocarbons on orchid flowers, Xu et al. 2012a; both attractive to insect pollinators). Additionally, natural selection on traits is often a dynamic process (Siepielski et al. 2009), thus selection coefficient estimation requires multiyear measurements and may not represent the true selective regime that shaped the evolution of the traits in the past. The effects of a trait on fitness may vary at different life-history stages and can be difficult to measure in many systems (e.g. survivorship in long-lived organisms) (Fig. 1: Challenge 1). Second, identifying the genetic architecture of adaptive traits using a mapping-based approach can be biased. The effect of genes on phenotypic traits can be highly dependent on genetic, genomic and environmental background (Beavis 1998; Otto & Jones 2000) (Fig. 1: Challenge 2), and thus, the identified candidate genes associated with an adaptive trait might be dependent on the choice of mapping population and environment. Mapping with artificial populations has the advantage of tractability, but in the forward genetics approach, mapping techniques are often applied outside the natural environmental context, leading to potential bias in applying mapping results to ecological functions. Additionally, these experiments are rarely followed up for evidence of selection on identified candidates. 
Gene functions and selection within the present environment and time may not reflect historical selection, gene function and previous fluctuating genomic and environmental context, and this context may be impossible to determine (Fig. 1: Challenge 1). As a result, it may be more useful to study incipient speciation events where lineage separation is not yet complete (Via 2009; Stankowski & Streisfeld 2015). In many cases, determining when adaptation occurred is itself a challenging task (but see Danley & Kocher 2001). Third, quantitative measurement of the fitness effects of variation in candidate genes is both challenging and time-consuming. In the case of speciation, understanding the direct effects of candidate genes on gene flow is important (Hendry & Taylor 2004; Morjan & Rieseberg 2004), but this can be difficult to quantify (Fig. 1: Challenge 3). It is also possible that selection may act on genes that have small phenotypic but large fitness effects, rather than more easily detectable genes with large phenotypic but small fitness effects. Finally, when integrating all information in one system, the same natural populations or those with a similar population make-up should be used for all three steps, as the fitness effects of traits and genes are highly dependent on their genetic and environmental background (Barrett & Hoekstra 2011); this can be especially challenging for biologically interesting nonmodel organisms.

In contrast to the forward genetics approach, the genomic adaptation approach (‘population genomics’ sensu Stinchcombe & Hoekstra 2008) aims to directly identify putative adaptive loci in the genome by searching for molecular signatures of selection between populations or closely related species (akin to a reverse genetics approach). Rather than considering the molecular basis of adaptation in just the context of one or a few genes or pathways, an ecological speciation/adaptation genomics approach allows us to query the genomic landscape of these processes; this process is aided by recent advances in ‘omics’ approaches, including the relative ease of sequencing and assembling genomes or transcriptomes of different populations and the ability to scan the diverse array of proteins and metabolites produced in specific conditions. The advantage of this approach is the ability to directly search for end products (or footprints) resulting from the A&S process, and it also provides an unbiased way of detecting loci that are under selection. For example, many genomic loci might be under selection due to specific environmental conditions or agents of selection that cannot be measured or replicated in current experimental set-ups (MacColl 2011), thus will likely not be identified through the forward genetics phenotype–genotype–fitness approach. Insights into historical population phenomena such as admixture and assortative mating are also possible with a genome-wide approach (Lexer et al. 2010). Furthermore, this approach also allows us to detect change above the gene level—for example recombination or inversion—which may be the main genetic basis of evolution in a variety of systems, as some authors have suggested (Coyne & Orr 2004; Fishman et al. 2013).

Currently, searching for loci under selection is often carried out by scanning the genome for outliers using Wright's fixation index (FST, Wright 1951; Beaumont 2005), but other metrics exist. These include Tajima's D (Tajima 1989), the Hudson–Kreitman–Aguadé (HKA) test (Hudson et al. 1987) or dN/dS (ω) ratios (e.g. Torrents et al. 2003), each of which comes with certain assumptions and potential pitfalls (Charlesworth 1998; Hughes 2007; Nei et al. 2010; Cruickshank & Hahn 2014). To address some of these issues, other indices have recently been developed, often based on FST (Meirmans & Hedrick 2011; Pannell & Fields 2014). These analyses often identify putative ‘islands’ or ‘continents’ of speciation, the size and number of which can be informative in understanding the process of speciation in a given system (Feder et al. 2012).
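
As a minimal illustration of the outlier logic described above, the following Python sketch computes a heterozygosity-based FST, FST = (HT - HS)/HT, for biallelic sites sampled in two populations, combines sites in sliding windows and flags the highest-ranking windows. The function names, window settings and allele frequencies are hypothetical and chosen for illustration only; published scans typically rely on sample-size-corrected estimators and explicit null models, which this sketch omits.

```python
def heterozygosity(p):
    """Expected heterozygosity 2p(1-p) at a biallelic site."""
    return 2.0 * p * (1.0 - p)


def fst_components(p1, p2):
    """Return (HT - HS, HT) for one site, weighting both populations equally."""
    p_bar = (p1 + p2) / 2.0
    h_t = heterozygosity(p_bar)                               # pooled
    h_s = (heterozygosity(p1) + heterozygosity(p2)) / 2.0     # within-population mean
    return h_t - h_s, h_t


def windowed_fst(freq_pairs, size, step):
    """Windowed FST as a ratio of sums, so monomorphic sites add nothing."""
    windows = []
    for start in range(0, len(freq_pairs) - size + 1, step):
        num = den = 0.0
        for p1, p2 in freq_pairs[start:start + size]:
            n, d = fst_components(p1, p2)
            num += n
            den += d
        windows.append((start, num / den if den > 0 else 0.0))
    return windows


def outlier_windows(windows, top_fraction=0.05):
    """Flag the top-ranking windows; the cut-off is purely empirical."""
    ranked = sorted(windows, key=lambda w: w[1], reverse=True)
    k = max(1, round(len(ranked) * top_fraction))
    return sorted(ranked[:k])


# Hypothetical allele frequencies (population 1, population 2) along a chromosome.
example = [(0.5, 0.5)] * 4 + [(0.9, 0.1)] * 2 + [(0.5, 0.5)] * 4
scan = windowed_fst(example, size=2, step=2)
flagged = outlier_windows(scan, top_fraction=0.2)
print(flagged)
```

Because the cut-off is defined by rank, some windows are always flagged, whether or not selection has acted; this is one reason the assumptions and pitfalls cited above matter when interpreting such scans.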

However, the advantage of directly searching for end products of A&S at the same time also leads to several limitations. In addition to selection, several other factors can also result in similar statistical patterns at genomic loci, such as population demographic changes, selective sweeps and divergence hitchhiking (Gompert et al. 2012; Via 2012) (Fig. 1: Challenge 4). Although optimized experimental design and different advanced population genomic modelling approaches have been developed to disentangle these different confounding effects (Cutter & Payseur 2013), it remains extremely challenging to identify genomic loci that truly contributed to local adaptation based only on population genomic studies (Flaxman et al. 2013), especially among genes located in a recent selective sweep region. Furthermore, many genomic loci that show signatures of selection may not have any known or substantiated functional annotations, especially in nonmodel organisms (Fig. 1: Challenge 5). This prevents further study of why and how those loci contribute to the A&S process, which is often the primary goal of evolutionary studies.
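
The confounding effect of demography (Fig. 1: Challenge 4) can be illustrated with a small simulation. The Python sketch below lets two populations drift independently from the same starting allele frequencies under a simple Wright-Fisher model with no selection at any locus, and then applies a rank-based outlier cut-off. All parameter values (number of loci, population size, number of generations, the 1% cut-off) are arbitrary assumptions made for illustration, and the model ignores migration, linkage and mutation.

```python
import random


def drift(p0, n_alleles, generations, rng):
    """Neutral Wright-Fisher drift: resample n_alleles gene copies each generation."""
    p = p0
    for _ in range(generations):
        copies = sum(1 for _ in range(n_alleles) if rng.random() < p)
        p = copies / n_alleles
    return p


def site_fst(p1, p2):
    """Heterozygosity-based FST = (HT - HS) / HT for two equally weighted populations."""
    p_bar = (p1 + p2) / 2.0
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    if h_t == 0:
        return 0.0  # same allele fixed in both populations
    h_s = p1 * (1.0 - p1) + p2 * (1.0 - p2)  # mean of 2p(1-p) over two populations
    return (h_t - h_s) / h_t


def neutral_scan(n_loci=500, n_alleles=100, generations=20, seed=1):
    """FST per locus after two populations diverge by drift alone."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_loci):
        p0 = rng.uniform(0.2, 0.8)  # shared ancestral frequency
        p1 = drift(p0, n_alleles, generations, rng)
        p2 = drift(p0, n_alleles, generations, rng)
        values.append(site_fst(p1, p2))
    return values


def top_outliers(values, fraction=0.01):
    """Return the highest FST values under a rank-based cut-off."""
    k = max(1, round(len(values) * fraction))
    return sorted(values, reverse=True)[:k]


values = neutral_scan()
print(len(top_outliers(values)), "loci flagged although none is under selection")
```

Every locus flagged here is a false positive by construction, which is why outliers from a genome scan are best treated as candidates for functional follow-up rather than as demonstrated targets of selection.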

Benefits of an integrative approach

These molecular and genomic approaches are complementary to each other, and integrating them in one system will provide a more complete understanding of A&S processes as well as offering context to these approaches. The idea of combining molecular and genomic approaches has been proposed previously (Stinchcombe & Hoekstra 2008), but research doing so has so far been limited to only a few systems. For much of the research on the molecular basis of adaptation, which is often limited to a single-gene context, adding an adaptation genomic view will provide greater mechanistic understanding of how and why a given gene was (or was not) under selection (Box 1A). For example, a gene identified via linkage or association mapping may interact in a regulatory or biosynthetic network with other genes in the genome or may share common regulatory elements with linked loci—in other words, the genomic context may be critical to understanding the gene's function (Dewey 2011). Such linked genes may contribute to the process of A&S and will be missed with a focused molecular approach, whereas a selective sweep may be picked up with a genome-wide scan. Furthermore, combining molecular adaptation and genomics may facilitate rapid identification of causal mutations that contribute to adaptation. For example, Smadja et al. (2012) performed FST and πas tests on a set of 153 candidate genes potentially underlying pea aphid speciation and found 19 genes showing increased divergence across three populations; a larger set of genes and aphid populations has confirmed and expanded on this work (Eyres et al. 2016 in this issue). Functionally testing each individual gene is costly and time-consuming, yet it is only by assessing the function of the most promising individual genes that we can make firm conclusions about their contribution to adaptation and the physical properties that favoured specific variants (Harms & Thornton 2013). 
As a result, we encourage researchers and funding agencies to allocate resources to functional testing of highly significant candidate genes. In an ideal world, this would consist of functional testing within the system in question—for example via transgenesis or targeted mutation, or less precisely via high-resolution near-isogenic lines, targeting either coding or regulatory candidate sequences. Ideally, this would be performed in both directions to verify that candidate alleles are both necessary and sufficient to produce the phenotypic and selective effects. Currently, some questions are amenable to functional testing via techniques such as heterologous expression, although this carries some risks in interpretation (Kramer 2015). Other, detailed, molecular techniques can provide supporting evidence, but by themselves are inadequate to verify a candidate gene's function in vivo. With any functional test, measuring the effects on a variety of phenotypes—not just the trait in question—will be critical to assessing the full effect of a specific allele on the organism. Integrating an adaptation genomic approach may help to rapidly narrow down the focus to a small set of key genes that likely were targets of selection. Moreover, A&S genomics can also provide further insights on population demographic change, which is an important factor that influences the process of A&S (Städler et al. 2005).

Box 1A. Applying a genomic view to existing molecular studies of adaptation

Wing colour patterning in Heliconius butterflies

Butterflies in the genus Heliconius show extensive Müllerian mimicry in wing patterning, particularly between the distantly related H. erato and H. melpomene, which have multiple shared wing pattern variants (Kronforst & Papa 2015). Purifying selection within species for the presence of red on the forewing has been demonstrated, as has selection against foreign morphs, demonstrating the importance of wing patterning (Kronforst & Papa 2015). Starting from a genomic region identified in previous mapping experiments, Reed et al. (2011) used comparative transcriptomics to identify optix, a homeobox transcription factor, as the candidate gene underlying the red colour patterns in multiple Heliconius species. Hybrid butterflies showed strong association between optix sequence variation and colour pattern (Reed et al. 2011); within H. melpomene, colour races demonstrated elevated FST in the genomic region containing optix (Nadeau et al. 2012). Comparative developmental evidence suggests that the co-option of optix as a driver of colour patterning is a recent novelty within the Heliconiini (Martin et al. 2014).

Reproductive isolation between Mimulus lewisii and M. cardinalis (monkeyflowers)

In sympatry, 97% of the reproductive isolation between these two species is mediated by their animal pollinators (Ramsey et al. 2003). Quantitative trait loci for traits influencing pollinator behaviour (e.g. anthocyanin and carotenoid pigmentation, Bradshaw et al. 1995, 1998; Bradshaw & Schemske 2003; and floral scent, Byers et al. 2014) colocalize with those affecting pollinator efficiency (e.g. stamen and style length, Bradshaw et al. 1995, 1998; Fishman et al. 2013; and corolla length, Fishman et al. 2013) and pollen viability (Fishman et al. 2013). These clusters of loci affecting reproductive isolation between the two species are located within putatively rearranged regions of the genome (Fishman et al. 2013). Although it is unclear whether genome evolution processes initially drove reproductive isolation in this system, recombination suppression has served to maintain suites of traits maintaining isolation in sympatry (Box Fig.1).

Mimulus lewisii (top) and M. cardinalis (bottom), sister species reproductively isolated largely by pollinator preference for divergent floral traits. Genes and QTL underlying these floral traits are clustered in areas of recombination suppression, maintaining distinct floral phenotypes. All images by Kelsey Byers.

Reproductive isolation between Petunia axillaris and P. exserta

The narrow endemic Petunia exserta hybridizes with the closely related P. axillaris due to pollen transfer by hummingbirds (Lorenz-Lemke et al. 2006). Several loci underlying traits responsible for differences in pollinator behaviour and efficiency have been identified in multiple Petunia species, for example scent (Klahre et al. 2011), anthocyanins (Hoballah et al. 2007) and UV absorbance (Sheehan et al. 2016), corolla area and tube length (Venail et al. 2010), and pistil and stamen length (Hermann et al. 2013). At least three loci representing five of these traits have been found to be in tight linkage in a region with potentially suppressed recombination between these two species (Hermann et al. 2013). As with Mimulus, this suggests a role for genome evolution underlying reproductive isolation between these two species, although the processes driving genome evolution are as yet unclear in this system. Further characterization of these loci and of any role of genomic evolution is needed; in particular, as with Mimulus, a genome-wide scan for signatures of selection by pollinators in the field would be a valuable addition to this system.

Similarly, ecological speciation genomics benefits from incorporating a more focused molecular view. Many current analyses of speciation genomics stop at the genome scan or annotation level, which may result in markers potentially linked to A&S, but not the actual genetic underpinnings of these processes or the identification of specific candidate speciation genes (if they exist) (Box 1B). In our opinion, there are a number of new insights that can be gained from identifying the molecular functions of candidate genes associated with A&S processes. (i) It can help to rule out false positives. Genome-wide scans often result in a large number of false positives, which prevents direct conclusions from being drawn (Lotterhos & Whitlock 2014). In particular, tightly linked genes may have similar signatures of differentiation, even if only one was the actual target of selection. Identifying the molecular function of these tightly linked genes and testing their fitness effects will help to identify the true targets of selection. By narrowing our focus down from the genomic region to individual genes and regulatory elements, it becomes easier to see whether the region likely contains legitimate targets of selection, rather than trusting the result of genomewide scans alone or choosing a ‘likely looking’ candidate gene and failing to verify its function. (ii) It will allow us to understand whether complex evolutionary scenarios are due to the linkage of several necessary functions (e.g. via linkage disequilibrium) or due to pleiotropy of a single gene (Wagner & Zhang 2011; Paaby & Rockman 2013). (iii) It can help us to understand the most significant driving forces that have shaped recent A&S processes. Loci showing signatures of selection often either have unknown functions or are only functionally characterized in model organisms. 
Identifying the exact molecular functions and the traits that the target gene is associated with can provide insight on which traits were targets of selection and what the key driving forces were. (iv) It will allow us to identify and analyse parallel evolution events (Colosimo et al. 2005; Rausher 2008; Manceau et al. 2010; Streisfeld & Rausher 2011; Conte et al. 2012). (v) It will provide new insights into the trajectory of A&S, which may often be associated with the likelihood and direction of trait transitions. This can be seen from the fact that trait transitions that involve gene or pathway degeneration caused by accumulation of loss-of-function mutations are likely to exist in a one-way direction only (e.g. Rausher 2008; Streisfeld & Rausher 2011). Therefore, identifying causal genes and mutations will allow modelling of the direction of A&S events (e.g. Weinreich et al. 2006; Franke et al. 2011; Xu & Schlüter 2015). (vi) It will allow us to identify other factors, such as genome evolution events (e.g. gene conversion, whole genome duplication and tandem duplications) that may influence the process of A&S (Volff 2005). (vii) It will allow us to more broadly understand the molecular nature of speciation loci, for example whether duplication allows evolutionary innovations, what effect sizes are necessary to promote the speciation process or whether certain gene classes are over- or underrepresented in the speciation process. Until we understand the molecular genetics behind A&S, it will be difficult to draw conclusions in these areas. Although the identification of the specific molecular basis of A&S is not necessarily the end goal in any given system (Rockman 2012), a complete understanding of the process of A&S requires that we understand its genetic underpinnings, and we argue that this requires an integrative approach.
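
Points (i) and (ii) above both turn on linkage disequilibrium between neighbouring loci. As a minimal sketch of how this is commonly quantified, the Python function below computes the disequilibrium coefficient D and the squared correlation r2 between two biallelic loci from haplotype frequencies. The function name, the haplotype labels and the example frequencies are hypothetical; real data would additionally require phasing and sampling-error estimates.

```python
def linkage_disequilibrium(hap_freqs):
    """Return (D, r2) for two biallelic loci.

    hap_freqs maps the four haplotypes 'AB', 'Ab', 'aB', 'ab' to their
    frequencies, which are assumed to sum to 1.
    """
    p_a = hap_freqs["AB"] + hap_freqs["Ab"]   # frequency of allele A
    p_b = hap_freqs["AB"] + hap_freqs["aB"]   # frequency of allele B
    d = hap_freqs["AB"] - p_a * p_b           # deviation from random association
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0  # undefined if either locus is fixed
    return d, r2


# Two hypothetical extremes: complete association and random association.
complete = {"AB": 0.5, "Ab": 0.0, "aB": 0.0, "ab": 0.5}
independent = {"AB": 0.25, "Ab": 0.25, "aB": 0.25, "ab": 0.25}
print(linkage_disequilibrium(complete))
print(linkage_disequilibrium(independent))
```

When r2 between a selected site and its neighbours is close to 1, divergence statistics alone cannot distinguish the target of selection from hitchhiking loci, which is where functional tests of the individual genes become informative.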

Box 1B. Insights into genome-wide approaches from existing orthologous gene annotations

Cryptic coloration in Timema cristinae stick insects

Two host plant races of T. cristinae differ in their pigmentation patterning, which is under divergent selection, yet nonetheless maintain gene flow (Nosil et al. 2002; Nosil 2007). A third, melanistic morph shows mating advantages and increased pathogen resistance and can maintain crypsis in stem microhabitats (Box Fig.2, central panel); modelling shows these melanistic individuals may be maintaining gene flow between the two races (Comeault et al. 2015). Genome-wide association mapping found a single locus responsible for 95% of the phenotypic variation in melanism; the most significant SNPs mapped to a putative cysteinyl-tRNA synthetase, an enzyme known to be required for melanin synthesis (Comeault et al. 2015). Further functional work will be required to verify the function of this gene in Timema but will allow more in-depth work on melanism in this and other Timema species (e.g. Comeault et al. 2016) (Box Fig.2).

Timema cristinae stick insects. The Ceanothus (left) and Adenostoma (right) ecotypes differ in the presence of a white dorsal stripe, aiding camouflage on their different host plants. Melanistic T. cristinae (centre) are able to hide on the stems of both host plants, serving as a bridge for gene flow between ecotypes. Images courtesy Aaron Comeault (left, right) and Moritz Muschick (centre), reproduced with permission.

Serpentine soil adaptation in Arabidopsis lyrata

Plants locally adapted to serpentine soils must cope with increased heavy metal concentrations and poor nutrient availability. Turner et al. (2010) sequenced whole genomes from A. lyrata plants from serpentine and nonserpentine soils to locate SNPs segregating with soil type, performed a sliding window FST scan to search for signatures of selection on serpentine soils and scanned the genome for copy number variants. Many of the segregating SNPs and outliers corresponded to annotated metal ion transporters from the closely related A. thaliana genome, and serpentine plants appeared to have multiple duplications of regions containing toxin-processing loci. The authors argue for the role of many distinct loci in serpentine soil adaptation in this system.

Adaptation to anthropogenic metal pollution in Perca flavescens (yellow perch)

Yellow perch subject to anthropogenic metal pollution have evolved a rapid life-history strategy to compensate for metal toxicity (Couture & Pyle 2008). Using markers derived from existing whole transcriptomic data, the genome was scanned for outliers in fish from contaminated and clean lakes, resulting in three markers under directional selection. The single nuclear genome marker corresponds to the cyclin G1 gene (tentatively identified via orthologues in four separate fish species), which is strongly associated with metal levels in the fish, contains a SNP at a position highly conserved across vertebrates, and may be associated with increased growth rates, thus arguing for it as a target of selection (Bélanger-Deschênes et al. 2013). Future work characterizing cyclin G1 in yellow perch will help shed light on its role in adaptation to metal pollution.

In all of these examples, functional characterization and validation of candidate genes are yet to be performed.

Challenges of integration

Combining genome-wide approaches with focused molecular adaptation approaches remains a challenge (Fig. 1: Challenge 7), particularly in systems with limited genetic resources and those difficult to rear in controlled conditions. The current dearth of examples and study systems combining these approaches is certainly not due to a lack of interest; rather, in many systems, we are only now beginning to have the resources necessary to do so. Barrett & Hoekstra (2011) suggested an approach for designating an allele as adaptive, with three contributing lines of evidence: phenotype, genotype and fitness. In their Fig. 1, ‘genome-wide scan for signatures of selection’ (fitness and genotype) does not overlap with ‘genetic mapping of phenotypic traits’ (genotype and phenotype)—the separation of molecular adaptation and genome-wide approaches is evident. On the one hand, focused molecular adaptation approaches yield strong data on the genetic underpinnings of phenotypes of interest, but may reflect a bias for the most tractable or ‘interesting’ phenotypes (Rockman 2012; but see Lee et al. 2014). Unbiased genotype–phenotype approaches such as quantitative trait locus (QTL) mapping exist, but establishing the necessary mapping populations is difficult to impossible in many nonmodel systems. On the other hand, genome-wide association studies (GWAS; Visscher et al. 2012) for phenotype–genome association are beginning to be used in some nonmodel systems (Hecht et al. 2013). In particular, techniques developed for working with human genetics may benefit nonmodel systems with similar constraints (e.g. the inability to establish mapping or common garden populations and difficulties in functional verification); by applying these techniques to natural populations, we can overcome some of the environmental and genomic biases inherent in some traditional mapping approaches. 
With any method, caution should be taken when choosing loci to focus on, as this can suffer from bias (Wayne & McIntyre 2002).

Understanding the molecular functions of candidate genes from either the molecular adaptation or the ecological genomics approach remains a serious problem, particularly as many nonmodel systems lack the extensive genetic and laboratory resources necessary to tackle these issues; however, as with the development of genetic resources, techniques for studying candidate gene functions are continuously improving. Choice of study system plays a large role here, while at the same time limiting research to tractable systems may not reflect the majority of biological reality. A particular issue is that it is unclear to what extent molecular mechanisms are limited to the populations or species under study (Gabaldón & Koonin 2013; Kramer 2015). Fortunately, with recent advances in sequencing techniques, as well as analysis tools developed for work with model system data, the number of amenable systems is increasing and choice can increasingly be made on biological interest rather than tractability (Ekblom & Galindo 2011); in this way, many systems with established natural history data have become ‘emerging model systems’ (Abzhanov et al. 2008). The availability of a larger number of emerging model systems also facilitates the genetic study of repeated evolutionary events, thus allowing discussion of the ‘repeatability of evolution’ (e.g. Colosimo et al. 2005; Rausher 2008; Manceau et al. 2010; Streisfeld & Rausher 2011; Conte et al. 2012). In practice, integrating ecological genomics into molecular adaptation in nonmodel systems will require familiarity with next-generation sequencing and analysis techniques as well as funding for these initiatives, but the ongoing development of accessible genomic analysis tools and sequencing centres has greatly improved the prospects for this integration.

At the same time, it can be difficult to combine experimental results from disparate systems into a conceptual whole (Fig. 1: Challenge 6): differing ecological and evolutionary contexts, genetic and developmental processes and ecosystem function hinder our ability to compare results broadly. As a particular example of this problem, utilizing tools such as homology-based gene identification and gene ontology categorization can be risky in nonmodel systems (van den Berg et al. 2009; Tagu et al. 2014; Kramer 2015), and experimental assessment of function is crucial to fully understand the role of genotypes in adaptation in a given system. A critical part of functional validation is the documentation and reporting of situations in which the results demonstrate divergent functionalities (including loss of function) in nonmodel systems from those predicted using prior data; at present, this reporting likely suffers from a bias against publishing seemingly negative results. Recent reviews have begun to assemble conclusions from a variety of systems (e.g. Hoekstra 2006; Lowry et al. 2008; Martin & Orgogozo 2013; Groot et al. 2016), showing the promise of integrating results across nonmodel systems.

In the emphasis on understanding the genetic and genomic basis of phenotypes, we often fail to integrate ecology and natural history, limiting our conclusions to solely laboratory-based ex natura experiments and thus missing their ecological and evolutionary significance (Shimizu et al. 2011). Assessing the fitness or isolation effects of candidate genes in an ecological context in the field (i.e. in natura) also remains a serious challenge, particularly with genetically modified organisms; in some cases, understanding the genetic basis of a phenotype may be combined with manipulative experiments to test selection on that specific phenotype (e.g. Xu et al. 2012b). Different drivers of divergent adaptation (e.g. abiotic and biotic factors) and different speciation contexts (e.g. allopatry vs. sympatry) will also change the significance of a given phenotype and genetic/genomic combination. During early phases of divergence, when selection may act directly on only a few genes (Feder et al. 2012), the relative importance of these genes for species divergence is increased. Therefore, biochemical and molecular constraints on evolution at these focal loci may become directly important for the process of speciation by facilitating or constraining divergence. In phenotypes that appear to be under simpler genetic control, integrating ecological genomics may be less critical, although genomic techniques (e.g. genome-wide scans) can help to test whether the phenotype is indeed as genetically simple as hypothesized, and population genomics can inform us as to the complexity of the genetic basis of adaptation.

Outlook

Fortunately, combining ecological speciation genomics with molecular adaptation in a rigorous fashion (Fig. 1: Challenge 7) has become more feasible with the decrease in sequencing costs over time (Ekblom & Galindo 2011). A bidirectional approach may be particularly powerful: using genome-wide scans to identify loci under selection in natural populations and combining this knowledge with more focused molecular knowledge on gene function, or information from linkage and association mapping (Stinchcombe & Hoekstra 2008; Via et al. 2012; Li et al. 2013). In systems not amenable to traditional mapping techniques, there may be the option to harness natural variation using GWAS and similar approaches originally developed for use in human genetics. Ideally, quantitative trait measurement and assessment of fitness should happen in a realistic field environment using translocation experiments where feasible (e.g. Angert & Schemske 2005; Knight et al. 2006; Leinonen et al. 2009). Phylogenetic ancestral reconstruction and model-based approaches may provide some insights into the historical genetic and environmental context of A&S (e.g. Gaucher et al. 2008). The most powerful research in A&S genetics and genomics will combine all relevant factors—phenotype, genotype and fitness consequences, population genomics and evolutionary history—in a single system (Box 1C).

Box 1C. Integrating genetic and genomic adaptation and ecological speciation: case studies

Cryptic coloration in Peromyscus mice

Wild populations of two species of Peromyscus mice display recently evolved variation in pigmentation that correlates with habitat (substrate), with pale mice found on pale substrate and dark mice found on dark substrate; the matching of mouse and substrate colour is under selection by predators (Hoekstra 2006). In beach mice (P. polionotus), Mc1r, the melanocortin-1 receptor gene, is responsible for this difference, with a single SNP displaying a strong genotype–phenotype association in multiple body regions; extensive functional study supports this link, and allele frequencies segregate with the dominant colour phenotype across a large population range (Hoekstra et al. 2006). In the deer mouse (P. maniculatus), a similar environment–phenotype match occurs, but the responsible gene is instead Agouti, the product of which is an inverse agonist of Mc1r. Agouti has multiple SNPs between light and dark mice and displays a loss of nucleotide variation characteristic of a selective sweep (Linnen et al. 2009). Further work in this species using genomic techniques—albeit applied only to SNPs within this locus rather than genomewide—verified the role of Agouti in melanism and has shown that individual SNPs within Agouti are responsible for pigmentation differences in multiple body regions, that these SNPs are under selection, and that the ‘dividing up’ of body regions by SNP allows for minimal pleiotropic effects (Linnen et al. 2013). Alternative transcript processing of Agouti appears to underlie the light–dark transition in these mice, with one untranslated exon specific to the hair cycle showing signatures of positive selection (Mallarino et al. 2016, this issue).

Reproductive isolation in sexually deceptive Ophrys orchids

Ophrys flowers resemble female insects and are pollinated by males searching for mates; pollinator identity is typically species-specific and is the main driver of reproductive isolation between Ophrys species (Xu et al. 2011; Sedeek et al. 2014). A variety of floral phenotypes, particularly hydrocarbons on the floral surface that mimic female insect pheromones, are important for pollinator specificity. Pollinator-relevant species-specific differences in alkene hydrocarbon double-bond position are due to differences in several stearoyl-acyl carrier protein desaturases (SADs), which have been functionally characterized in this system (Schlüter et al. 2011; Sedeek et al. 2016). This complete functional characterization, together with the estimation of fitness effects of different hydrocarbon phenotypes (Xu et al. 2012b), has allowed the modelling of various evolutionary trajectories via changes in SAD function (Xu & Schlüter 2015). Additional candidate genes for other floral phenotypes were also identified via transcriptome and proteome analyses (Sedeek et al. 2013). Genome-wide work found only weak species differentiation, and outlier scans identified only a very small fraction of the genome (including candidate hydrocarbon biosynthetic genes) as associated with this differentiation. Together, these results suggest that genic, rather than genomic, speciation is at work in this system (Sedeek et al. 2014; Box Fig. 3).

Three species of sexually deceptive orchids in the genus Ophrys (left to right, O. garganica, O. exaltata subsp. archipelagi and O. sphegodes). Ophrys orchids mimic the sex pheromones of their bee pollinators, which ensures specificity in pollen transfer and provides an early-acting reproductive barrier. Species divergence appears to be largely genic rather than genomic, in agreement with their position in the early stages of the speciation process. All images by Philipp Schlüter.

Reproductive isolation in the Mimulus aurantiacus species complex

Red-flowered Mimulus aurantiacus ssp. puniceus and yellow-flowered M. aurantiacus ssp. australis are parapatric and are undergoing incipient ecological speciation associated with pollinator-mediated reproductive isolation in the contact zone (Streisfeld & Kohn 2007; Sobel & Streisfeld 2015). Detailed and thorough molecular work identified a single R2R3 MYB transcription factor, MaMyb2, which drives this divergence (Streisfeld & Rausher 2009; Streisfeld et al. 2013). Analysing allelic differentiation and clinal shape at this locus as well as across the genome (Streisfeld & Kohn 2005) demonstrated that MaMyb2 is under selection (Streisfeld et al. 2013). Genome-wide analysis of differentiation argued against allopatric speciation with secondary contact, supporting the parapatric speciation hypothesis and suggesting differentiation only at locally adaptive genomic regions (Stankowski et al. 2015). In this issue, Stankowski et al. (2016) combine clinal analysis and a genome-wide scan for differentiation between the two subspecies, concluding that a set of 130 loci have clines with a similar shape to clines in floral traits, indicating that they—or loci they are linked to—are involved in pollinator isolation. In the hybrid zone between the two subspecies, outlier loci are distributed across the genome, in contrast to the concept of genomic islands of speciation (Stankowski et al. 2016 in this issue).

One potential approach might be as follows: first, for the system in question, identification of what factors to focus on is critical. Although the vast diversity of phenotypic traits is a fundamentally attractive part of biology, understanding which traits underlie adaptation is crucial to understanding how this diversity arose. For example, what are the most significant stages of reproductive isolation, or what environmental factors are driving adaptation (Ramsey et al. 2003; Widmer et al. 2009)? Second, identify which regions of the genome are under divergent selection; this and subsequent mapping steps can use the same set of markers to save costs. It may also be helpful to identify the frequency and pattern of genome-wide single-nucleotide polymorphisms (SNPs), recombination, gene conversion and chromosomal rearrangements for the studied system. Third, consider traditional linkage mapping or GWAS approaches to narrow down the genetic basis of phenotypic traits that seem to be under selection or are involved in reproductive isolation. Fourth, combine the genomic information from the previous two steps to look for overlaps between candidate loci for a given trait and genomic regions under selection (e.g. Via et al. 2012; Li et al. 2013). Finally, where reasonable evidence exists that a specific genomic region is under selection and involved in the phenotype of interest, consider thorough genetic and biochemical characterization of the locus in order to understand the molecular mechanisms of genes that are under divergent selection. Where possible, one should also measure the quantitative fitness consequences of different alleles, or ideally, specific mutations (Barrett & Hoekstra 2011). Other approaches in existing studies include investigating signatures of selection among populations for existing candidate genes (Box 1A–C). If the system allows, using CRISPR (Cong et al. 2013) or other transgenic techniques to identify the putative function and possible fitness effects of these genes would be ideal, but for many systems, this may not be possible. In this case, other options include studying the function of candidate genes in closely related model species or characterizing gene co-expression networks to identify the major functions of genes that are co-expressed with the candidate genes (Higashi & Saito 2013). These approaches will yield an in-depth understanding of ecological A&S genetics.
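The fourth step of this workflow, looking for overlaps between trait-associated loci and genomic regions under selection, amounts to an interval intersection. The following Python sketch illustrates one minimal way this could be done; it is not a method taken from any of the cited studies. The `Interval` type, the function names, the coordinates and the locus labels are all hypothetical and chosen only for illustration, and a real analysis would read intervals from mapping and outlier-scan output and would typically account for linkage around each region.

```python
from typing import List, NamedTuple, Tuple


class Interval(NamedTuple):
    """A genomic interval; coordinates are assumed 1-based and inclusive."""
    chrom: str
    start: int
    end: int
    label: str


def overlaps(a: Interval, b: Interval) -> bool:
    """True if both intervals lie on the same chromosome and share at least one base."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def candidate_regions(
    trait_loci: List[Interval], selection_outliers: List[Interval]
) -> List[Tuple[str, str, int, int]]:
    """Return (trait label, outlier label, shared start, shared end) for each overlap."""
    hits = []
    for locus in trait_loci:
        for window in selection_outliers:
            if overlaps(locus, window):
                hits.append(
                    (
                        locus.label,
                        window.label,
                        max(locus.start, window.start),
                        min(locus.end, window.end),
                    )
                )
    return hits


# Hypothetical toy data: two mapped trait intervals and two outlier windows.
trait_loci = [
    Interval("chr1", 100_000, 250_000, "QTL_flower_colour"),
    Interval("chr2", 500_000, 600_000, "QTL_scent"),
]
selection_outliers = [
    Interval("chr1", 200_000, 210_000, "outlier_window_A"),
    Interval("chr2", 700_000, 710_000, "outlier_window_B"),
]

shared = candidate_regions(trait_loci, selection_outliers)
for trait, window, start, end in shared:
    print(f"{trait} overlaps {window} at {start}-{end}")
```

In this toy example only the chromosome 1 trait interval contains an outlier window, so it would be the one prioritized for the functional characterization described in the final step. The nested loop is adequate for the handful of intervals that a mapping study usually produces; genome-scale comparisons would more likely rely on sorted intervals or a dedicated interval tool.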

Conclusion

Adaptation and ecological speciation are dynamic processes that are shaped by ecological conditions as well as the genetic make-up of the organism and its evolutionary history (Nosil & Feder 2012). Understanding the mechanisms underlying these complex processes is challenging and requires an integrative approach. Molecular adaptation studies usually provide an episodic, mechanistic view of how A&S may have occurred, and A&S genomics allows us a view of the evolutionary end point. Combining these two complementary approaches will not only provide a more complete and unbiased understanding of the entire process, but may allow us to compare the mechanisms of A&S among systems and to draw conclusions about the molecular nature of A&S loci and their evolution (Fig. 1: Challenge 7). We believe that research programmes that integrate tools from field ecology, molecular biology, population genomics and individual-based experimentally informed modelling are needed to study A&S (see Lazebnik 2002 for an inspiring read). Systems that have experienced recent speciation and adaptation events—particularly those where adaptation and speciation are presently ongoing—are extremely valuable for understanding how and why A&S occurred, as underlying environmental factors are better known and we may be better able to elucidate the initial processes shaping A&S (Via 2009). For example, organisms that have adapted to recent anthropogenic changes (e.g. agriculture and mining), or speciation events triggered by recent climatic changes such as glacial recession, can serve as extremely valuable systems to systematically investigate A&S. Furthermore, studies of the process of adaptation using both forward genetics and genome-wide approaches in experimentally evolved systems will provide a more complete conceptual understanding of adaptation and ecological speciation. 
Full knowledge of the genetic substrate upon which selection by the environment ultimately works is crucial if we want to gain a complete understanding of evolutionary change.

Acknowledgements

The authors would like to thank the organizers and participants of the European Society for Evolutionary Biology's 2015 meeting in Lausanne, Switzerland. In particular, we thank the participants of the ESEB symposium ‘The molecular basis of adaptation and ecological speciation’ for stimulating discussion and the Swiss National Science Foundation (SNF) for financial support for this symposium under grant 31CO30_160611. KJRPB is funded by a PLANT FELLOWS Postdoctoral Fellowship (European Union: FP7-PEOPLE-2010-COFUND Proposal No. 267243) and the University of Zurich. SX acknowledges funding by the SNF, grant PEBZP3-142886, a Marie Curie Intra-European Fellowship (IEF), grant 328935, and the Max Planck Society. PMS is funded by the SNF, grant 31003A_155943, and the University of Zurich. The authors also thank Sean Rogers, two anonymous reviewers and Roman Kellenberger for comments on the manuscript as well as Aaron Comeault and Moritz Muschick for kindly providing images of Timema.

K.J.R.P.B., S.X. and P.M.S. contributed ideas; K.J.R.P.B. and S.X. wrote the manuscript; K.J.R.P.B., S.X. and P.M.S. revised the manuscript.

Data accessibility

Not applicable.
