An omics evolutionary perspective on phytophagous insect–host plant interactions in Anastrepha obliqua: a review
Abstract
Phytophagous insects have a close relationship with their host plants. For this reason, their interactions can lead to important changes in insect population dynamics and evolutionary trajectories. Next generation sequencing (NGS) has provided an opportunity to analyze omics data on a large scale, facilitating the change from a classical genetics approach to a more holistic understanding of the underlying molecular mechanisms of host plant use by insects. Most studies have been carried out on model species in Holarctic and temperate zones. In tropical zones, however, the effects of the use of various host plants on insect evolutionary history are less understood. In the current review, we describe how omics methodologies help us to understand phytophagous insect–host plant interactions from an evolutionary perspective, using as an example the Neotropical phytophagous insect West Indian fruit fly, Anastrepha obliqua (Macquart) (Diptera: Tephritidae), an economically important fruit crop pest in the Americas. Anastrepha obliqua could adopt a generalist or a specialist lifestyle. We first review the adaptive molecular mechanisms of phytophagous insects to host plants, and then describe the main tools to study phytophagous insect–host plant interactions in the era of omics sciences. The omics approaches will advance the understanding of insect molecular mechanisms and their influence on diversification and evolution. Finally, we discuss the importance of a multidisciplinary approach that integrates the use of omics tools and other, more classical methodologies in evolutionary studies.
INTRODUCTION
Phytophagous insects are widely studied, representing one-fourth of macroscopic species (Winkler et al., 2018). These insects can be catalogued as ecological generalists or specialists, depending on the diversity of their host plants. The condition of generalism or specialism (Box 1) is key to understanding the range of distribution, differentiation, specialization, and/or speciation (i.e., the formation of new species as a result of evolutionary reproductive isolation) (Ehrlich & Raven, 1964). Each host plant represents a particular ecological scenario with physical and chemical defense mechanisms related to herbivory, secondary metabolites, nutrient specificity, and phenology (McPheron et al., 1988).
Phytophagous insect–host plant interactions have been widely studied using multiple approaches, including demographic parameters, behavioral and population genetics, and, most recently, genome architecture (i.e., characteristics of genetic structure and gene organization within the genome) (Frey & Bush, 1990; Feder, 1995; Matsubayashi et al., 2010; Oroño et al., 2013; Knolhoff & Heckel, 2014; Simon et al., 2015; Doellman & Feder, 2019; Gloss et al., 2019; Hardy et al., 2020). Population genetics has helped us to understand the genetic structure of insect populations in relation to the host plants they infest (McPheron et al., 1988). Whereas classical genetics focuses on the study of known genes and the molecular mechanisms associated with these particular genes, new tools can provide a more holistic understanding of the molecular mechanisms that insects utilize to adapt to different host plants. In this sense, omics have provided an important toolbox for studying genetic expression, genomic architecture, and common molecular mechanisms that underlie the capacity of both specialist and generalist phytophagous insects to use host plants (Sexton et al., 2017; Birnbaum & Abbot, 2020).
Omics approaches have been used in various model phytophagous insects to understand adaptive mechanisms to novel host plant species and the effects on specialization and speciation. For example, the pea aphid, Acyrthosiphon pisum (Harris), has been one of the most relevant models for understanding ecological speciation – i.e., speciation due to a new ecological factor producing a selective pressure that causes the evolution of reproductive isolation – in phytophagous insects, as polymorphisms found in its genome and transcriptome allowed understanding of the genetic bases for its specialization behavior (Peccoud et al., 2009; Eyres et al., 2016; Nouhaud et al., 2018). For example, using sequencing of RNA (RNA-seq) (Box 2) and reciprocal transplant experiments, changes in the genetic expression of chemical-sensorial and salivary glands in response to the host plant were identified (Eyres et al., 2016). Furthermore, differentiation hotspots along the genome in three host races of this species were established using whole genome sequencing (WGS) from pooled samples (pool-seq) (Box 2), providing gene candidates under selection where the divergence among populations infesting various host plants could be established (Nouhaud et al., 2018).
Some species of the Tephritidae family, known as true fruit flies, have been used to study generalism and specialism. Different speciation scenarios have been reported for species of this family, such as divergence due to geographic isolation (i.e., divergence by allopatry, which occurs when two populations accumulate differences and develop reproductive isolation barriers as a result of geographic isolation) (Ottens et al., 2017; Winkler et al., 2018) and adaptive radiation (i.e., rapid speciation due to the presence of new ecological niches) in response to host change (Bush, 1969; Rull et al., 2006), as well as the effect of these scenarios on the genomic architecture of the species (Egan et al., 2015). The majority of these studies used organisms from Holarctic and temperate zones (Figure S1), where the response to host change plays an important role in adaptive radiation. However, in tropical regions, the effects of changing from one host species to another, and of the use of specific host plants by these insects, have not been studied as often (Figure S1). Thus, in this review, we use as an example an organism from the tropics, the West Indian fruit fly, Anastrepha obliqua (Macquart) (Diptera: Tephritidae), which we think could become a good model for future research on adaptive mechanisms of phytophagous insect–host plant interactions. We therefore review the molecular mechanisms and omics tools that have been described in the adaptation of this phytophagous insect to its host plants.
Anastrepha obliqua as a study organism
The definition of model organism has grown along with the increase in genome sequencing research. At first, model organisms were limited to species with characteristics that facilitated laboratory experiments, such as size and short generational cycle. However, model organisms nowadays include other species based on criteria such as agricultural importance, economic impact, or effects on human health (Hedges, 2002). Matz (2018) recently published an article on the relevance of genomic approaches with obscure model organisms (OMOs). These are defined as organisms for which no previous genome resource exists, but that constitute potentially good models to study genomic evolution, acclimatization, and adaptation in real ecological contexts. Although A. obliqua is not yet considered a model organism, it could be included as an OMO to study the adaptation mechanisms of phytophagous insects to host plants in the Neotropics, due to its economic impact on fruit crops and its status as an important pest.
Anastrepha obliqua is a species with a wide distribution, extending from Mexico to Brazil, including the Antilles. It is also occasionally reported in the USA (Fu et al., 2014). It is the most important mango (Mangifera indica L.) fruit fly pest in Latin America (Aluja et al., 2014; Fu et al., 2014; Rull et al., 2018), but has also been catalogued as a highly polyphagous species, attacking 24 fruit families (Carrejo & González, 1994). The species has an important economic impact on fruit crops (CONPES, 2008; Norrbom, 2008). No less than 30–40% of losses in fruit production in Colombia is related to fruit flies; 70% if adequate pest management is not in place (CONPES, 2008). Mexico uses sterile insect technique (SIT) programs to control populations of A. obliqua in some areas of its distribution (Rull et al., 2018), and Brazil has regulatory laws to quarantine this species (Passos et al., 2018). Being a highly mobile, opportunistic, multivoltine species, A. obliqua is thought to have great dispersive capacity and invasive potential for maintaining populations in new hosts and/or distribution ranges (Tejeda et al., 2016) that can influence diversification patterns.
As an important fruit pest, A. obliqua has been studied using various approaches to understand its biology. Anastrepha obliqua is in the fraterculus species group, and it shares morphological and behavioral characteristics with closely related species, resulting in the formation of natural hybrids (Scally et al., 2016). This fact is relevant for the diversification process, management strategies, and the interspecific relationships of A. obliqua. Research includes studies of the genetic population structure in some areas of its distribution (Ruiz-Arce et al., 2012, 2019; Aguirre-Ramirez et al., 2017; Passos et al., 2018); molecular mechanisms and phenotypic plasticity (i.e., the ability of one genotype to produce different phenotypes under different environmental conditions) (Velasco-Cuervo et al., 2021; Lemos-Lucumi et al., 2022); morphometric characteristics (Castañeda et al., 2015a); reproductive behavior (Aluja et al., 2001; Leal & Zucoloto, 2008); chemical ecology (Cruz-López et al., 2006; Aluja et al., 2020; De Aquino et al., 2021); relationships with microbiota (Gallo-Franco & Toro-Perea, 2020; Amores et al., 2021; Cárdenas-Hernández et al., 2022); and phylogenetic relationships (Smith-Caldas et al., 2001; Ruiz-Arce et al., 2012). These studies have revealed a marked genetic structure (Ruiz-Arce et al., 2012; Aguirre-Ramirez et al., 2017) and morphological differentiation (Castañeda et al., 2015) between populations of A. obliqua throughout its distribution range, and the importance of host plants in its genetic makeup and intra- and interspecific relationships.
The importance of A. obliqua interactions with various host plants has been documented, demonstrating host effects on various characteristics of the insect's life history (Table 1). For example, the composition of the aroma secreted by sexually mature adult males is influenced by larval development in specific fruit species (Aluja et al., 2020). These changes affect the interaction between males and females and could lead to mating preferences and host fidelity, bearing consequences on speciation. Aguirre-Ramirez et al. (2017) investigated the population structure and genetic diversity of A. obliqua in the southwest of Colombia and found different haplotype frequencies among flies belonging to different host plants. Two mitochondrial DNA markers, the subunit I of the cytochrome c oxidase (COI) gene, and the NADH dehydrogenase subunit 6 (ND6) gene, revealed a total of seven haplotypes, some of which were exclusive to particular localities or host plants. Ruiz-Arce et al. (2019) found a lack of genetic structure associated with the host plant in populations of A. obliqua in Veracruz (Mexico); however, some populations did show genetic differentiation, which could be attributed to some degree to host fidelity. More recently, studies carried out on A. obliqua flies infesting red mombin (Spondias purpurea L.), mango, and carambola (Averrhoa carambola L.) have shown differences in the expression of genes associated with digestion and detoxification (Velasco-Cuervo et al., 2021) and in microRNAs (miRNAs) that have as target mRNA genes in these same categories (Lemos-Lucumi et al., 2022). In both studies, the main differences were found in flies infesting carambola, which could be related to the chemical composition of these fruits and their recent introduction to the American continent (Velasco-Cuervo et al., 2021). Aguirre-Ramirez et al. (2021) found similar results at the genomic level. Populations of A. obliqua infesting carambola have a greater genomic difference [based on restriction-site associated DNA sequencing (RAD-seq; Box 2)] compared to populations infesting red mombin and mango. Additionally, 54 single nucleotide polymorphisms (SNPs) were found to be associated with host plant use, some of which are in protein families that could be annotated in pathways related to nutrition (Aguirre-Ramirez et al., 2021).
Objective | Results | Reference |
---|---|---|
Document life-history parameters at different stages | Slight differences in timing of larval and fruit development among individuals that infest mango (Mangifera indica) and red mombin (Spondias purpurea) | Celedonio-Hurtado et al. (1988) |
Analyze habitat use in a heterogeneous environment where mango and red mombin are found | Behavioral differences associated with the host plant; e.g., in % females observed for specific activities (resting, feeding, reproduction, and egg laying) in specific microhabitats (mango and red mombin) | Aluja & Birke (1993) |
Evaluate the effect of mango varieties on various biological parameters such as life cycle, pupal weight, larval viability, and adult survival | Differences among mango cultivars were detected in terms of the degree of infestation, pupal weight, and adult longevity | Carvalho et al. (1996) |
Evaluate host preference [mango or orange (Citrus spp.)] | Slight preference for mango as a landing site when mango and orange are available | Camargo et al. (1996) |
Determine factors that influence ovary development | Presence of the host and its volatiles are important for ovary maturation | Aluja et al. (2001) |
Explore effects of the larval host on various reproductive factors such as mating probability and duration and intervals of copulation | Host can influence copulation time and male sexual performance, especially if this interacts with adult male diet | Perez-Staples et al. (2008) |
Characterize diversity and structure associated with differences between ecologically differentiated geographic regions and host plants | Differences in haplotype frequency distribution associated with the host plant | Aguirre-Ramirez et al. (2017) |
Examine whether life-history characteristics (propensity to mate, competitiveness, larval and adult weight, and demographic parameters) vary among mango cultivars | No significant differences in propensity and competitiveness. Locality affected the longevity and oviposition period | Hernández et al. (2019) |
Evaluate the effect of host quality on the composition of aromas produced by males | Host plant of larvae influences the aroma of sexually mature males in terms of quantity of odor, compound mixtures, and concentration of some components of the mixture | Aluja et al. (2020) |
Evaluate the bacterial community structure associated with life stage (larva and adult) and host plant (red mombin, mango, and carambola) using metabarcoding (16S rRNA) | Differences in the structure and abundance of the operational taxonomic units (OTUs) according to the host plant. Samples collected in carambola showed greater variability when compared to mango and red mombin | Gallo-Franco & Toro-Perea (2020) |
Determine gene differences among fruit flies infesting mango, red mombin, and carambola in sympatry | Host plants have an effect on genetic differentiation among populations. Differentiation is, nevertheless, incipient and there is no evidence of host races. | Aguirre-Ramirez et al. (2021) |
Evaluate differential gene expression patterns among fruit fly larvae infesting red mombin, mango, and carambola under sympatric and synchronous conditions | Differential gene expression among flies infesting the three host plants, mainly in genes related to resource use such as digestion and detoxification genes. The main differences were found between red mombin (native host) and carambola (introduced host) | Velasco-Cuervo et al. (2021) |
Compare the microbiota and microbiome of larvae infesting red mombin, mango, and carambola using whole metagenome sequencing | The composition of the microbiota and microbiome showed differences between flies infesting the three host plant species | Cárdenas-Hernández et al. (2022) |
Obtain the first micro-transcriptome and determine its expression levels and possible target mRNAs when the larvae feed on red mombin, mango, and carambola | A total of 21 microRNAs (miRNAs) were found to be differentially expressed among the flies infesting the three host plant species, with the carambola flies showing the largest differences. The annotation of the targets showed that these interfering RNAs target important genes that play a role in development, feeding, and detoxification | Lemos-Lucumi et al. (2022) |
The polyphagous insect status of A. obliqua, its importance as a species of economic impact in the Neotropics, and the evidence of the effect that host plants have on different aspects of its life history, make this species a good case study to understand the molecular mechanisms involved in the survival and/or diversification of Neotropical phytophagous insects associated with their host in sympatric conditions (i.e., when two populations or species have overlapping geographic distribution), and on a small geographic scale.
Adaptive molecular mechanisms
Host change in phytophagous insects can result in two scenarios: the change either (1) promotes generalism by increasing the insect's distribution range, or (2) promotes differentiation in each or some of the hosts, which favors specialization. This second scenario can even result in the generation of host races (i.e., populations that are partially reproductively isolated as a consequence of local adaptation to their hosts) or new species (Heard, 2012). Two non-mutually exclusive processes form the basis for these scenarios: phenotypic plasticity and local adaptation through ecological specialization – the latter occurs when a population adapts to the specific conditions of its native environment and demonstrates higher fitness than individuals of the same species coming from other populations (Orsucci et al., 2018). The molecular mechanisms underlying these two processes, by which insects adapt to different host plants, therefore merit study.
Host plant-driven selection is generally associated with a plastic response. Some studies have associated differential gene expression (DGE) with herbivory in insects; DGE includes host-specific detoxification and digestion genes (De La Paz Celorio-Mancera et al., 2013; Ragland et al., 2015; Christodoulides, 2017; Etges, 2019; Pym et al., 2019; Velasco-Cuervo et al., 2021; Lemos-Lucumi et al., 2022). Comparative genomics has also helped in identifying differences in genomic architecture between poly- and monophagous insects. For example, some authors have found differences in the number of gene duplications for genes associated with chemical perception and detoxification (involved in resource use) (Christodoulides, 2017; Gouin et al., 2017). Other genomic patterns such as differences in allelic frequencies are usually found between species (e.g., poly- vs. monophagous) and also among populations of a single species that infests different hosts (Oroño et al., 2013). The differences that we can trace at the genome or transcriptome level will depend on factors such as gene flow (i.e., allelic movement) among populations, and the time at which the population continuum is sampled.
In a model such as A. obliqua, where populations infest different host plants in sympatry, it is assumed that gene flow will continue to occur among populations (Figure 1). Initially, differentiation is expected to occur on one locus or a few loci because the populations could have strong genetic similarity at this stage (Campbell et al., 2018). However, differentiation in these few loci could cause some degree of reproductive isolation. It is still strongly debated whether differentiation in this ecological context occurs due to many loci with a small effect or a few loci with a large effect (Feder et al., 2012). Even so, it has been predicted that differentiation could begin in loci directly associated with ecological differentiation, having an effect on particular traits of the phenotype. These characteristics can be subject to divergent selection – i.e., selection between two populations acts in opposite directions – and are generally related to a better use of the ecological resource or to non-random mating (Servedio et al., 2011).

Although selection can occur on loci undergoing fixation in the population, the process can also alter allelic frequencies of loci in neighboring or connected regions by what is known as hitchhiking (Smith & Haigh, 2008). These differentiated loci become a barrier to gene flow (barrier loci), causing local adaptation and ecological specialization in the presence of gene flow.
In a species such as A. obliqua, we would expect differentiation among populations that infest different host plants in sympatry. This would be reflected in FST (fixation index, a measure of population differentiation developed from Wright's F-statistics) peaks through the genome in loci that are related to characteristics under divergent selection associated with host use [which could be identifiable using methodologies such as quantitative trait locus (QTL) analysis and genome-wide association studies (GWAS); Box 2] (Caillaud & Via, 2012; Ragland et al., 2017; Doellman et al., 2018). In a more advanced phase, we would expect more homogeneous differentiation throughout the genome, and that not only the loci linked to regions under divergent selection increase their allelic frequency, but that poorly adapted genes are also being eliminated (Feder et al., 2012).
This differentiation continuum corresponds to a four-phase speciation model with gene flow. In phases 1 and 2, strong selection over the selected loci is expected (a major contribution of divergent hitchhiking, which occurs when allelic frequencies surrounding barrier loci change as a result of physical linkage and a selective sweep), whereas in phases 3 and 4, a barrier to gene flow is expected throughout the entire genome (a major contribution of genomic hitchhiking, which occurs when allelic frequencies change at the genome level as a result of a reduction in migration rates, including regions not linked physically) (Feder et al., 2012). Although the identification of these phases in natural populations can be challenging, genome patterns from which these phases can be inferred have been proposed. For example, Turner et al. (2005) had previously suggested genome differentiation patterns using a geographical metaphor corresponding to divergent genomic islands, and Feder et al. (2012) argued for divergent genomic continents, referring to the degree of differentiation that we can find through the genome. Nevertheless, these patterns in the genomic landscape can also originate from other evolutionary forces in a variety of geographic contexts (Wolf & Ellegren, 2017), so other methodologies and experiments must be integrated in order to correlate genetic variation with a phenotypic trait.
To verify molecular fingerprints of generalism or specialism, various tools, allowing exploration of differences at genome, transcriptome, proteome, and metabolome levels, can be used. In this way, we can observe changes reflected in allelic frequencies, FST peaks, mRNA levels, or miRNA gene expression, or differences in methylation patterns.
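The FST peaks mentioned above can be illustrated with a minimal Python sketch. It assumes Wright's definition, FST = (HT − HS)/HT, applied to a biallelic locus in two equally weighted populations. The function names, the population labels, and the allele frequencies are hypothetical and are not taken from any of the cited studies; real analyses typically use estimators that also correct for sample size.

```python
def fst_biallelic(p1, p2):
    """Wright's FST for one biallelic locus in two equally weighted populations.

    p1, p2: frequency of the same allele in population 1 and population 2.
    """
    p_bar = (p1 + p2) / 2.0
    # Expected heterozygosity in the pooled (total) population.
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    # Mean expected heterozygosity within the two subpopulations.
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_t == 0.0:
        # Locus is monomorphic overall: no differentiation can be measured.
        return 0.0
    return (h_t - h_s) / h_t


def fst_scan(freqs_pop1, freqs_pop2):
    """Per-locus FST along an ordered list of loci."""
    return [fst_biallelic(a, b) for a, b in zip(freqs_pop1, freqs_pop2)]


# Hypothetical allele frequencies at four loci in flies from two hosts.
mango = [0.50, 0.48, 0.10, 0.52]
carambola = [0.50, 0.50, 0.90, 0.49]

scan = fst_scan(mango, carambola)
# Index of the most differentiated locus, i.e. the candidate "peak".
peak = max(range(len(scan)), key=scan.__getitem__)
print("per-locus FST:", [round(v, 3) for v in scan])
print("most differentiated locus index:", peak)
```

In this toy scan only the third locus shows strong differentiation, which is the kind of isolated peak expected in the early phases of divergence with gene flow.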
Tools for an evolutionary approach in the omics era
Although the contribution of classical genetics to the comprehension of basic cell processes and development goes without saying, next generation sequencing (NGS) changed the approach to understanding molecular mechanisms on a large scale. NGS comprises a set of rapid and cost-effective platforms or technologies to generate thousands to many millions of sequencing reactions per run using DNA or RNA, to produce high-throughput sequencing data (van Dijk et al., 2018; Tiwary et al., 2021). NGS technologies have played an important role in many fields of research, including evolutionary biology, and enabled the development of new research questions by changing the way we think and carry out ecological and evolutionary studies, such as incorporating non-model organisms based on natural systems (Tautz et al., 2010).
Does genomic differentiation exist among insect populations that feed on various host plants?
When we approach phytophagous insect–host plant interactions for an organism such as A. obliqua, for which genomic information is scarce but there is some knowledge of population structure, it is necessary to study the biology of the species. Information on life history, behavior, and reproduction will help understand the possible influence of genomic differentiation in the insects at a given time during their interaction with host plants.
To track genomic differentiation and divergence patterns, techniques such as RAD-seq, WGS, pool-seq, and RNA-seq (Figure 2, Box 2) can be used. For example, DNA markers associated with restriction sites (RAD-seq) have been one of the most widely used methodologies to evaluate genomic differentiation, ecological specialization, and local adaptation (Catchen et al., 2013; Keller et al., 2013; Egan et al., 2015; Guo et al., 2016a; Marques et al., 2016; Picq et al., 2016; Koubínová et al., 2017). This methodology produces thousands of SNPs throughout the entire genome. RAD-seq has been used to characterize genomic variation in populations with gene flow and to identify genomic markers of differentiation or selection associated with different environmental parameters in non-model organisms within Lepidoptera (Nadeau et al., 2014) and fruit flies (Egan et al., 2015; Aguirre-Ramirez et al., 2021) (for more information on the family of RAD-seq methodologies, see Andrews et al., 2016). On the other hand, WGS has two widely used approaches: de novo sequencing and re-sequencing (WGR). A great advantage of WGS over other methodologies is that it will provide a complete genome assembly for the study subject, thereby providing a genomic resource that can be used for comparison with other individuals. WGS allows comparison of both genome content and architecture. Gouin et al. (2017) used WGS to obtain the complete genome of a highly polyphagous Lepidoptera species, Spodoptera frugiperda J.E. Smith (fall armyworm), and found an expansion (through gene duplication) of genes related to chemoreception, detoxification, and digestion, when compared to specialist Lepidoptera species. Findings suggested gene duplication is a possible adaptive mechanism for polyphagy (Gouin et al., 2017).

One way to maximize resource use is to pool samples (pool-seq). Pool-seq provides a better cost–benefit strategy compared to approaches in which samples are analyzed or sequenced individually (Schlötterer et al., 2014), especially when the research question is population-based and aims to determine genomic variation patterns. Pool-seq has been used in variant filtering (Anand et al., 2016) and for association mapping of non-model organisms (Micheletti & Narum, 2018). The use of this methodology implies the loss of individual genotypes, which must be taken into account before conducting the study (for more information on pool-seq see Schlötterer et al., 2014). When pooling samples, it is important to keep the original individual DNA samples so that variations can be demultiplexed in case interesting variation is detected. Gautier et al. (2013) directly estimated differences in allelic frequencies of genotyped individual vs. pooled samples and found that pool-seq is a precise and suitable strategy for comparing genomic patterns of genetic variation.
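The comparison between pooled and individually genotyped samples can be sketched as follows. This minimal Python example estimates an allele frequency from pooled read counts and from individual diploid genotypes and reports the difference. The function names, the read counts, and the genotypes are hypothetical, and the sketch ignores sources of error that a real pool-seq analysis would need to model, such as unequal DNA contribution per individual and sequencing error.

```python
def pool_allele_frequency(alt_reads, total_reads):
    """Allele frequency estimated as the share of reads carrying the allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def genotype_allele_frequency(genotypes):
    """Allele frequency from diploid genotypes coded as 0, 1, or 2 copies."""
    if not genotypes:
        raise ValueError("at least one genotype is required")
    return sum(genotypes) / (2 * len(genotypes))


# Hypothetical data for one SNP: ten flies genotyped individually,
# and the same ten flies sequenced as a single pool.
genotypes = [0, 1, 2, 1, 0, 1, 1, 2, 0, 1]
individual_freq = genotype_allele_frequency(genotypes)
pool_freq = pool_allele_frequency(alt_reads=46, total_reads=100)

difference = abs(pool_freq - individual_freq)
print("individual-based frequency:", individual_freq)
print("pool-based frequency:", pool_freq)
print("absolute difference:", round(difference, 3))
```

The pooled estimate recovers the population-level frequency but, as noted above, not which individuals carry the allele.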
How do gene expression and regulation relate to the use of different host plants by phytophagous insects?
DNA polymorphisms and gene expression changes in insects are often related to responses to physical and chemical host plant defenses. RNA-seq, which contributes to the understanding of phenotypical and behavioral plasticity in wild populations, is one of the most popular technologies for approaching questions in ecological and evolutionary research (Todd et al., 2016). Up- and downregulation of genes can be related to differences in population responses to various host plants as part of a plastic response.
Differential gene expression related to detoxification or digestion, for example, could explain how polyphagous insects are able to survive in different environments. These differentially expressed genes (that can be related to traits associated with resource use) are candidates for the study of phenotypic plasticity through reciprocal transplant experiments and potential ecological differentiation among populations. RNA-seq has the advantage that it can be used with or without a reference genome, so this tool has been implemented with a large number of organisms. For example, RNA-seq was used with the comma butterfly, Polygonia c-album (L.), to measure its gene expression on various hosts (De La Paz Celorio-Mancera et al., 2013). The results showed that plasticity is responsible for the ability of insects to feed on new hosts, thus explaining the generalist behavior of the species (De La Paz Celorio-Mancera et al., 2013). A difference in the expression of genes associated with digestion and detoxification has been found when A. obliqua flies infest different host species (Velasco-Cuervo et al., 2021).
Additionally, RNA-seq provides the nucleotide sequences of the expressed transcripts available, and therefore SNPs can also be detected. Differences in expression can be regulated through variation in the gene coding sequence, promoters, DNA structure, or trans-regulating genes, as well as by means of epigenetic mechanisms. Further analyses are usually necessary in order to associate the polymorphisms found in the transcripts with phenotypic changes. However, differentially expressed genes that are putatively related to resource use or reproduction can be used in ecological specialization studies. These candidate genes could help to understand how insects are able to respond to the physical and chemical conditions of host plants. Yet, understanding genetic mechanisms associated with host change in generalists vs. specialists also requires knowing the transcriptome, for example, for each of the host-races in different hosts (Etges, 2019).
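The idea of up- and downregulation between hosts can be made concrete with a short Python sketch. It assumes a simple counts-per-million normalization followed by a log2 fold change with a pseudocount. The gene labels, the read counts, and the choice of pseudocount are hypothetical. Published RNA-seq analyses rely on replicated samples and dedicated statistical models rather than a single ratio, so this is only an illustration of the quantity being compared.

```python
import math


def counts_per_million(counts):
    """Scale raw read counts to counts per million mapped reads."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("library has no reads")
    return {gene: c * 1e6 / total for gene, c in counts.items()}


def log2_fold_change(cpm_reference, cpm_treatment, pseudocount=1.0):
    """log2(treatment / reference) per gene; pseudocount avoids division by zero."""
    return {
        gene: math.log2(
            (cpm_treatment[gene] + pseudocount) / (cpm_reference[gene] + pseudocount)
        )
        for gene in cpm_reference
    }


# Hypothetical raw counts for larvae reared on two hosts.
red_mombin = {"detox_gene": 100, "digestion_gene": 400, "housekeeping_gene": 500}
carambola = {"detox_gene": 400, "digestion_gene": 100, "housekeeping_gene": 500}

lfc = log2_fold_change(counts_per_million(red_mombin), counts_per_million(carambola))
for gene, value in lfc.items():
    print(gene, round(value, 2))
```

A positive value indicates higher expression on the second host and a negative value lower expression; a value near zero indicates no change.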
Other genomic tools can help determine the specific role of certain genes along the differentiation continuum. In the targeted capture method (Box 2), we use previous knowledge of our study organism, or a sister species, to design probes to determine the differential expression of previously studied genes. In the case of A. obliqua, this methodology could be implemented based on knowledge of candidate genes associated with host use. A good example of how targeted capture can be used for understanding host-race adaptation to a new environment is proposed in a study by Eyres et al. (2016), where the role of chemo-sensorial genes in host dependent pea aphid races was analyzed. Genes associated with olfactory and taste receptors were highly differentiated, presenting a larger proportion of SNPs. These genes are good candidates for explaining local adaptation in these aphids.
Differences in gene expression can largely explain the ability of both generalist and specialist phytophagous insects to infest different host plants (Birnbaum & Abbot, 2020). Over the past years, however, multiple mechanisms involved in the regulation of genetic expression that may be responsible for, or intermediaries in, the plastic response of insects, have been characterized. Among those regulatory mechanisms, the role of some small RNAs such as miRNA in post-transcriptional regulation has been characterized. It has been proposed that the main role of miRNA is gene repression through a mechanism of mRNA destabilization or translation inhibition (Cora et al., 2017). Due to the many processes where miRNAs seem to be involved (Cai et al., 2009), some researchers propose that miRNA plays an important role in the adaptation of phytophagous insects to the host plants that they infest (Moné et al., 2018). In a study of S. frugiperda, diet-responsive miRNAs were found to be related to the control of metabolic routes necessary for homeostasis during phytophagous insect–host plant interaction (Moné et al., 2018). Recently, differences in the expression of miRNAs were found in A. obliqua when infesting different host species (Lemos-Lucumi et al., 2022). The target genes of these miRNAs were associated with digestion processes such as carbohydrate metabolism and detoxification, so they could be modulating the interaction of the flies with their hosts (Lemos-Lucumi et al., 2022).
Another way to explore the mechanisms of gene expression regulation is to determine genome methylation patterns. Although the effects of epigenetic changes on ecological specialization have not been explored, epigenetic regulation is considered to be one of the mechanisms involved in this process (Loxdale & Harvey, 2016). Using bisulfite sequencing (BS-seq) (Box 2), differences in methylation patterns could be related to changes in transcriptional profile in different castes of the honey bee, Apis mellifera L., with different diets (Foret et al., 2012). This is, without doubt, a field that requires exploration.
How to identify regions of genomic divergence underlying phenotypic traits?
Association studies correlate gene candidates with phenotype characteristics by applying analyses such as GWAS and QTL mapping. QTL mapping has allowed correlating specific loci with the variation of quantitative characteristics in the phenotype. Via et al. (2012) used two host races of the pea aphid as model system, found highly differentiated genomic regions (FST outliers) between them, and associated characteristics causing ecological isolation with a reproductive base (host choice and fertility). They concluded this constituted a good model system for incipient ecological speciation with gene flow (Via et al., 2012). On the other hand, through GWAS, Ragland et al. (2017) analyzed the relationship of two phenotypic traits (initial diapause intensity and diapause termination time) with SNPs and linkage groups of two host races of apple maggot, Rhagoletis pomonella (Walsh), using genotyping-by-sequencing (GBS). The study found evidence of highly differentiated genomic regions between the host races affecting both traits and/or only one trait. This suggests adaptive divergence of this species to different hosts (Ragland et al., 2017). More recently, GWAS using gene expression – expression quantitative trait loci (eQTLs) and transcriptome-wide association studies (TWAS) – seem promising in the search for genomic regions that contribute to divergent gene expression and its effects on ecological specialization (Nosil, 2012; Eyres et al., 2016; Wainberg et al., 2019). Although there have been no studies of this kind on phytophagous insect–host plant interactions, these approaches could help to understand the role of regulating elements in the variation of gene expression levels at the genomic architecture level (e.g., the prevalence of cis and trans elements) and the heritability of this variation (Gilad et al., 2008).
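To make the idea of FST outliers concrete, the following minimal Python sketch computes a per-SNP FST between two host races from allele frequencies and ranks the most differentiated SNPs. It is an illustration under simplifying assumptions, not the procedure used in the studies cited above: it uses the basic heterozygosity-based definition of FST with equal population weights, ignores sample-size corrections (which estimators such as that of Weir and Cockerham address), and uses a plain top-fraction cutoff rather than a null model. The function names and the SNP frequencies are hypothetical.

```python
def fst_per_snp(p1, p2):
    """F_ST for one biallelic SNP from allele frequencies in two populations.

    Uses F_ST = (H_T - H_S) / H_T with equal population weights
    (a simplification; no sample-size correction is applied).
    """
    p_bar = (p1 + p2) / 2.0
    h_t = 2.0 * p_bar * (1.0 - p_bar)          # expected heterozygosity, pooled
    if h_t == 0.0:
        return 0.0                             # monomorphic site: no differentiation
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0  # mean within-population
    return (h_t - h_s) / h_t


def fst_outliers(snp_freqs, top_fraction=0.05):
    """Return the most differentiated SNPs and all per-SNP scores.

    snp_freqs maps a SNP id to (frequency in race 1, frequency in race 2).
    The top_fraction cutoff is an arbitrary illustrative choice.
    """
    scores = {snp: fst_per_snp(p1, p2) for snp, (p1, p2) in snp_freqs.items()}
    n_top = max(1, int(round(top_fraction * len(scores))))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n_top], scores


# Hypothetical allele frequencies for four SNPs in two host races.
example = {
    "snp1": (0.5, 0.5),
    "snp2": (0.1, 0.9),
    "snp3": (0.4, 0.5),
    "snp4": (0.3, 0.3),
}
top, all_scores = fst_outliers(example, top_fraction=0.25)
```

In practice, candidate outliers identified in this way would still need to be tested against a demographic null expectation before being interpreted as regions under divergent selection.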
Clustered regularly interspaced short palindromic repeats-associated protein Cas9 technology (CRISPR/Cas9) (Box 2) for genome editing (Hwang et al., 2013) has been implemented in association studies and in the validation of target gene function. CRISPR/Cas9 has been an important tool in the study of local adaptation and speciation over the last few years (Bono et al., 2015). Gene knockouts are a classic tool for confirming an association between genotype and phenotype (Bono et al., 2015). Other applications, such as the creation of structural modifications (Kraft et al., 2015), transgenic design (Chen et al., 2020), and regulation of gene expression (Gilbert et al., 2013), can be used to examine phenotype variation and its effect on ecological and evolutionary processes such as adaptation (Bono et al., 2015). For example, Ravinet et al. (2017) have emphasized the importance of this methodology in the identification of gene flow barriers – one of the key aspects in ecological specialization.
Although still a tool with limitations in non-model organisms, CRISPR/Cas9 tests have been carried out in insects, including some Anastrepha species. For example, the technique can be used to determine whether germ lines can be manipulated to develop population control programs through sterilization (Li & Handler, 2019). Some mechanisms of phytophagous insect–host plant interactions have also been studied with gene editing. Serine protease-type proteins play an important digestive role in phytophagous insects, and the regulation of the expression of these genes modulates the response to the protease inhibitors produced as defense mechanisms by host plants (Vidal et al., 2019; Wang et al., 2020). The Old World bollworm, Helicoverpa armigera Hübner, a polyphagous and economically important crop pest, has a great diversity of serine protease-type proteins with potential to adapt to their host plants (Wang et al., 2020). To study the effect of serine protease-type proteins, specifically trypsin-type proteins, on gene expression in H. armigera, Wang et al. (2020) used CRISPR/Cas9 to knock out a cluster of 18 genes that code for trypsin-type proteins. The result was an overexpression of other trypsin and chymotrypsin genes and defense genes. The overexpression of these genes compensates for the knockout of the other genes and allows maintaining a functional insect metabolism during its interaction with host plant protease inhibitors (Wang et al., 2020). This is a good example of how CRISPR/Cas9 may help understand the molecular mechanisms that phytophagous insects use to adapt to their host plants. In the case of A. obliqua, for which few genomic resources are available, CRISPR/Cas9 could be used to examine the association between specific genes with known function and specific phenotypic traits involved in the use of host plants.
How do microbial communities contribute to the use of host plants by phytophagous insects?
Several studies have shown the significant role of the microbiota in the use of host plants by phytophagous insects. The microbiota includes bacteria, archaea, and fungi, groups of organisms that can contribute to nutrition, digestion, metabolism, and detoxification in these insects. Various approaches – such as metabarcoding, metagenomics, and metatranscriptomics – provide insight into the diversity, abundance, and function of the microbiota in insects. Ventura et al. (2018) and Gallo-Franco & Toro-Perea (2020) investigated bacterial diversity and abundance in larval and adult fruit flies of the genus Anastrepha using metabarcoding of the 16S rRNA gene. Ventura et al. (2018) compared four species of Anastrepha and found that bacterial diversity is higher in species with broad diets, and is also higher in adults relative to larvae. On the other hand, Gallo-Franco & Toro-Perea (2020) compared populations of A. obliqua flies infesting three species of host plants and found differences in diversity and abundance in one of the fruits (carambola), so the microbiota could be playing an important role in the establishment of this species on its host plants. Cárdenas-Hernández et al. (2022) conducted a study at the metagenomic level, where they were able to reconstruct some genomes of bacteria associated with A. obliqua and identify differences in the microbiota and microbiome when this fly infests different host plants. Some genera of bacteria and yeasts appear to be predominant in populations infesting different hosts, which could be influencing the ability of A. obliqua larvae to feed and survive in each host plant species.
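Comparisons of bacterial diversity such as those described above are often summarized with a diversity index computed from taxon read counts. The sketch below computes the Shannon index, H' = -Σ p_i ln p_i, for a single sample. It is only an illustration: the cited studies may have used other indices or rarefaction steps, and the taxon counts shown are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample contains no reads")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical 16S read counts per bacterial genus in larvae from two hosts.
larvae_mango = [500, 300, 150, 50]
larvae_carambola = [900, 80, 15, 5]

h_mango = shannon_index(larvae_mango)
h_carambola = shannon_index(larvae_carambola)
```

With these invented counts the more even community (mango) has the higher index, which is the kind of contrast a metabarcoding comparison among hosts would examine.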
Knowledge about the possible role of the microbiota in A. obliqua has accumulated in recent years (Ventura et al., 2018; Gallo-Franco & Toro-Perea, 2020; Amores et al., 2022; Cárdenas-Hernández et al., 2022). The next step is to establish a more direct relationship between the microbiome and resource use in this fruit fly. Metatranscriptomic studies will make this possible. Establishing which genes are expressed by the microbial community of fruit flies infesting different host plant species, and their activity or function within the metabolism of the flies, will elucidate the influence of these microbial communities.
The need for multidisciplinary approaches
The answers to research questions regarding the study of phytophagous insect–host plant interactions ought to be found through an interdisciplinary approach. Omics technologies provide a large amount of information to produce robust conclusions on genomic differentiation, and the relationship between the phenotype and regions of the genome under selection. These technologies should be used together with other methodologies such as selection experiments, reciprocal transplants, fitness measurements, and experimental evolutionary studies that can help understand phytophagous insect–host plant interactions holistically. For example, selection experiments contribute to elucidating adaptation and the genetic basis of groups with adaptive traits (Fuller et al., 2005). When combined with other genomic tools, selection experiments have also helped determine which specific regions, genes, and/or SNPs seem to be affected by selection (Doellman et al., 2018).
Reciprocal transplant experiments and fitness measurements can also provide strong evidence for divergent selection. These experiments consist of the evaluation of organism fitness under various controlled environments. Together with omics approaches, they can provide evidence of the genomic regions underlying phenotypic traits associated with phytophagous insect adaptation to the host plant. We propose a methodology based on reciprocal transplants, fitness measures, and omics technologies to understand the molecular mechanisms in insects associated with host use, for example in A. obliqua (Figure 3). In the first stage, a reciprocal transplant experiment is carried out with two A. obliqua populations colonizing mango and carambola. First instars of either population can be collected and transplanted to the opposite fruit host (Figure 3A). This type of experiment allows various analyses, such as fitness measurements in terms of number of larvae that survive in reciprocal transplants vs. control conditions (Figure 3A). The experiments also allow local adaptation to be tested directly by comparing native to non-native population fitness (Savolainen et al., 2013). The expected result, with local adaptation, is that the population performance of A. obliqua coming from mango would be greater in mango than in carambola. This could be tested using reaction norms (Figure 3B).
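The fitness comparison in this first stage can be sketched in a few lines of Python. The example below converts larval survival counts from a two-population, two-host reciprocal transplant into survival proportions (the points of a reaction norm) and computes two simple contrasts: performance at home vs. away for one population, and local vs. foreign population on one host. All counts, population labels, and function names are hypothetical, and a real analysis would include replication and a statistical test of the population-by-host interaction.

```python
def survival_rates(counts):
    """Map (population, host) -> proportion of transplanted larvae that survived."""
    return {key: survivors / transplanted
            for key, (survivors, transplanted) in counts.items()}


def home_vs_away(rates, population, home_host):
    """Survival of a population on its native host minus its mean survival elsewhere."""
    away = [r for (pop, host), r in rates.items()
            if pop == population and host != home_host]
    return rates[(population, home_host)] - sum(away) / len(away)


def local_vs_foreign(rates, host, local_population):
    """Survival of the native population on a host minus that of transplanted populations."""
    foreign = [r for (pop, h), r in rates.items()
               if h == host and pop != local_population]
    return rates[(local_population, host)] - sum(foreign) / len(foreign)


# Hypothetical data: (surviving larvae, larvae transplanted).
counts = {
    ("mango_pop", "mango"): (80, 100),
    ("mango_pop", "carambola"): (50, 100),
    ("carambola_pop", "carambola"): (70, 100),
    ("carambola_pop", "mango"): (40, 100),
}
rates = survival_rates(counts)
mango_home_advantage = home_vs_away(rates, "mango_pop", "mango")
mango_local_advantage = local_vs_foreign(rates, "mango", "mango_pop")
```

Positive values for both contrasts in both populations would be consistent with the local adaptation pattern expected above, in which each population performs best on its native host.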

In a second stage, differential gene expression can be measured or SNPs under selection can be detected (e.g., at population level) after the populations have been subjected to the contrasting environments. This step requires the implementation of omics tools and can provide information, for example, on gene expression for a population or a species in the native vs. the new environment. In this manner, (1) the extent of the plastic response, (2) the genes directly involved in survival within the host, and (3) whether selection has occurred, can be determined. Reduction of the original plastic response can be determined (e.g., in one of the populations), thereby contributing to knowledge of the genetic bases for ecological specialization. At a final stage, methodologies such as GWAS and/or CRISPR/Cas9 can be included. For example, once divergent gene candidates or divergent genomic regions are identified by host, CRISPR/Cas9 could be used to knock out these genes and validate their function and effect on the phenotype. This workflow would answer holistic questions from an evolutionary approach, thus uncovering the mechanisms by which insects are able to use and adapt to their host plants.
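As a rough illustration of the expression comparison in this second stage, the sketch below normalizes raw read counts to counts per million (CPM) and computes a log2 fold change per gene between the native and the new host. It assumes one library per condition and a pseudocount of 1, both chosen only to keep the example short; a real experiment needs biological replicates and a dedicated statistical model to call differential expression. Gene names and counts are hypothetical.

```python
import math


def counts_per_million(counts):
    """Scale raw read counts so that each library sums to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library has no reads")
    return {gene: 1e6 * n / total for gene, n in counts.items()}


def log2_fold_change(native, transplanted, pseudocount=1.0):
    """log2(CPM in new host / CPM in native host) per gene.

    The pseudocount avoids division by zero for unexpressed genes.
    """
    cpm_native = counts_per_million(native)
    cpm_new = counts_per_million(transplanted)
    return {
        gene: math.log2((cpm_new.get(gene, 0.0) + pseudocount)
                        / (cpm_native[gene] + pseudocount))
        for gene in cpm_native
    }


# Hypothetical counts for larvae reared on the native vs. the new host.
native_host = {"detox_gene": 100, "housekeeping_gene": 900}
new_host = {"detox_gene": 400, "housekeeping_gene": 600}
lfc = log2_fold_change(native_host, new_host)
```

A gene with a large positive value here would be a candidate for a plastic response to the new host, to be confirmed with replicated data.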
Reciprocal transplant experiments are popular in phytophagous insect–host plant studies. For example, preference and performance for native vs. alternative hosts have been evaluated in the economically important borer moths Ostrinia nubilalis Hübner and Ostrinia scapulalis Walker (Orsucci et al., 2016). In these experiments, preferences for native hosts were found and the importance of these traits as reproductive isolation barriers in divergence processes was discussed (Orsucci et al., 2016). Other studies have shown the effect of diet on the colonization of new hosts. For example, in cotton aphid, Aphis gossypii Glover, common garden experiments and reciprocal transplants demonstrated that host range expansion can be measured by individual feeding experience and, in some cases, by the presence of specific genotypes (genetic background) (Ma et al., 2019). In the family Tephritidae (Ragland et al., 2015), larval performance and gene expression in Rhagoletis zephyria Snow and R. pomonella were evaluated by alternating host fruits. The results showed highly plastic differential gene expression and greater survival of each species in its native host.
Multidisciplinary approaches, together with omics approaches, provide an invaluable platform for corroborating phenotype association, local adaptation, or manipulation of model system evolution under controlled conditions (De Villemereuil et al., 2016). A multidisciplinary approach contributes to knowledge of the genetic basis and molecular mechanisms in phytophagous insect–host plant interactions.
CONCLUSIONS
In this review, we established how various omics tools can contribute to answering evolutionary questions in the context of phytophagous insect–host plant interactions. We used A. obliqua as a model for the study of these interactions in the Neotropics. Omics provide large data sets to study resource use in insects and the evolutionary implications of these interactions. The methodologies we use to study the ecology and evolution of phytophagous insects are crucial to understanding how changing hosts and the use of diverse hosts influence range expansion in insects and their ecological specialization. A holistic approach requires combining omics tools with other types of experiments, for example, common garden experiments, reciprocal transplants, fitness measurements, and/or experimental evolution assays that allow the association of the molecular mechanisms of interest with traits that contribute to the adaptation of insects to their host plants. Finally, the application of these methodologies to Neotropical species provides new knowledge on the influence of host plants on phytophagous insect population dynamics in a geographical area where information is still lacking. This could further reveal important diversification strategies that direct the evolutionary trajectories of these species.
AUTHOR CONTRIBUTIONS
Sandra Marcela Velasco Cuervo: Conceptualization (equal); investigation (lead); writing – original draft (lead). Leonardo Galindo-González: Conceptualization (equal); investigation (equal); writing – review and editing (equal). Nelson Toro Perea: Conceptualization (equal); investigation (supporting); writing – review and editing (supporting).
ACKNOWLEDGMENTS
We acknowledge the members of the Eco-Genetics and Molecular Biology Studies Group, especially Elkin Julian Aguirre Ramirez and Dr. Heiber Cardenas Henao, for discussions regarding ideas related to the study of phytophagous insects. We thank the Universidad del Valle and MinCiencias-Colombia for funding studies related to this research topic [C.I. 71115, Genómica funcional de larvas de la mosca de la fruta Anastrepha obliqua (Diptera: Tephritidae) asociada a la alimentación en ciruela, mango y carambolo]. Thanks to Global Affairs Canada, University of Alberta, and Dr. Stephen Strelkov, who financed part of the research through the Emerging Leaders in the Americas Program (ELAP). Finally, thanks to Jorge Mario Ruiz Idarraga, who helped organize this idea.
Additional Supporting Information may be found in the online version of this article.
Box 1. Generalists vs. specialists
Generalist herbivorous insects feed or oviposit on two or more host plant families (Clarke, 2016). Generalism in herbivorous insects is considered rare, with only an estimated 2% of the insects adopting this lifestyle (Clarke, 2016). Clarke et al. (2016) have suggested that generalism is driven by unique sets of biological attributes restricted to one lineage, which would explain why for some lineages generalism is more frequent.
Specialist herbivorous insects feed or oviposit on only one or a few host plants, which are generally phylogenetically related (Clarke, 2016). In these insects, many factors may contribute to ecological specialization or local adaptation, including adult fidelity to the natal host plant, specificity in responding to the physical and chemical conditions of the host plant they infest, and the close relationship between insect life cycle and host plant phenology.
Generalists, perhaps less specifically adapted to the characteristics of their host plants, have a more evolvable diet, which implies that associations with new host plants may evolve faster into generalism, as evidenced by comparative phylogenetic studies (Hardy & Otto, 2014; Hardy et al., 2020). The differences between generalist and specialist herbivorous insects are not restricted to the breadth of their diets. Hardy et al. (2020) consider four types of potential differences that could affect host-use adaptation: genetic architecture, phenotypic plasticity, population size and structure, and community interactions (with host plants, natural enemies, and endosymbionts).
Box 2. Omics strategies used to study phytophagous insect–host plant interactions
RAD-seq (restriction site-associated DNA sequencing)
DNA is cut into fragments using a restriction enzyme, producing fragments with cohesive ends. These fragments are sequenced using high-throughput techniques, to obtain reads of the regions bordering the cut site (Davey & Blaxter, 2010). The obtained data can be mapped to a reference genome or a de novo reconstruction can be carried out. Both single nucleotide polymorphisms (SNPs) and indels (small insertions or deletions) can be identified using RAD-seq. When applied to individual samples, the technique can identify homo- and heterozygous individuals, haplotypes, and allelic frequencies (Davey & Blaxter, 2010). RAD-seq can also be used to determine allelic and allele-type frequencies (haplotypes by population) when combined with the pool-seq strategy.
WGS (whole genome sequencing)
A method to determine the complete genome sequence of an organism, often including the nuclear, mitochondrial, and/or chloroplast genomes. To obtain the complete genome sequence, various sequencing techniques can be used, including second- and third-generation sequencing, with short reads, long reads, or both.
Pool-seq (sequencing of pooled samples)
This technique uses pooled individuals for sequencing. For example, pooled samples can be used for complete genome analysis or reduced representation analysis with RAD-seq or RNA-seq. The main advantage is the cost reduction of sequencing, yet still providing good precision in the estimate of allelic frequencies of populations (Schlötterer et al., 2014). Some precautions should be considered when using this approach: (1) consideration of the pool size of the individuals, so that it represents the diversity to be captured, (2) sequencing depth, and (3) the type of research question being addressed.
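The first two precautions can be illustrated with a short calculation. The sketch below estimates an allele frequency from pooled read counts and gives its sampling variance under a simple two-stage model, which is an assumption of this example rather than a result from the cited work: chromosomes are drawn binomially from the population into the pool, every individual contributes equally to the pool, and reads are drawn binomially from the pool. Under that model, Var(p̂) = p(1 − p)[1/n + 1/c − 1/(nc)], where n is the number of pooled chromosomes and c the read depth. The function names and numbers are hypothetical.

```python
def poolseq_allele_frequency(alt_reads, ref_reads):
    """Estimate the alternative-allele frequency from pooled read counts."""
    depth = alt_reads + ref_reads
    if depth == 0:
        raise ValueError("site has no coverage")
    return alt_reads / depth


def poolseq_sampling_variance(p, n_chromosomes, depth):
    """Variance of the frequency estimate under binomial sampling at both stages.

    n_chromosomes is twice the number of pooled individuals for a diploid insect.
    Assumes equal contribution of each individual to the pool (an idealization).
    """
    return p * (1.0 - p) * (1.0 / n_chromosomes + 1.0 / depth
                            - 1.0 / (n_chromosomes * depth))


# Hypothetical site: 50 diploid larvae pooled (100 chromosomes), 50x coverage.
p_hat = poolseq_allele_frequency(alt_reads=10, ref_reads=40)
var_50x = poolseq_sampling_variance(p_hat, n_chromosomes=100, depth=50)
var_500x = poolseq_sampling_variance(p_hat, n_chromosomes=100, depth=500)
```

Comparing the two variances shows why pool size and depth must be considered together: once depth greatly exceeds the number of pooled chromosomes, the 1/n term dominates and additional sequencing adds little precision.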
RNA-seq (sequencing of RNA)
A methodology to sequence complementary DNA synthesized from RNA isolated from a cell, tissue, or organism at a particular time and/or under specific treatments. This methodology displays the transcriptional profiles of mRNA, rRNA, tRNA, and small RNAs. As it does not necessarily require a reference genome, it can be applied to wild populations in non-model species.
Targeted capture
A technology where: (1) probes complementary to the genes of interest are designed, (2) a library of expressed transcripts is prepared, and (3) the transcripts captured by the probes are sequenced by high-throughput technologies.
BS-seq (bisulfite sequencing)
A tool to detect methylated cytosine in genomic DNA by treating the DNA with sodium bisulfite, which converts non-methylated cytosine into uracil. This methodology requires that both non-treated DNA and DNA treated with bisulfite be sequenced to compare sequences and determine the localization of the methylated bases.
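The comparison of treated and untreated sequences can be sketched as follows. The example assumes two already aligned, equal-length sequences from the same strand and relies on the fact that uracil produced by bisulfite conversion is read as thymine after amplification, so a cytosine that remains C was protected by methylation and a cytosine read as T was not. Real analyses work with many reads per site and both strands; the function name and the sequences are hypothetical.

```python
def call_methylation(untreated, treated):
    """Classify each cytosine of the untreated sequence by its bisulfite-treated base.

    Returns a dict mapping 0-based position -> 'methylated', 'unmethylated',
    or 'ambiguous' (any base other than C or T in the treated sequence).
    """
    if len(untreated) != len(treated):
        raise ValueError("sequences must be aligned and of equal length")
    calls = {}
    for position, (ref_base, bs_base) in enumerate(zip(untreated.upper(), treated.upper())):
        if ref_base != "C":
            continue                      # only cytosines are informative
        if bs_base == "C":
            calls[position] = "methylated"      # protected from conversion
        elif bs_base == "T":
            calls[position] = "unmethylated"    # converted C -> U, read as T
        else:
            calls[position] = "ambiguous"       # sequencing error or variant
    return calls


# Hypothetical aligned fragment before and after bisulfite treatment.
calls = call_methylation("ACGTCCGA", "ACGTTTGA")
# calls == {1: 'methylated', 4: 'unmethylated', 5: 'unmethylated'}
```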
GWAS (genome-wide association studies)
An approach for the identification of genetic variants associated with specific traits. It usually compares groups of individuals with a specific trait of interest that varies among them. Then, the genetic variants (generally SNPs) are searched for in the genome and the allelic frequencies among the various study groups are determined.
QTLs (quantitative trait loci)
QTLs correspond to loci that are associated with quantitative traits (i.e., traits that do not follow patterns of Mendelian inheritance). This methodology requires known genomes, as the molecular markers found must be mapped in order to associate them with genomic regions of known function, so a relationship with certain phenotypic traits can be established. More recently, expression QTLs (eQTLs) have been developed; they correspond to loci that explain part of the variation in mRNA levels of expression.
CRISPR/Cas9 technology
CRISPR/Cas9 (clustered regularly interspaced short palindromic repeats-associated protein Cas9) is a method for genome editing. The system is derived from the microbial immune system. It generally involves endonuclease Cas9 and a single-guide RNA (sgRNA). The sgRNA is complementary to the region to be edited and the endonuclease will produce cuts in both DNA strands. Due to the damage produced, the cell machinery will attempt to repair the cuts, resulting in indel mutations in the region. If editing is performed on germline cells, the mutation is potentially heritable (Bono et al., 2015).
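A first practical step in applying this system is locating candidate target sites in a gene of interest. The sketch below scans the forward strand of a DNA sequence for 20-nucleotide protospacers immediately followed by an NGG motif, the protospacer-adjacent motif (PAM) recognized by the widely used Streptococcus pyogenes Cas9. The PAM requirement is not discussed in this review and is added here as background. The function name and test sequence are hypothetical, and the sketch does not search the reverse strand or check for off-target matches, both of which a real guide design would require.

```python
def find_sgrna_targets(sequence, protospacer_len=20):
    """List (start, protospacer, PAM) for forward-strand sites followed by NGG.

    start is the 0-based position of the first protospacer base.
    """
    seq = sequence.upper()
    hits = []
    # The last valid start leaves room for the protospacer plus a 3-nt PAM.
    for start in range(len(seq) - protospacer_len - 2):
        pam = seq[start + protospacer_len:start + protospacer_len + 3]
        if pam[1:] == "GG":               # N can be any base; the next two must be G
            hits.append((start, seq[start:start + protospacer_len], pam))
    return hits


# Hypothetical fragment: 20 A's, then the PAM 'TGG', then filler bases.
targets = find_sgrna_targets("A" * 20 + "TGG" + "CCC")
# targets == [(0, 'AAAAAAAAAAAAAAAAAAAA', 'TGG')]
```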
Open Research
DATA AVAILABILITY STATEMENT
Data sharing not applicable - no new data generated, or the article describes entirely theoretical research