Volume 27, Issue 18 pp. 3582-3598
ORIGINAL ARTICLE

A genome scan of diversifying selection in Ophiocordyceps zombie-ant fungi suggests a role for enterotoxins in co-evolution and host specificity

Noppol Kobmoo (Corresponding Author)

Ecologie Systématique Evolution, Université Paris-Sud, CNRS, AgroParisTech, Université Paris-Saclay, Orsay, France

National Center for Genetic Engineering and Biotechnology (BIOTEC), National Science and Technology Development Agency (NSTDA), Khlong Luang, Thailand

Correspondence

Noppol Kobmoo, BIOTEC, NSTDA, Thailand Science Park, Khlong Neung, Khlong Luang, 12120 Pathum Thani, Thailand.

Duangdao Wichadakul

Chulalongkorn University Big Data Analytics and IoT Center (CUBIC), Department of Computer Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok, Thailand

Center of Excellence in Systems Biology, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand

Nuntanat Arnamnart

National Center for Genetic Engineering and Biotechnology (BIOTEC), National Science and Technology Development Agency (NSTDA), Khlong Luang, Thailand

Ricardo C. Rodríguez De La Vega

Ecologie Systématique Evolution, Université Paris-Sud, CNRS, AgroParisTech, Université Paris-Saclay, Orsay, France

Janet J. Luangsa-ard

National Center for Genetic Engineering and Biotechnology (BIOTEC), National Science and Technology Development Agency (NSTDA), Khlong Luang, Thailand

Tatiana Giraud

Ecologie Systématique Evolution, Université Paris-Sud, CNRS, AgroParisTech, Université Paris-Saclay, Orsay, France

First published: 27 July 2018

Abstract

Identification of the genes underlying adaptation sheds light on the biological functions targeted by natural selection. Searches for footprints of positive selection, in the form of rapid amino acid substitutions, and the identification of species-specific genes have proved to be powerful approaches to identifying the genes involved in host specialization in plant-pathogenic fungi. We used an evolutionary comparative genomic approach to identify genes underlying host adaptation in the ant-infecting genus Ophiocordyceps, which manipulates ant behaviour. A comparison of the predicted genes in the genomes of species from three species complexes—O. unilateralis, O. australis and O. subramanianii—revealed an enrichment in pathogenesis-associated functions, including heat-labile enterotoxins, among species-specific genes. Furthermore, these genes were overrepresented among those displaying significant footprints of positive selection. Other categories of genes suspected to be important for virulence and pathogenicity in entomopathogenic fungi (e.g., chitinases, lipases, proteases, core secondary metabolism genes) were much less represented, although a few candidate genes were found to evolve under positive selection. An analysis including orthologs from other entomopathogenic fungi in a broader context showed that positive selection on enterotoxins was specific to the ant-infecting genus Ophiocordyceps. Together with previous studies reporting the overexpression of an enterotoxin during behavioural manipulation in diseased ants, our findings suggest that heat-labile enterotoxins are important effectors in host adaptation and co-evolution in the Ophiocordyceps entomopathogenic fungi.

1 INTRODUCTION

The identification of genes underlying adaptation is a major goal in evolutionary biology, as it can shed light on the biological functions targeted by natural selection and the genetic mechanisms generating new, adaptive variants. Innovation may be generated during evolution by gene duplication followed by rapid amino acid substitutions in one of the copies (Ohno, 1970; Zhang, Zhang, & Rosenberg, 2002). Gene losses can also be adaptive (Juárez-Vázquez et al., 2017), particularly in pathogens, as the absence of a molecule recognized by the host may enable the pathogen to colonize its host without triggering a response from the host immune system (Albalat & Cañestro, 2016; Ghanbarnia et al., 2015; Rouxel & Balesdent, 2017). Gene duplications and losses result in the presence of species-specific genes, which are often overrepresented among the genes involved in adaptation (Gladieux et al., 2014; Lespinet, Wolf, Koonin, & Aravind, 2002; Zhou et al., 2015). Adaptation may also occur through positive selection, with rapid amino acid substitutions, typically detected as higher rates of nonsynonymous substitutions (dN) than of synonymous substitutions (dS) among orthologous genes of closely related species (Ina, 1996; Kimura, 1983). Comparisons of dN/dS ratios to neutral expectations therefore also constitute a powerful approach to identifying genes under recurrent positive selection (Yang & Nielsen, 1998; Yang, Nielsen, Goldman, & Krabbe Pedersen, 2000; Yang, Wong, & Nielsen, 2005).

Pathogens are particularly interesting models for investigations of the genomic mechanisms of adaptation, as they are locked in an arms race with their hosts, leading to continuous, rapid evolution (Anderson et al., 2010; Kurtz, Schulenburg, & Reusch, 2016). Identification of the genes underlying host-specific adaptations in pathogens improves our fundamental understanding of natural selection and evolution, but it also has more applied implications, shedding light on major epidemics and disease emergence in plants and animals (Möller & Stukenbrock, 2017).

Fungi are the principal pathogens of plants (Anderson et al., 2004), and they also represent threats to the health of many animals (Fisher et al., 2012; Sexton & Howlett, 2006). Many studies have searched for genes under positive selection as a means of identifying genes and functions involved in the species-specific adaptation of fungal pathogens of plants (Aguileta, Refrégier, Yockteng, Fournier, & Giraud, 2009; Möller & Stukenbrock, 2017). For example, in the Microbotryum and Botrytis fungal plant pathogens, comparative transcriptomics studies identified many genes under recurrent positive selection that were involved in biological processes important for recognition and cell signalling between the host and the pathogen (Aguileta et al., 2010, 2012). More recently, next-generation sequencing has made it possible to perform genomewide scans in plant-pathogenic fungi, resulting in the identification of an array of effectors under positive selection (Badouin et al., 2017; Poppe, Dorsheimer, Happel, & Stukenbrock, 2015; Schirrmann et al., 2018; Stukenbrock et al., 2011; Wicker et al., 2013) and species- or lineage-specific genes underlying adaptations (Baroncelli et al., 2016; Hartmann, Rodríguez de la Vega, Brandenburg, Carpentier, & Giraud, 2018).

By contrast, far fewer such studies have been performed on entomopathogenic fungi (Wang & Wang, 2017), despite the importance of identifying genes underlying host-specific adaptation for the use of these fungi as biological control agents against insect pests in agriculture (Wang & Feng, 2014). Furthermore, an understanding of host specificity and evolution in these insect pathogens is of fundamental interest in its own right, particularly for fungi able to manipulate the behaviour of the insect host for their own benefit, as in the “zombie-ant” phenomenon. Most genomic studies on entomopathogenic fungi have focused on species with agricultural applications such as Beauveria bassiana and Metarhizium anisopliae (Gao et al., 2011; Hu et al., 2014; Pattemore et al., 2014). However, these species have broad host ranges and may not, therefore, be the best models in which to study the genomics of host specialization and co-evolution. Specialist Metarhizium strains and species generally have fewer genes, with notably fewer genes encoding host-killing toxins (Wang, Leclerque, Pava-Ripoll, Fang, & St. Leger, 2009), but more genes evolving under positive selection than generalists (Hu et al., 2014). However, it remains unclear how adaptation has shaped the genomes of closely related fungal entomopathogens specializing on different hosts.

We therefore tried to identify genes involved in host specificity in three complexes of closely related species from the genus Ophiocordyceps (Hypocreales, Ascomycota): O. unilateralis sensu lato, O. subramanianii s.l. and O. australis s. l. One of the key features of these pathogens is their ability to manipulate their hosts to promote their own dispersal. Infected ants, often described as “zombie ants,” leave their nests and develop erratic behaviour, wandering alone into vegetation and then biting into a leaf located at a precise height and orientation optimal for subsequent fungal dispersal just before they die. Fungal spores produced from the diseased ant are thus dispersed farther, from a height (de Bekker, Ohm, Evans, Brachmann, & Hughes, 2017; Hughes et al., 2011, 2016; Pontoppidan, Himaman, Hywel-Jones, Boomsma, & Hughes, 2009). Ophiocordyceps unilateralis s.l. is a highly diverse complex of pathogenic cryptic species specific to formicine ants. It is distributed worldwide, and many species occur together in sympatry while displaying strong host specificity (Araújo, Evans, Kepler, & Hughes, 2018; Evans, Elliot, & Hughes, 2011; Kobmoo, Mongkolsamrit, Tasanathai, Thanakitpipattana, & Luangsa-Ard, 2012; Kobmoo et al., 2015). Ants develop erratic behaviour only when infected with their specific pathogen species (de Bekker et al., 2014; Sakolrak, Blatrix, Sangwanit, Arnamnart, & Kobmoo, 2018). The taxonomy and phylogeny of the other ant-manipulating Ophiocordyceps species complexes have been studied in less detail, but host specificity is also considered to be the rule for these other taxa (Araújo et al., 2018).

We conducted a comparative genomic study of ant-infecting Ophiocordyceps species, with the aim of identifying genes underlying host specificity by searching for species-specific genes and genes evolving under positive selection. We sequenced the genomes of two closely related species of the O. unilateralis complex from Thailand: O. camponoti-leonardi and O. camponoti-saundersi, specific to the ants Colobopsis leonardi and C. saundersi, respectively. We also improved the available genome assembly of another species of this complex, O. polyrhachis-furcata, specific to Polyrhachis furcata (Wichadakul et al., 2015), and used the published genomes of other ant-infecting Ophiocordyceps species (de Bekker et al., 2017): one genome of each of two species of O. unilateralis s.l., O. kimflemingiae from the United States infecting Camponotus castaneus (Araújo et al., 2018; de Bekker et al., 2015) and O. camponoti-rufipedis from Brazil specific to C. rufipes (Araújo et al., 2018; Evans et al., 2011); one genome of O. subramanianii s.l. from a ponerine ant in Ghana; one genome of each of two strains of O. australis s.l. found on different ponerine ant species, from Ghana and Brazil, probably belonging to different cryptic species (de Bekker et al., 2017).

The entomopathogenic fungi of the order Hypocreales are known to infect their host by penetrating the cuticle (Boomsma, Jensen, Meyling, & Eilenberg, 2014). This process requires an array of proteinases, lipases and chitinases. The acquisition of nutrients from the host requires proteases and glycoside hydrolases, including trehalases in particular, as trehalose is a major carbon source present in the insect haemolymph (Thompson, 2003). Secondary metabolites, including toxins, help to combat the host immune system and eventually kill the insect (Ortiz-Urquiza, Riveiro-Miranda, Santiago-Álvarez, & Quesada-Moraga, 2010; Schrank & Vainstein, 2010). Ophiocordyceps polyrhachis-furcata has a more extensive family of genes encoding putative heat-labile enterotoxins than other specialist entomopathogenic fungi (Wichadakul et al., 2015), and some of these genes are expressed during host-specific behavioural manipulation. Heat-labile enterotoxins may, therefore, act as neuromodulators (de Bekker et al., 2015). We hypothesized that enterotoxin-coding genes would be under recurrent positive selection in ant-manipulating Ophiocordyceps fungi, as they have probably been involved in co-evolution with the host and in host-specific adaptation. Small proteins secreted by fungal pathogens are often involved in interactions with the host (Barrett & Heil, 2012; Rafiqi, Ellis, Ludowici, Hardham, & Dodds, 2012). We therefore conducted genome scans for positive selection and focused on the heat-labile enterotoxin gene family and small secreted proteins. We conducted formal tests for positive selection (statistical comparisons of models of evolution with and without diversifying selection). As such tests detect only highly recurrent and rapid positive selection, we also investigated the 5% of genes with the highest dN/dS values. High dN/dS values, even if below 1, may be indicative of positive selection at a few sites in the protein, although they may also result from relaxed selection. 
In several classes of genes thought to be important for virulence and pathogenicity in entomopathogenic fungi (e.g., chitinases, lipases, proteases, small secreted proteins), only a few genes showed signs of selection or species specificity. By contrast, we found that heat-labile enterotoxins were overrepresented among both the species-specific genes and the genes with significant footprints of positive selection. An analysis including enterotoxin-encoding genes from other entomopathogenic fungi (Hypocreales) that do not manipulate host behaviour showed that positive selection was specific to the ant-infecting genus Ophiocordyceps. These findings suggest that heat-labile enterotoxins are important effectors involved in host adaptation and co-evolution in entomopathogenic Ophiocordyceps fungi.

2 MATERIALS AND METHODS

2.1 Sampling and sequencing

In 2015, we collected a sample of O. camponoti-leonardi (strain NK511ss-8) from Kalayaniwattana district, in Chiang Mai province in Thailand, and a sample of O. camponoti-saundersi (strain NK405ss-6) from the Phu Kiew National Park, in Chaiyaphum province. For O. polyrhachis-furcata, we used the existing reference genome (strain BCC54312) (Wichadakul et al., 2015). We aimed to improve this reference genome, but the original strain BCC54312 could not be grown from the culture collection. We therefore collected three additional samples of this species (strains NK275ss-12, NK142ss and NK294ss-20), in 2013 and 2014, from the same site as the reference strain, in Khao Yai National Park, Nakhon Ratchasima Province. The collected samples were isolated and grown as described by Wongsa, Tasanatai, Watts, and Hywel-Jones (2005). We complied with the Nagoya Protocol on access and benefit-sharing, by obtaining authorization from the Department of National Parks, Wildlife and Plant Conservation (DNP) at the Ministry of Natural Resources and Environment of Thailand for all strain collections. After two to three months of growth on Grace Insect Cell Medium (Sigma-Aldrich), the mycelia and spores were harvested and DNA was extracted with the NucleoSpin® Soil kit (Macherey-Nagel). The long incubation period was necessary because O. unilateralis species from Thailand are very fastidious to grow and require a few successive steps of culture scale-up to yield enough material for DNA extraction. Genomic libraries were constructed (150-bp paired-end reads) for sequencing with an Illumina HiSeq3000 machine at the GenoToul platform (Toulouse, France).

2.2 Read pretreatment, de novo assembly and improvement of the reference genome

The raw reads were trimmed to remove adapters and low-quality bases (quality score < 20) from their ends. Duplicate reads were removed using Picard Tools MarkDuplicates. The reference genomes for O. camponoti-leonardi and O. camponoti-saundersi were assembled de novo with SPAdes (Bankevich et al., 2012), which progressively integrates k-mers of increasing size. The k-mer sizes used were 21, 33, 55, 77, 99, 119 and 127 for NK405ss-6, and 21, 33, 55, 77, 99 and 115 for NK511ss-8. The appropriate maximum k-mer sizes were estimated with Kmergenie (Chikhi & Medvedev, 2014).
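The end-trimming step can be illustrated with a minimal Python sketch. The text does not name the trimming tool, so this is not the pipeline actually used: it assumes Phred+33 encoded quality strings, and the function name `trim_low_quality_ends` is hypothetical.

```python
from typing import Tuple


def trim_low_quality_ends(seq: str, qual: str, min_q: int = 20,
                          offset: int = 33) -> Tuple[str, str]:
    """Remove bases with quality < min_q from both ends of a read.

    Assumption: `qual` is a Phred+33 encoded quality string of the same
    length as `seq` (the encoding is not stated in the text).
    """
    scores = [ord(c) - offset for c in qual]
    start, end = 0, len(scores)
    # Advance from the 5' end while bases are below the threshold.
    while start < end and scores[start] < min_q:
        start += 1
    # Retreat from the 3' end while bases are below the threshold.
    while end > start and scores[end - 1] < min_q:
        end -= 1
    return seq[start:end], qual[start:end]
```

Only the read ends are trimmed; low-quality bases inside the read are left in place, which matches the description "from their ends".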

The reads obtained for the new O. polyrhachis-furcata samples were used to fill gaps in the existing reference genome of this species with GapFiller (Boetzer et al., 2012), which mapped the reads onto the reference sequence over the regions flanking the gaps and identified a consensus between reads overlapping the gaps. In total, 175 of 3,915 gaps were closed (identifying around 1.6 Mb from a total gap length of 2.4 Mb in the reference genome).

2.3 Gene prediction and functional annotation

Gene prediction was based exclusively on scaffolds of more than 1 kb in length and involved a two-round approach based on MAKER (Cantarel et al., 2008). Gene sets were initially predicted with CEGMA (Parra, Bradnam, & Korf, 2007) and GeneMark-ES (Lomsadze, Ter-Hovhannisyan, Chernoff, & Borodovsky, 2005) and were then used as inputs into MAKER for the first round of prediction. The predicted proteins and transcripts identified in previous studies on O. polyrhachis-furcata (Wichadakul et al., 2015) were also used as a training set for MAKER. The predicted gene set from this first round was then fed into SNAP (Korf, 2004) and Augustus (Keller, Kollmar, Stanke, & Waack, 2011). The output of these two tools was then fed back into MAKER for a second round of prediction.

The predicted proteins were annotated with InterProScan 5 (Jones et al., 2014), which also associated the protein domains detected with sequences in the Pfam (Finn et al., 2016) and KEGG (Kanehisa, Sato, Kawashima, Furumichi, & Tanabe, 2016; Ogata et al., 1999) databases and with Gene Ontology (GO) terms. Small secreted proteins (SSPs) were identified as proteins of <300 amino acids with signal peptides but no transmembrane signature detected by SignalP (Petersen, Brunak, von Heijne, & Nielsen, 2011). Proteolytic enzymes were also annotated, by BLAST analysis of the predicted proteins against the MEROPS database (Rawlings, Barrett, & Finn, 2016) with a threshold e-value of 1e−20. Enzymes with activity against carbohydrates were annotated with dbCAN (Yin et al., 2012); only genes with Pfam domains consistent with those of the CAZYme database were retained. InterProScan recognized glycoside hydrolase domains for some genes that were not detected by the dbCAN analysis toolkit; these genes were also retained. The core genes of secondary metabolic gene clusters (SMGCs) were predicted with SMURF (Khaldi et al., 2010), based on Pfam and TIGRFAM domains and on gene positions on scaffolds. SMGCs were predicted with the fungal version of antiSMASH (Weber et al., 2015). SMGC homology across species was inferred with BiG-SCAPE (Navarro-Muñoz, J., https://git.wageningenur.nl/medema-group/BiG-SCAPE/wikis/home), which classified SMGCs into families based on Jaccard similarity indices between clusters. RepeatMasker was used to predict repetitive elements for the three species from Thailand.
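The SSP criteria given above (fewer than 300 amino acids, a signal peptide, no transmembrane signature) amount to a simple filter. The following Python sketch shows that logic only; the record type `ProteinAnnotation` and its field names are hypothetical and do not correspond to SignalP output columns.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProteinAnnotation:
    """Hypothetical per-protein summary; field names are illustrative."""
    protein_id: str
    length_aa: int
    has_signal_peptide: bool
    n_transmembrane_regions: int


def is_small_secreted(p: ProteinAnnotation, max_length: int = 300) -> bool:
    # SSP criteria from the text: <300 aa, signal peptide present,
    # no transmembrane signature.
    return (p.length_aa < max_length
            and p.has_signal_peptide
            and p.n_transmembrane_regions == 0)


def select_ssps(proteins: List[ProteinAnnotation]) -> List[str]:
    """Return the identifiers of proteins meeting all three criteria."""
    return [p.protein_id for p in proteins if is_small_secreted(p)]
```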

2.4 Orthology and phylogenomics

In addition to the predicted proteins from the de novo assembled and improved genomes of O. unilateralis species from Thailand, we also included in our analyses the predicted proteins of other ant-infecting Ophiocordyceps fungi specific to different ant species and originating from different geographic areas (de Bekker et al., 2017). We used the available genomes from two additional O. unilateralis s.l. species (O. kimflemingiae from the United States and O. camponoti-rufipedis from Brazil), from two cryptic species of O. australis s.l., from Ghana and Brazil (de Bekker et al., 2017), and from O. subramanianii s.l., also from Ghana. The predicted proteins corresponding to all these genomes were subjected to BLAST comparisons with each other, with a significance threshold e-value of 1e−5. The BLAST results were used as input for orthAgogue (Ekseth, Kuiper, & Mironov, 2014), a tool for the rapid inference of orthologous groups with the Markov clustering algorithm (MCL, Dongen, 2000). This algorithm recovers species-specific paralogous groups, with genes from a given species considered to be more closely related to each other than to any other gene in any other species. The functional annotations obtained for O. polyrhachis-furcata were transferred to the other species for gene copies in the same orthologous group. Species-specific paralogous genes were annotated as described above. We analysed GO term enrichment among species- or complex-specific paralogs, with the topGO package in R (Alexa & Rahnenfuhrer, 2016). Fisher's exact tests were used to compare gene counts between paralogous species-specific or complex-specific groups and the whole gene set for the species or complex, respectively.
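The enrichment comparison can be sketched as a one-sided Fisher's exact test, computed here as a hypergeometric upper tail. This is an illustration and not the topGO procedure: it assumes that the specific genes are a subset of the whole gene set and that enrichment is tested one-sidedly, neither of which is stated in the text. The function name and argument names are hypothetical.

```python
from math import comb


def fisher_enrichment_pvalue(hits_in_subset: int, subset_size: int,
                             hits_in_total: int, total_size: int) -> float:
    """P(X >= hits_in_subset) when subset_size genes are drawn without
    replacement from total_size genes, hits_in_total of which carry the
    annotation of interest (one-sided Fisher's exact test)."""
    max_hits = min(subset_size, hits_in_total)
    # Exact integer arithmetic; math.comb returns 0 for impossible draws.
    tail = sum(
        comb(hits_in_total, k) * comb(total_size - hits_in_total, subset_size - k)
        for k in range(hits_in_subset, max_hits + 1)
    )
    return tail / comb(total_size, subset_size)


# Example input shaped like an SSP count: 17 annotated genes among 185
# specific genes, against 373 annotated genes among 7,457 genes overall.
p_value = fisher_enrichment_pvalue(17, 185, 373, 7457)
```

Because the exact contingency-table layout used in the study is not given, the value returned here should not be expected to reproduce the published p-values.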

Sequences within all orthologous groups were aligned with MACSE (Ranwez, Harispe, Delsuc, & Douzery, 2011) for further analyses. A phylogenetic tree with bootstrap support was constructed according to the GTRCAT model under RAxML-HPC v8.1.5 (Stamatakis, 2014), exclusively with the nucleotide sequences for all the one-to-one orthologous groups in which each species was represented.
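Selecting the one-to-one orthologous groups in which each species is represented is a simple filtering step. The Python sketch below shows one way to express it; the input layout (a mapping from group identifier to a list of species and gene pairs) is an assumption and is not the orthAgogue output format.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def single_copy_groups(orthogroups: Dict[str, List[Tuple[str, str]]],
                       species: Iterable[str]) -> List[str]:
    """Return groups containing exactly one gene from every species.

    orthogroups maps a group identifier to (species, gene_id) pairs;
    this layout is hypothetical.
    """
    wanted = set(species)
    selected = []
    for group_id, members in orthogroups.items():
        counts = Counter(sp for sp, _ in members)
        # Every species present, none missing or extra, one copy each.
        if set(counts) == wanted and all(n == 1 for n in counts.values()):
            selected.append(group_id)
    return selected
```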

2.5 Detection of positive selection

Pairwise ratios of nonsynonymous-to-synonymous substitutions (dN/dS) were calculated between species (Yang & Nielsen, 2000) with the yn00 program implemented in PAML v.4.8a (Yang, 2007). The mean pairwise dN/dS ratios were calculated and used to assess variation across single-copy orthologous groups with at least four species represented. Groups with dS < 0.01, potentially resulting in inaccurate dN/dS estimates, and groups with excessively high dN/dS ratios (>10), were discarded. The functions overrepresented among the 5% of genes with the highest dN/dS ratios were inferred by an analysis of enrichment in GO terms. A mean dN/dS > 1 for a given gene indicates positive selection, whereas high dN/dS values below 1 can be due to positive selection on a small number of sites within the protein or to relaxed selection.
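The filtering and ranking described above can be sketched in a few lines of Python. The text does not say whether the dS threshold was applied to each species pair or to a group-level value; this sketch assumes that a group is discarded if any pairwise dS falls below 0.01, which also avoids division by zero. All function names are hypothetical.

```python
import math
from statistics import mean
from typing import Dict, List, Optional, Tuple


def mean_pairwise_omega(pairs: List[Tuple[float, float]],
                        min_ds: float = 0.01,
                        max_omega: float = 10.0) -> Optional[float]:
    """Mean of pairwise dN/dS for one group, or None if it is discarded.

    pairs holds (dN, dS) estimates, one per species pair. Assumption:
    the dS filter is applied to every pair.
    """
    if not pairs or any(ds < min_ds for _, ds in pairs):
        return None
    omega = mean(dn / ds for dn, ds in pairs)
    return None if omega > max_omega else omega


def top_fraction(omega_by_group: Dict[str, Optional[float]],
                 fraction: float = 0.05) -> List[str]:
    """Groups with the highest mean dN/dS, after dropping discarded ones."""
    kept = {g: w for g, w in omega_by_group.items() if w is not None}
    n_top = math.ceil(fraction * len(kept))
    return sorted(kept, key=kept.get, reverse=True)[:n_top]
```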

We also formally tested for positive selection by performing site-model likelihood ratio tests (LRTs) with the codeml program implemented in PAML v.4.8a (Yang, 2007), excluding gaps and ambiguous sites and using trees inferred under the GTRCAT model from the respective orthologous groups. codeml estimates the parameter omega (ω = dN/dS) by maximum-likelihood methods, allowing variation between sites. While the pairwise measures above only approximate synonymous and nonsynonymous rates, LRTs statistically compare two models of evolution, one in which ω ≤ 1 at all sites (null model) and another in which ω > 1 at some sites (alternative hypothesis of positive selection); LRTs thus indicate whether a model with positive selection is more likely than a model without positive selection. We compared the M7 (beta distribution of ω) and M8 (beta distribution of ω with a proportion of sites with ω > 1; Nielsen & Yang, 1998; Yang et al., 2000) models, and the M8a (similar to M8 but with a category of sites evolving with ω = 1) and M8 (Swanson, Nielsen, & Yang, 2003) models in LRTs. Only genes with a p-value below 0.05 after false-discovery rate (FDR) correction were considered significant. The M7 vs. M8 test is known to lack robustness when the probability mass is located around ω = 1, in which case this test gives a high proportion of false positives; under these conditions, the M8a vs. M8 test is preferred (Swanson et al., 2003). We ensured the robustness of our results by considering only genes in which significant evolution under positive selection was detected in both tests. We checked for enrichment in particular GO terms among the genes evolving under positive selection.
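The decision rule described above (two LRTs per gene, FDR correction, significance required in both) can be sketched in Python as follows. Several points are assumptions and not details taken from the text: the M7 vs. M8 statistic is compared with a chi-square distribution with 2 degrees of freedom, the M8a vs. M8 statistic with 1 degree of freedom, and the FDR correction is taken to be the Benjamini-Hochberg procedure. The log-likelihoods would come from codeml output; the input dictionary layout is hypothetical.

```python
import math
from typing import Dict, List


def lrt_stat(lnl_null: float, lnl_alt: float) -> float:
    """Likelihood ratio statistic 2 * (lnL_alt - lnL_null), floored at 0."""
    return max(0.0, 2.0 * (lnl_alt - lnl_null))


def chi2_sf_df2(stat: float) -> float:
    # Chi-square survival function with 2 degrees of freedom.
    return math.exp(-stat / 2.0)


def chi2_sf_df1(stat: float) -> float:
    # Chi-square survival function with 1 degree of freedom.
    return math.erfc(math.sqrt(stat / 2.0))


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant_in_both(lnl: Dict[str, Dict[str, float]],
                        alpha: float = 0.05) -> List[str]:
    """Genes significant in both M7 vs. M8 and M8a vs. M8 after FDR.

    lnl maps gene -> {"M7": lnL, "M8a": lnL, "M8": lnL} (hypothetical layout).
    """
    genes = list(lnl)
    q_m7 = benjamini_hochberg(
        [chi2_sf_df2(lrt_stat(lnl[g]["M7"], lnl[g]["M8"])) for g in genes])
    q_m8a = benjamini_hochberg(
        [chi2_sf_df1(lrt_stat(lnl[g]["M8a"], lnl[g]["M8"])) for g in genes])
    return [g for g, a, b in zip(genes, q_m7, q_m8a) if a < alpha and b < alpha]
```

Requiring both tests keeps a gene only when the signal survives the stricter M8a null, which is the robustness criterion stated in the paragraph.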

We also investigated whether genes encoding heat-labile enterotoxins evolved under positive selection specifically in ant-infecting Ophiocordyceps and not in other Hypocrealean fungi. We therefore downloaded predicted gene sequences from other Hypocrealean fungi that were annotated as putative heat-labile enterotoxins from the Ensembl Genome database (Herrero et al., 2016). Putative heat-labile enterotoxin genes were retrieved for 14 entomopathogenic fungi (one strain per species) (Supporting Information Table S1): Metarhizium anisopliae ARSEF23 (24 genes), M. acridum CQMa 102 (three genes) (Pattemore et al., 2014); M. album ARSEF1941 (12 genes), M. brunneum ARSEF3297 (32 genes), M. guizhouense ARSEF977 (32 genes), M. majus ARSEF297 (32 genes) (Hu et al., 2014); M. rileyi RCEF4871 (three genes), Isaria fumosorosea ARSEF2679 (five genes), Aschersonia aleyrodis RCEF2490 (14 genes), Cordyceps confragosa RCEF1005 (six genes), C. brongniartii RCEF3172 (30 genes) (Shang et al., 2016), Cordyceps militaris CM01 (one gene, Zheng et al., 2011), Beauveria bassiana ARSEF2860 (six genes, Xiao et al., 2012); and Ophiocordyceps sinensis Co18 (13 genes, Xia et al., 2017). We also included putative heat-labile enterotoxin sequences from two nematode-killing fungi: Purpureocillium lilacinum PLBJ-1 (two genes, Wang et al., 2016) and Pochonia chlamydosporia 170 (four genes). Orthologs between these sequences and the putative enterotoxins of the O. unilateralis species studied here were identified. The occurrence of clade-specific positive selection in O. unilateralis was assessed with branch-model LRTs in PAML (Yang, 1998; Yang & Nielsen, 1998) and with the BUSTED test, an alignment-wide test of episodic positive selection (Murrell et al., 2015). Both these tests are log-likelihood ratio tests comparing a model in which positive selection is allowed in the foreground branches (i.e., the clade of interest) to the null model in which positive selection is not allowed.
The branch model (Yang & Nielsen, 1998), as implemented in PAML, detects positive selection by allowing a candidate clade to have a dN/dS ratio higher than those of the other branches (background branches) without taking into account variation between sites or allowing variation between branches of the same category. By contrast, BUSTED is a stochastic test using information from all sites and branches; it is therefore considered to have greater statistical power (Murrell et al., 2015).

3 RESULTS

3.1 General genome features

The three reference genomes of closely related species sequenced here differed considerably in size, O. camponoti-saundersi (OCS) being the largest (49.26 Mb), followed by O. polyrhachis-furcata (OPF) (43.25 Mb) and O. camponoti-leonardi (OCL) (37.91 Mb). These differences probably partly reflect methodological differences as the OPF genome is an improved version of a genome sequenced with a different technology (454 pyrosequencing combined with Illumina mate-pair sequencing, Wichadakul et al., 2015). OCL and OCS were sequenced and assembled with the same methodology, so the observed differences probably reflect genuine differences in genome size. OCS also had more scaffolds (1700) than OCL (531). OPF had fewer scaffolds and larger contigs, due to the use of variable-size mate-pair libraries (Wichadakul et al., 2015) (Table 1). These genomes are markedly larger than those reported for O. kimflemingiae (OKi: 23.91 Mb), O. camponoti-rufipedis (OCR: 21.91 Mb), O. australis s.l. from Brazil (OAB: 23.32 Mb) and from Ghana (OAG: 22.19 Mb), and O. subramanianii s.l. (OSS: 32.31 Mb), but all these previously published genomes were more fragmented than our assemblies (Table 1).

Table 1. Genome summary statistics for the ant-infecting Ophiocordyceps species used in this study

| Species (sample name) | OPF (BCC54312) | OCL (NK511ss-8) | OCS (NK405ss-6) | OKi (SC16a) | OCR (Map-16) | OAB (Map-64) | OAG (1348a) | OSS (1346) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Genome size in Mb (scaffolds >1 kb) | 43.25 | 37.91 | 49.26 | 23.91 | 21.90 | 23.32 | 22.19 | 32.30 |
| Number of scaffolds (>1 kb) | 68 | 531 | 1,700 | 1,64 | 2,204 | 595 | 2,296 | 3,395 |
| Largest scaffold (kb) | 5,272.94 | 574.15 | 755.06 | 167.40 | 146.68 | 427.81 | 117.86 | 138.81 |
| N50 (kb) | 2,974.013 | 139.47 | 102.43 | 26.91 | 23.06 | 111.99 | 17.42 | 17.59 |
| GC content (%) | 45.03 | 45.88 | 40.13 | 55.92 | 56.1 | 53.13 | 53.48 | 60.35 |
| Number of Ns per 100 kb | 5,426.84 | 11.32 | 15.22 | 739.17 | 13.02 | 403.43 | 554.75 | 376.08 |
| Number of protein-coding genes | 8,988 | 7,059 | 6,970 | 8,629 | 7,621 | 8,174 | 7,995 | 11,275 |
| Number of exons per gene | 3.57 | 3.00 | 2.98 | 3.00 | 2.00 | 2.00 | 2.00 | 2.00 |
| Exon length (median) | 146 | 303 | 303 | 220 | 273 | 268 | 290 | 266 |
| Core eukaryotic gene mapping (CEGMA) completeness (%) | 95.56 | 95.16 | 95.97 | 99.13 | 98.69 | 99.13 | 98.25 | 98.47 |
| Repetitive content (% of the genome) | 5.23 | 5.41 | 5.65 | 6.83 | 6.59 | 2.87 | 2.45 | 4.06 |
| Number of genes with SignalP | 716 | 811 | 761 | 914 | 840 | 802 | 681 | 1,064 |
| Number of small secreted proteins (SSPs) | 270 | 252 | 239 | 373 | 802 | 776 | 648 | 1,027 |

Notes

  • OAB: Ophiocordyceps australis from Brazil; OAG: Ophiocordyceps australis from Ghana; OCL: Ophiocordyceps camponoti-leonardi; OCR: Ophiocordyceps camponoti-rufipedis; OCS: Ophiocordyceps camponoti-saundersi; OKi: Ophiocordyceps kimflemingiae; OPF: Ophiocordyceps polyrhachis-furcata; OSS: Ophiocordyceps subramanianii. (a) Improved from Wichadakul et al. (2015). (b) Taken from de Bekker et al. (2017).
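For readers unfamiliar with the N50 statistic reported in Table 1, the following Python sketch computes it under the usual definition: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. The text does not say which tool produced the reported values, so this is only an illustration.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Scaffold N50 under the usual definition (see lead-in)."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop once the longest scaffolds cover at least half the assembly.
        if running * 2 >= total:
            return length
    raise ValueError("no scaffold lengths given")
```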

Despite the differences in genome size, the numbers of predicted genes were of the same order of magnitude across species (Table 1), although the number was largest for OSS. Among the three species from Thailand, OPF had the largest number of predicted genes, probably because the protein and transcript training set used for prediction came from this species. The number of SSPs was similar between the three Thai species. The number of genes with assigned Pfam domains or InterPro classification and the completeness of the predicted gene sets, as assessed by core eukaryotic genes mapping (CEGMA), were also very similar in the three species (~95%: Table 1), but were lower than those for species from the New World (~99%).

3.2 Orthology and phylogenomics

The genomes used in this study were sequenced from individuals belonging to one of the three species complexes: O. unilateralis s.l., O. australis s.l. and O. subramanianii s.l. Most of the genes were common to all three complexes (Figure 1a): 8,554 orthologous groups were retrieved, 5,718 of which were common to all complexes. For orthologous groups present in only one of the three complexes (Supporting Information Table S2), pathogenesis (GO:0009405) was the function displaying the most significant enrichment in all complexes (Bonferroni-corrected p-values: 2e−10 for O. unilateralis s.l., 0.016 for O. australis s.l., 3.4e−5 for O. subramanianii s.l.), mostly due to the presence of genes encoding putative heat-labile enterotoxins in these complex-specific genes. Complex-specific genes were also found to be enriched in interspecies interactions and multi-organism process functions.

Figure 1. Inference of orthologous groups: Venn diagrams showing the number of orthologous groups common to and specific to species complexes and species (a) between the three ant-infecting Ophiocordyceps species complexes used in this study; (b) between the species in the O. unilateralis complex (OPF = O. polyrhachis-furcata, OCL = O. camponoti-leonardi, OCS = O. camponoti-saundersi, OKi = O. kimflemingiae, OCR = O. camponoti-rufipedis); (c) between the species in the O. australis complex (OAG = O. australis from Ghana, OAB = O. australis from Brazil)

Within each species complex, most of the genes were common to several species (Figure 1b,c). The function pathogenesis was found to be overrepresented among species-specific genes (Supporting Information Tables S3 and S4), due to the presence of genes encoding heat-labile enterotoxins and SSPs (Tables 2 and 3). In particular, we detected an overrepresentation of SSPs among the genes unique to O. kimflemingiae relative to the O. unilateralis s.l. complex (p-value = 0.003), and among the genes unique to O. australis from Brazil relative to the O. australis s.l. complex (p-value = 0.001). None of these species-specific SSPs had a predicted function, suggesting an expansion of rapidly evolving families of effectors (Kim et al., 2016).

Table 2. Characteristics of orthologous groups specific to different species among the Ophiocordyceps unilateralis sensu lato complex

| Species | Number of specific orthologous groups | Number of predicted genes (number of genes with Pfam domains/InterPro classification) | Enriched functions (GO term; FDR p-value) | Species-specific SSPs/genes, whole-genome SSPs/genes (p-value for enrichment analysis of SSPs) |
| --- | --- | --- | --- | --- |
| O. polyrhachis-furcata | 61 | 519 (25) | Pathogenesis (GO:0009405; 0.0054); Interspecies interaction between organisms (GO:0044419; 0.0021); Multiorganism process (GO:0051704; 0.0021) | 18/519, 270/7678 (0.60) |
| O. camponoti-leonardi | 9 | 9 (8) | - | - |
| O. camponoti-saundersi | 7 | 9 (6) | - | - |
| O. kimflemingiae | 169 | 185 (83) | - | 17/185, 373/7457 (0.003) |
| O. camponoti-rufipedis | 103 | 122 (49) | Pathogenesis (GO:0009405; 0.023); Interspecies interaction between organisms (GO:0044419; 0.023); Multiorganism process (GO:0051704; 0.023) | 17/122, 802/6868 (0.396) |

Note: SSPs, small secreted proteins.

Table 3. Characteristics of orthologous groups specific to different species among the Ophiocordyceps australis sensu lato complex

| Species | Number of specific orthologous groups | Number of predicted genes (number of genes with Pfam domains/InterPro classification) | Enriched functions (GO term; FDR p-value) | Species-specific SSPs/genes, whole-genome SSPs/genes (p-value for enrichment analysis of SSPs) |
| --- | --- | --- | --- | --- |
| O. australis Ghana | 150 | 173 (88) | - | 14/173, 648/7414 (0.892) |
| O. australis Brazil | 339 | 356 (201) | Pathogenesis (GO:0009405; 1.07e−5); Interspecies interaction between organisms (GO:0044419; 1.07e−5); Multiorganism process (GO:0051704; 1.07e−5) | 66/356, 776/7558 (0.001) |

Note: SSPs, small secreted proteins.
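
The SSP enrichment p-values in Tables 2 and 3 compare the share of SSPs among species-specific genes with the share of SSPs in the whole genome. The test used for this comparison is not named in this section, so the following is only a minimal sketch that assumes a one-sided hypergeometric (Fisher-type) test. The counts are those reported in Table 2 for O. kimflemingiae and O. polyrhachis-furcata; the function name is illustrative, and the resulting p-values need not match the published ones exactly.

```python
from math import comb

def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn without replacement from N genes, K of which are SSPs."""
    upper = min(n, K)
    # Sum exact integer counts first, then divide once to limit rounding error
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return numerator / comb(N, n)

# Counts from Table 2:
# (species-specific SSPs, species-specific genes, whole-genome SSPs, whole-genome genes)
counts = {
    "O. kimflemingiae": (17, 185, 373, 7457),
    "O. polyrhachis-furcata": (18, 519, 270, 7678),
}

p_values = {species: hypergeom_upper_tail(*c) for species, c in counts.items()}
for species, p in p_values.items():
    print(f"{species}: P(X >= observed) = {p:.3g}")
```

Under this assumption, a small tail probability indicates more SSPs among the species-specific genes than expected from the genome-wide proportion.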

There were 4,651 orthologous groups common to all eight genomes. We used a subset of 4,014 single-copy orthologous groups common to all species to construct a phylogenetic tree (Figure 2). This tree recovered the expected relationships between the sibling species from Thailand, with O. polyrhachis-furcata being the most closely related to O. camponoti-leonardi (Kobmoo et al., 2012, 2015); the species from the Americas, O. kimflemingiae and O. camponoti-rufipedis, clustered together but were separate from those from Thailand, corresponding to the separation between the Old and New Worlds observed in a previous study (Evans, Araújo, Halfeld, & Hughes, 2018). The two O. australis s.l. species were grouped together and formed, with O. subramanianii, an outgroup to the O. unilateralis complex.

Figure 2. The best maximum-likelihood tree based on 4,014 single-copy orthologous groups with bootstrap supports. The horizontal scale bar represents the branch length based on substitution rates

3.3 Variation of dN/dS across genomes and putative functions

The median pairwise dN/dS ratio was 0.081, indicating that most single-copy orthologs evolved under strong purifying selection (Figure 3a). No orthologous group had dN/dS > 1 (Supporting Information Table S5). We investigated the putative functions of the 5% of genes with the highest dN/dS values (297 genes) (Supporting Information Table S5), even though these ratios were below 1, as high values could be indicative of positive selection at a small number of sites in the protein, although relaxed selection cannot be excluded for dN/dS < 1. Fisher's exact tests for enrichment in GO term annotation among the 5% of genes with the highest dN/dS values (i.e., dN/dS > 0.253) relative to all single-copy orthologous groups indicated that the functions pathogenesis, interspecies interaction between organisms and multi-organism process (respective Bonferroni-corrected p-values = 3.4e-17, 2.2e-18 and 5.5e-180.005) were the most significantly overrepresented. The genes annotated with the pathogenesis (GO:0009405) GO term and with a dN/dS ratio in the top 5% all encoded putative heat-labile enterotoxins. Twenty one-to-one orthologous groups common to at least four species were annotated as heat-labile enterotoxins, thirteen of which had dN/dS values in the top 5%. Enterotoxins had a significantly higher mean dN/dS than the genome as a whole (t test: Bonferroni-corrected p-value = 5.19e−08) (Figure 3b). The GO terms relating to glycoside hydrolases (GH families: chitinases GH18, trehalases GH37), proteases and lipases, essential to the primary functions of entomopathogenic fungi, such as nutrient acquisition and host cell penetration, were not overrepresented among the 5% of genes with the highest dN/dS values. 
The mean dN/dS values for these gene families were not significantly higher than that for the whole genome (t tests; chitinases: Bonferroni-corrected p-value = 0.926, lipases: Bonferroni-corrected p-value = 1.00, proteases: Bonferroni-corrected p-value = 1.00, trehalases: Bonferroni-corrected p-value = 1.00) (Figure 3b). None of the core genes in secondary metabolic gene clusters was found among the 5% of genes with the highest dN/dS values, and their mean dN/dS value was not significantly higher than that of other genes (t test: Bonferroni-corrected p-value = 0.885) (Figure 3b). The genes encoding putative SSPs had a significantly higher dN/dS ratio than non-SSP genes (t test: p-value = 2.2e-16) (Figure 3b). This suggests that a higher proportion of genes may evolve under positive selection among SSP-encoding genes than among other genes.
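
The selection of the 5% of genes with the highest dN/dS values can be sketched as follows. This is an illustrative outline only: the dN/dS values are hypothetical, the function name is an assumption, and the count is rounded down, which is consistent with 297 genes being 5% of the 5,950 single-copy orthologous groups reported in Section 3.4.

```python
import statistics

def top_fraction(dnds_by_group, fraction=0.05):
    """Return the groups with the highest dN/dS values and the lowest value among them."""
    n_top = int(len(dnds_by_group) * fraction)  # rounded down, e.g. int(5950 * 0.05) == 297
    ranked = sorted(dnds_by_group.items(), key=lambda kv: kv[1], reverse=True)
    top = ranked[:n_top]
    cutoff = top[-1][1] if top else None
    return [group for group, _ in top], cutoff

# Hypothetical dN/dS values for 40 orthologous groups (not data from the study)
toy = {f"group{i:02d}": 0.01 * (i + 1) for i in range(40)}

top_groups, cutoff = top_fraction(toy)
median_dnds = statistics.median(toy.values())
print("top groups:", top_groups)
print("lowest dN/dS in the top set:", cutoff)
print("median dN/dS:", median_dnds)
```

Genes selected in this way have high dN/dS relative to the rest of the genome, but a ratio below 1 does not by itself distinguish positive selection at a few sites from relaxed constraint, which is why formal tests are needed.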

Figure 3. Distribution of pairwise nonsynonymous-to-synonymous substitution ratios (dN/dS) for the genes in all single-copy orthologous groups with at least four species of ant-infecting Ophiocordyceps represented. (a) Whole-genome dN/dS distribution; (b) boxplots of pairwise dN/dS values for the whole genome (small secreted protein-coding genes or SSPs, in blue, vs. non-SSPs, in red) and for different categories of genes suspected a priori to be involved in pathogenesis and virulence, that is, with the putative functions of enterotoxins, core proteins of secondary metabolism (SM), lipases, proteases (including subtilisin-like, trypsin and aspartyl proteases) and trehalases. The dotted line represents the mean dN/dS value for the whole genome (0.145)

3.4 Formal tests for positive selection

Likelihood ratio tests (LRTs) comparing models with or without positive selection were performed as formal genomewide tests for positive selection. Of the 5,950 single-copy orthologous groups common to at least four species, 759 and 244 (12.76% and 4.10%) were found to be evolving under positive selection in comparisons of the models M7 vs. M8 and M8a vs. M8, respectively (Supporting Information Table S6), with 144 genes found to be evolving under positive selection in both tests. The functions of pathogenesis and toxin activity were significantly overrepresented among the annotations for these 144 genes (Table 4). This enrichment in the pathogenesis and toxin activity functions was due entirely to enterotoxin genes. Five to seven (25%–35%) putative heat-labile enterotoxin-coding genes were identified as evolving under positive selection, depending on the evolution model tested (Figure 4 and Supporting Information Table S7). These genes were also predicted to have a function in the extracellular region (GO:0005576), and most were predicted to have signal peptides. Enrichment was also observed for functions relating to kinases (e.g., ATP binding, kinase activity, transferase activities: Table 4); significant results were obtained for 15 to 31 (8.1%–16.8%) kinases in LRTs, and eight kinases gave significant results in both tests (Figure 4 and Supporting Information Table S7). Kinases catalyse the transfer of a phosphate group from a high-energy molecule (ATP) to a substrate and are involved in various cellular processes. The proportion of kinases evolving under positive selection was lower than that of heat-labile enterotoxins (Figure 4), but the numbers of kinases and heat-labile enterotoxins evolving under positive selection were similar, and these two functions were overrepresented among the genes evolving under positive selection. Most of these kinases were annotated as protein kinases, histidine kinases and phosphatidylinositol 3- and 4-kinases. These families of kinases are well known to be involved in cell signalling, essential for pathogen growth and survival and, thus, for pathogenesis and virulence (Lee et al., 2016).
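
The way an LRT turns two log-likelihoods into a p-value can be sketched as follows. The degrees of freedom used are not stated in this section; the sketch assumes the conventional choices, namely a chi-square distribution with 2 degrees of freedom for M7 vs. M8 and a 50:50 mixture of a point mass at zero and a chi-square distribution with 1 degree of freedom for M8a vs. M8. The log-likelihood values and function names are hypothetical.

```python
from math import erfc, exp, sqrt

def lrt_statistic(lnl_null, lnl_alt):
    """Twice the log-likelihood difference, floored at zero."""
    return max(0.0, 2.0 * (lnl_alt - lnl_null))

def p_m7_vs_m8(stat):
    """Survival function of a chi-square distribution with 2 degrees of freedom."""
    return exp(-stat / 2.0)

def p_m8a_vs_m8(stat):
    """Assumed null: 50:50 mixture of a point mass at 0 and chi-square with 1 degree of freedom."""
    if stat <= 0.0:
        return 1.0
    return 0.5 * erfc(sqrt(stat / 2.0))

# Hypothetical log-likelihoods for one orthologous group (not values from the study)
lnl = {"M7": -4321.7, "M8a": -4319.9, "M8": -4315.2}

stat_m7 = lrt_statistic(lnl["M7"], lnl["M8"])
stat_m8a = lrt_statistic(lnl["M8a"], lnl["M8"])
print(f"M7 vs. M8:  2*dlnL = {stat_m7:.2f}, p = {p_m7_vs_m8(stat_m7):.3g}")
print(f"M8a vs. M8: 2*dlnL = {stat_m8a:.2f}, p = {p_m8a_vs_m8(stat_m8a):.3g}")
```

In a genome scan, such raw p-values would then be corrected for multiple testing across all orthologous groups before calling a gene significant.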

Table 4. Results of the gene ontology (GO) term enrichment analyses for the genes with significant likelihood ratio test (LRT) results for positive selection in both the M7 vs. M8 (Nielsen & Yang, 1998) and M8a vs. M8 (Swanson et al., 2003) comparisons
| GO category | GO ID | Term | p-value |
| --- | --- | --- | --- |
| Biological process | GO:0009405 | Pathogenesis | 0.033 |
| Biological process | GO:0044419 | Interspecies interaction between organisms | 0.033 |
| Biological process | GO:0051704 | Multiorganism process | 0.033 |
| Molecular function | GO:0090729 | Toxin activity | 5.7e-4 |
| Molecular function | GO:0005524 | ATP binding | 0.013 |
| Molecular function | GO:0032559 | Adenyl ribonucleotide binding | 0.013 |
| Molecular function | GO:0030554 | Adenyl nucleotide binding | 0.013 |
| Molecular function | GO:0016301 | Kinase activity | 0.013 |
| Molecular function | GO:0000166 | Nucleotide binding | 0.013 |
| Molecular function | GO:1901265 | Nucleoside phosphate binding | 0.013 |
| Molecular function | GO:0016772 | Transferase activity, transferring phosphorus-containing group | 0.013 |
| Molecular function | GO:0036094 | Small molecule binding | 0.013 |
| Molecular function | GO:0016773 | Phosphotransferase activity, alcohol group as an acceptor | 0.013 |
| Cellular compartment | GO:0005615 | Extracellular space | 3.7e-4 |
| Cellular compartment | GO:0044421 | Extracellular region part | 5.3e-4 |
| Cellular compartment | GO:0005576 | Extracellular region | 5.3e-4 |

Figure 4. Percentages of genes in various functional categories for which likelihood ratio tests (LRTs) for positive selection (M7 vs. M8 and M8a vs. M8) yielded significant results (false-discovery rate-corrected p-value < 0.05). Proteases = subtilisin, trypsin and aspartyl proteases, SM = core genes of secondary metabolites. The total number of genes in each category is indicated above the bars
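
The false-discovery rate correction mentioned in the caption can be illustrated with a short sketch. The correction procedure is not named in this section; the example assumes the Benjamini-Hochberg procedure, and the raw p-values and function name are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw LRT p-values for four orthologous groups (not values from the study)
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [q < 0.05 for q in adj]
print(adj, significant)
```

The percentage plotted for a category would then be the number of genes flagged as significant divided by the total number of genes in that category.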

The functions relating to hydrolytic enzymes important for pathogenesis (glycoside hydrolases, lipases, proteases) were not overrepresented among the genes evolving under positive selection. Indeed, the proportions of genes in these families found to be under positive selection were markedly smaller than those for heat-labile enterotoxins (Figure 4). Nevertheless, several of the genes from these gene families were found to evolve under positive selection in model tests (Supporting Information Table S7) and can be considered good candidates for involvement in co-evolution and host specificity. One to three of the 11 chitinases (GH18) presented significant footprints of positive selection depending on the evolution model considered, and significant p-values were obtained in all tests for one of these enzymes. Chitinases are involved in the degradation of the insect cuticle, a major component of the insect exoskeleton, and in the degradation and remodelling of fungal cell walls (Adams, 2004; Langner & Göhre, 2016). Other GH families converging to various functions, such as cellulase, glucanase, glucosidase and galactosidase (e.g., GH5, GH16, GH47, GH76), also included a few genes displaying significant tests of positive selection (Supporting Information Table S7). Neither of the two trehalases (GH37), which are thought to play important roles in nutrient acquisition within the host body, displayed significant signs of positive selection.

Zero to four of the nine subtilisin-like (MEROPS families S08 and S53) and trypsin-like proteases (MEROPS family S01), which are considered to act as cuticle-degrading proteases, presented significant footprints of positive selection, depending on the evolution model considered. Zero to one of 16 putative aspartyl proteases (MEROPS family A01) was found to evolve under positive selection, depending on the model. However, none of these proteases yielded significant results in both tests (Figure 4). Two to six of the 39 putative lipases yielded significant p-values in positive selection tests, and only one yielded significant p-values in both tests (Figure 4; Supporting Information Table S7).

One to four of the seven core genes of secondary metabolites displayed significant signatures of positive selection, depending on the evolution model considered (Supporting Information Table S7). The only gene to yield significant p-values in both tests (orthologous group ORTHAg2248, Supporting Information Table S7) encoded a polyketide synthase (PKS)-like protein with a beta-ketoacyl synthase domain. Beta-ketoacyl synthase is involved in fatty acid biosynthesis and has been shown to be involved in the production of polyketide antibiotics in fungi (Beck, Ripka, Siegner, Schiltz, & Schweizer, 1990). The gene encoding this enzyme is part of a secondary metabolic gene cluster that is highly syntenic across the species of the O. unilateralis complex, but located in different clusters in O. australis and in O. subramanianii (Figure 5).

Figure 5. Homology of putative secondary metabolic gene clusters (SMGCs) with the core gene under positive selection according to log-likelihood ratio tests (M7 vs. M8 models and M8a vs. M8 models). The dashed lines indicate orthology between the putative polyketide synthase (PKS)-like core genes. The phylogenetic tree was inferred from Jaccard similarity indices between alignments of common gene domains within families. OCS = Ophiocordyceps camponoti-saundersi, OCL = O. camponoti-leonardi, OPF = O. polyrhachis-furcata, OKi = O. kimflemingiae, OCR = O. camponoti-rufipedis, OAG = O. australis from Ghana, OAB = O. australis from Brazil, OSS = O. subramanianii

We also investigated whether the genes previously identified as encoding possible “neuromodulators” (de Bekker et al., 2017), based on their overexpression during the manipulation of ant behaviour, showed signs of positive selection. In total, 12 to 41 of these genes yielded significant results in tests for positive selection (Supporting Information Table S8). Five genes yielded significant results in both tests. These genes encoded a short-chain dehydrogenase, a DNA mismatch repair protein (MutC), a DNA replication factor, an ATPase and a protein with no functional annotation. Seven other genes yielded results only in the M8a vs. M8 test, which is more robust than the M7 vs. M8 test. These seven genes included oxidoreductases clearly involved in metabolic reactions: a protein with a ferric-reductase transmembrane-like domain, a flavodoxin oxidoreductase and an oxidoreductase binding to a molybdopterin cofactor with a cytochrome b5-like haem/steroid-binding domain. The first of these genes was shown to be associated with iron uptake in yeast (Roman, Dancis, Anderson, & Klausner, 1993), whereas the product of the second mediates iron-free electron transfer. The third of these genes may encode a nitrate reductase or sulphite oxidase, both of which are involved in nitrogen assimilation. These putative neuromodulators thus seem to be involved in host resource utilization. The neurological disorder displayed by zombie ants infected with Ophiocordyceps may result from the pathogen outcompeting the host for iron and nitrogen.

3.5 Positive selection of heat-labile enterotoxin genes specific to the ant-manipulating O. unilateralis species complex

The above results and those of previous studies (de Bekker et al., 2015; Wichadakul et al., 2015) suggest that heat-labile enterotoxin genes are candidate genes for host-specific adaptation. We therefore investigated whether the positive selection detected above was specific to the ant-infecting Ophiocordyceps species or general to Hypocrealean entomopathogenic and nematode-killing fungi. Thirty-six orthologous groups of heat-labile enterotoxin genes were inferred for a group of 16 Hypocrealean entomopathogenic and nematode-killing fungi in addition to our eight focal species (Supporting Information Table S1); 22 of these orthologous groups included at least one sequence from the ant-infecting Ophiocordyceps, and 10 (42%) of these groups included only sequences from the ant-infecting Ophiocordyceps species. We further analysed the only group (ORTHAgEnt13) common to at least four of the ant-infecting Ophiocordyceps species considered and sequences recovered from other species from Hypocreales, for which both site-model LRTs for positive selection were significant. This group included five sequences each from an O. unilateralis species. In a maximum-likelihood tree, all the O. unilateralis sequences were located on the same branch (Figure 6). The PAML branch-model LRTs indicated that this gene was evolving under positive selection specifically in the O. unilateralis clade (p-values < 0.001). The branch at the base and the internal branches of the O. unilateralis clade therefore had significantly higher dN/dS ratios than the other branches (Figure 6). The BUSTED test, which is similar to PAML branch tests but considered more powerful, also gave a significant result (p-value = 6.16e-14).

Figure 6. The best RAxML tree based on the GTRCAT model for the orthologous group ORTHAgEnt13 of putative heat-labile enterotoxins in entomopathogenic and nematode-killing fungi of the order Hypocreales. The numbers above the nodes are bootstrap supports. The numbers below the branches are the ratios of nonsynonymous-to-synonymous substitution rates (dN/dS)

4 DISCUSSION

4.1 Enterotoxin genes as major candidate genes underlying host adaptation

Comparative genomic studies of closely related species of fungal pathogens have shown that, in general, genes involved in adaptation, particularly those involved in virulence and pathogenicity, are species-specific, highly divergent and/or under diversifying selection, as a result of the arms race between host and pathogen or specialization on new hosts (Ghanbarnia et al., 2015; Huang, Si, Deng, Li, & Yang, 2014; Plissonneau et al., 2017; Stukenbrock et al., 2011). We therefore used an evolutionary comparative genomic approach for identifying genes underlying host adaptation in ant-infecting Ophiocordyceps from three species complexes (O. unilateralis s.l., O. australis s.l. and O. subramanianii s.l.). Genome comparisons showed that species complex-specific genes were enriched in genes associated with the function pathogenesis, which included genes encoding heat-labile enterotoxins. The species-specific genes were also enriched in this function. However, most species-specific genes lacked functional annotation, perhaps because their rapid evolution as part of the arms race between pathogen and host has made homology undetectable. Most of the small secreted proteins (SSPs), in particular, lacked predicted functions, but these proteins were particularly abundant among the species-specific genes. SSPs may act as effectors in pathogenicity, but the validation of their function requires further studies.

Heat-labile enterotoxin genes were also overrepresented in the orthologous groups with the highest rates of amino acid differences between species, suggesting the occurrence of diversifying selection, which was confirmed by formal tests comparing models with and without positive selection. Furthermore, in the cases in which orthologs of enterotoxin genes were found in other entomopathogenic fungi, we inferred that positive selection was specific to the ant-infecting Ophiocordyceps clade. These findings support the view that heat-labile enterotoxins are effectors involved in host adaptation, as previously suggested based on observations of enterotoxin overexpression during manipulation of the behaviour of the diseased ants (de Bekker et al., 2015) and of the species-specific nature of behavioural manipulation (de Bekker et al., 2014; Sakolrak et al., 2018). The proximal mechanisms via which enterotoxins act during infection and the manipulation of host behaviour remain unclear, but it has been suggested that these molecules interfere with the chemical communication of social insects; bacterial enterotoxins have been shown to affect pheromone production in boll weevils (Wiygul & Sikorowski, 1986, 1991). Alterations to chemical communication may contribute to the modification of behaviour in infected ant hosts.

4.2 Minor role of the cuticle in exerting selective pressure leading to diversifying selection

Hypocrealean entomopathogenic fungi are known to infect their insect hosts by penetrating the cuticle from the outside (Boomsma et al., 2014). An array of hydrolytic enzymes, including chitinases, lipases and proteases, is required to break through the insect cuticle (Ortiz-Urquiza & Keyhani, 2013). Chitins are major constituents not only of insect cuticles, but also of fungal cell walls (Langner & Göhre, 2016) while lipids are a major component of the epicuticle waxy layer (Jarrold, Moore, Potter, & Charnley, 2007; Pedrini, Ortiz-Urquiza, Huarte-Bonnet, Zhang, & Keyhani, 2013). Proteases are important for the penetration of the cuticle by fungi and have been shown to be virulence factors for the infection of insect hosts (Shah, Wang, & Butt, 2005). Subtilisin proteases have been shown to play a particularly important role in regulating insect host specificity through the differential expressions of specific isoforms (Bye & Charnley, 2008; Mondal, Baksi, Koris, & Vatai, 2016). We therefore hypothesized that the genes encoding chitinases, proteases and lipases might have evolved under diversifying selection. However, we found footprints of positive selection for only a few putative genes encoding these enzymes in the ant-infecting Ophiocordyceps species. This challenges the widely accepted view that the insect cuticle, as a major barrier to infections, exerts a strong selective pressure on entomopathogenic fungi, leading to different host ranges (Boomsma et al., 2014; Ortiz-Urquiza & Keyhani, 2013; Wang, Fang, Wang, & St. Leger, 2011; Wang & St. Leger, 2005). Nevertheless, as the fungi in the three ant-infecting complexes considered here are all pathogens of formicine and ponerine ants, our findings do not rule out diversifying selection occurring across larger phylogenetic scales. These enzymes may be highly conserved among pathogens of formicine and ponerine ants, providing a common arsenal for attacking taxonomically related ants. 
There may also be constraints in the host or the fungus preventing rapid co-evolution through changes to these molecules.

4.3 Utilization of host resources

Once inside the host, the pathogen requires other hydrolases for carbon assimilation. Efficient nutrient uptake from the host allows optimal proliferation of the fungus within its host and ultimately leads to insect death (Luo, Qin, Pei, & Keyhani, 2014). It has, therefore, been suggested that host resource utilization is crucial for host specificity (Gillespie, Bailey, Cobb, & Vilcinskas, 2000). Trehalases, in particular, probably play an important role in this respect. Indeed, the fly pathogen Entomophthora muscae (Entomophthorales) carries more trehalase-encoding genes in its genome than its close relative, the generalist Conidiobolus coronatus, which is a nonobligate pathogen (De Fine Licht, Jensen, & Eilenberg, 2017). We identified two trehalases with no positive selection signature as conserved across all species. Other glycoside hydrolases and lipases may be involved in breaking down primary carbon sources (Ortiz-Urquiza & Keyhani, 2013; Schrank & Vainstein, 2010). However, the evidence for positive selection is less robust for these enzymes. Thus, diversifying selection in ant-pathogenic Ophiocordyceps fungi probably acts less strongly on the function of carbon assimilation than on enterotoxins. Again, there may be constraints preventing the rapid evolution of host cuticle or fungal hydrolase and lipase functions.

Nitrogen also plays a key role in the proliferation of entomopathogenic fungi (Luo et al., 2014). However, our results suggest that initial nutrient acquisition via proteinases is not under strong diversifying selection. Genes evolving under positive selection were not enriched in functions related to the assimilation of nitrogen or amino acid synthesis.

In addition to carbon and nitrogen, iron uptake is also crucial for pathogen success (Bairwa, Hee Jung, & Kronstad, 2017; Haas, 2012; Sutak, Lesuisse, Tacherzy, & Richardson, 2008). The candidate neuromodulator genes found to be under positive selection included iron-related oxidoreductases. In particular, one of the proteins identified had a ferric-reductase transmembrane domain, and another was a flavodoxin oxidoreductase. Proteins with ferric-reductase transmembrane domains have been shown to be crucial for ferric iron uptake in yeast (Roman et al., 1993), whereas flavodoxin is an iron-free electron-transfer protein facilitating a range of metabolic reactions in the absence of iron. Specialist entomopathogens kill their hosts more slowly than generalists (Boomsma et al., 2014). In such a context, ant-specific Ophiocordyceps might be expected to have developed strategies for hijacking resources from the host. The efficient acquisition of iron and an ability to divert its use may be the key to outcompeting the host during infection.

4.4 The role of kinases and signal transduction

Kinase enzymes are widely recognized as participating in various cellular processes, crucial to growth and survival (Lee et al., 2016). The genes under positive selection in the ant-infecting Ophiocordyceps were enriched in kinase-related functions. Most were clearly related to signal transduction, which plays a crucial role in interactions between hosts and pathogens (Bahia, Satoskar, & Dussurget, 2018). Pathogens sense and respond to environmental stimuli, including the expression of virulence factor regulatory systems, in the hostile conditions of the host immune system. As extremely specialized pathogens, ant-infecting Ophiocordyceps fungi must fine-tune their responses in the host body.

4.5 Importance of lipid metabolism

Many entomopathogenic fungi are also thought to deploy a plethora of metabolites and toxins within the bodies of their hosts (Schrank & Vainstein, 2010; Singh, Son, & Lee, 2016). The nature of these molecules probably differs between groups of insect-pathogenic fungi and remains to be precisely determined, but the principal molecules include polyketides (PKs) and nonribosomal peptides (NRPs) (Gallo, Ferrara, & Perrone, 2013). We detected significant footprints of positive selection in some of the core genes of secondary metabolites. The most notable case concerned a PKS-like function involved in lipid biosynthesis. Lipids have been shown to be involved in pathophysiological processes in pathogenic fungi, but the role of the lipid signalling network in host-specific pathogenicity remains to be determined (Singh & Poeta, 2011). Kinases are also known to participate in lipid signalling pathways, and the kinases with significant footprints of positive selection identified included phosphatidylinositol 3- and 4-kinases. The phosphorylated form of phosphatidylinositol plays an important role in lipid and cell signalling (Funaki, Katagiri, Inukai, Kikuchi, & Asano, 2000). Lipid metabolism thus seems to be subject to diversifying selection, although to a much lesser extent than heat-labile enterotoxins.

5 CONCLUSIONS

We focused on three ant-infecting species complexes from the genus Ophiocordyceps, including closely related species. Complex- and species-specific genes were found to be enriched in genes for heat-labile enterotoxins, and this gene family was found to be evolving under positive selection to a greater extent than other candidate gene families. Our results thus suggest that the specific adaptation and co-evolution of specialist species in the ant-infecting Ophiocordyceps fungi to their hosts is dependent on selection occurring within the body of the host rather than during cuticle penetration. By contrast, we detected little positive selection on lipases, proteases or chitinases, although we did identify a few interesting candidate genes from these groups. Comparative genomic studies of entomopathogenic fungi remain scarce, and the few studies that have been performed have focused exclusively on species of agricultural or medical interest. The findings of this study improve our understanding of the mechanisms of fungal adaptation to insect hosts, and future studies on fungal pathogens associated with other groups of insects should provide more general insight into the adaptation of entomopathogenic fungi and a more documented comparison with the mechanisms of adaptation in fungal pathogens of plants. The insect innate immune response seems to be much more specific than that in plants, suggesting a certain level of acquired immune response (Cooper & Eleftherianos, 2017). Fungal pathogens of insects would be expected to display extensive expansions and contractions of gene families, as observed in plant pathogens, but the target functions may be different. Additional insight gleaned from entomopathogenic fungi would help to improve our general understanding of the mechanisms of adaptive evolution in eukaryotes.

ACKNOWLEDGEMENTS

This work was supported by the Marie Sklodowska Curie Action No 655278 and Thailand Research Fund (TRF) Young Scientist Grant (TRG5780162) to NK. We would like to thank Alodie Snirc for advice concerning DNA extraction, Antoine Branca for suggestions about bioinformatic protocols, Rayan Chikhi for training in genome assembly, Jérome Collemare and Jorge C. Navarro-Muñoz for their guidance on using antiSMASH and BiG-SCAPE, and Suchada Mongkholsamrit and Kanoksri Tasanathai for the organization of sampling trips. We also would like to sincerely thank Clarissa de Bekker and David P. Hughes for kindly sharing their data on the candidate neuromodulators.

AUTHOR CONTRIBUTION

N.K., J.J.L. and T.G. designed the study. N.K. and N.A. conducted sampling and DNA extraction. N.K., D.W. and R.C.R.D.L.V. analysed sequencing and comparative genomic data. N.K. and T.G. wrote the manuscript, with contributions from all the authors.

DATA ACCESSIBILITY

The de novo assemblies of Ophiocordyceps camponoti-leonardi (NCBI Biosample SAMN07662903) and O. camponoti-saundersi (NCBI Biosample SAMN07662932) have been deposited with the NCBI as whole-genome shotgun (WGS) projects with Accession nos. PDHP00000000 and PDHQ00000000, respectively. The O. polyrhachis-furcata genome was updated based on the improved assembly from this study (LKCN00000000.2).
