Woolly apple aphid (WAA, Eriosoma lanigerum Hausmann) (Hemiptera: Aphididae) is a major pest of apple trees (Malus domestica, order Rosales) and is critical to the economics of the apple industry in most parts of the world. Here, we generated a chromosome-level genome assembly of WAA, representing the first genome sequence from the aphid subfamily Eriosomatinae, using a combination of 10X Genomics linked-reads and in vivo Hi-C data. The final genome assembly is 327 Mb, with 91% of the assembled sequences anchored into six chromosomes. The contig and scaffold N50 values are 158 kb and 71 Mb, respectively, and we predicted a total of 28,186 protein-coding genes. The assembly is highly complete, including 97% of conserved arthropod single-copy orthologues based on Benchmarking Universal Single-Copy Orthologs (busco) analysis. Phylogenomic analysis of WAA and nine previously published aphid genomes, spanning four aphid tribes and three subfamilies, reveals that the tribe Eriosomatini (represented by WAA) is recovered as a sister group to Aphidini + Macrosiphini (subfamily Aphidinae). We identified syntenic blocks of genes between our WAA assembly and the genomes of other aphid species and find that two WAA chromosomes (EL5 and EL6) map to the conserved Macrosiphini and Aphidini X chromosome. Our high-quality WAA genome assembly and annotation provide a valuable resource for research in a broad range of areas such as comparative and population genomics, insect–plant interactions and pest resistance management.

1 INTRODUCTION

There are ~5,000 known species of aphid (Hemiptera: Aphididae), and ~100 of these are of significant agricultural economic importance (Blackman & Eastop, 2017). While aphid genomics research has mostly focused on the subfamily Aphidinae (International Aphid Genomics Consortium, 2010; Li et al., 2019; Mathers, 2020; Mathers et al., 2017; Mathers, Mugford, et al., 2020; Mathers, Wouters, et al., 2020; Nicholson et al., 2015; Thorpe et al., 2018; Wenger et al., 2016), a large and widespread group including many important pests (Blackman & Eastop, 2000), investigation of genome evolution across the wider diversity of aphids has been limited (but see Julca et al., 2020). This report represents the first complete genome sequence for a species from the subfamily Eriosomatinae, which includes three potentially polyphyletic tribes (Eriosomatini, Fordini and Pemphigini), all of which are distantly related to members of Aphidinae (Li et al., 2014; Nováková et al., 2013; Ortiz-Rivas & Martínez-Torres, 2010).

Woolly apple aphid (WAA, Eriosoma lanigerum Hausmann, tribe Eriosomatini) is probably North American in origin. It is thought to have been introduced to Britain in 1796 or 1797 with infested apple trees imported from America (Theobald, 1920) and has subsequently become a cosmopolitan and highly damaging pest of apple worldwide (Barbagallo et al., 1997). Up to 20 generations per year on apple have been reported. First-instar nymphs (crawlers) are dispersive and walk to new feeding sites, but later developmental stages tend to be more sedentary, forming colonies distinguishable by their fluffy white protective wax coating (Barbagallo et al., 1997). WAA is able to feed on apple roots, trunks, branches and shoots. Saliva secreted whilst feeding causes cambium cells to divide rapidly, forming a gall which collapses and becomes pulpy under pressure from proliferating cells, making the area vulnerable to fungal infections (Barbagallo et al., 1997; Childs, 1929; Staniland, 1924). Edaphic (soil-dwelling) WAA has a significant negative effect on plant growth, especially of young, nonbearing apple trees, where feeding significantly reduces trunk diameter (Brown et al., 1995), which is strongly correlated with fruit production (Waring, 1920). Such widespread and significant damage has made WAA resistance a key objective for rootstock breeding since the early 20th century (Cummins & Aldwinckle, 1983).

Sexual and asexual reproduction in aphid life cycles has significant impacts on population structure and allelic diversity (Delmotte et al., 2002; Dixon, 1977). Species in the subfamily Eriosomatinae typically go through phases of both sexual and parthenogenetic reproduction during the life cycle, but (unlike in other aphid subfamilies) sexual males and egg-laying females lack mouthparts and are therefore unable to feed. The life cycle and mode of reproduction of WAA are somewhat ambiguous. In North America, where WAA populations have been reported to induce leaf galls on American elm (Ulmus americana L.) (Baker, 1915), it has been suggested that WAA has a heteroecious life cycle, alternating between apple (parthenogenetic reproduction) and elm (sexual reproduction). However, it is not clear whether genotypes found on elm are also capable of feeding on apple (Blackman & Eastop, 2000). In other parts of the world, WAA is assumed to be entirely anholocyclic (asexual) on apple (e.g., Dumbleton & Jeffreys, 1938; Eastop, 1966). Sexual males and females have sometimes been observed on apple, but the deposited eggs usually do not hatch, and such populations are assumed to be functionally asexual (Blackman & Eastop, 2000).

The genetic structure of WAA populations also has important practical relevance in the context of host-plant resistance. Four genes associated with WAA resistance in apple (Er1–4) have been identified from various sources (Bus et al., 2010, 2008, 2000; King et al., 1991). Some genotypes of WAA have been observed to feed on rootstocks with Er1-, Er2- and Er3-mediated resistance (Cummins & Aldwinckle, 1983; Rock & Zeiger, 1974; Sandanayaka et al., 2003). The prevalence and spread of such resistance-breaking genotypes within WAA populations have not yet been investigated, and the availability of a full genome sequence will benefit such studies considerably.

Here, we generated a high-quality chromosome-level genome assembly of WAA using a combination of 10X Genomics linked-reads and in vivo chromatin conformation capture (Hi-C) sequencing. Subsequently, gene prediction, functional annotation and phylogenetic analysis were carried out to determine the relationship of WAA within the superfamily Aphidoidea. Our reference genome can provide information about genome organization in the subfamily Eriosomatinae and allows comparative genomic studies for a better understanding of the evolution of aphids.

2 MATERIALS AND METHODS

2.1 Sampling

Aphids were collected from a population feeding on apple trees grown under glass at NIAB EMR. The glasshouse-grown plants were deliberately infested using WAA collected from local orchards. A single colony infesting a glasshouse-grown potted apple plant (a clonally propagated aphid-susceptible breeding line in the rootstock breeding programme at NIAB EMR) was sampled for aphids in November 2018. All sampled aphids were collected from one (12-cm) section of infested woody stem. Wax filaments covering sampled insects were removed using a fine paint brush and the aphids placed in Eppendorf tubes. For genome analysis, we collected individual adults (20 in total) into separate tubes and also generated a pooled sample containing mixed life stages (25 individuals) in a single tube. An additional two samples of grouped aphids (25 in total) were pooled into individual tubes for RNA sequencing (RNA-seq) analysis. These groups consisted of: apterous (wingless) adults; and mid-instar nymphs (mix of second, third and fourth instars). Collected aphids were snap frozen by immersing tubes in liquid nitrogen.

2.2 DNA extraction and sequencing

Total genomic DNA was extracted from a single aphid using an Illustra Nucleon Phytopure kit, with a modification of the manufacturer's protocol (GE Healthcare). The lysis step was performed by adding 10 µl of Proteinase K and incubating the sample in a water bath for 2 hr at 55°C. The DNA precipitation was performed using NaAc (3 M) together with isopropanol to increase the DNA yield. DNA quality and quantity were assessed using a Nanodrop spectrophotometer (Thermo Scientific), a Qubit double-stranded DNA BR Assay Kit (Invitrogen, Thermo Fisher Scientific) and Femto fragment analyser (Agilent). 10X Genomics library preparation and Illumina genome sequencing (HiSeq X, 150 bp paired-end) were performed by Novogene Bioinformatics Technology in accordance with standard protocols.

A pool of mixed-stage samples was used to construct a Hi-C chromatin contact map to enable a chromosome-level assembly. Dovetail Genomics created the Hi-C library with the DpnII restriction enzyme following a similar protocol to Lieberman-Aiden et al. (2009). The Hi-C library was sequenced on an Illumina HiSeq X sequencer and 150-bp paired-end reads were generated.

2.3 Genome assembly

To create the de novo assembly, the 10X Genomics linked-read data were assembled using supernova 2.1.1 (Weisenfeld et al., 2017) with the default parameters and the recommended number of reads (--maxreads=199222871) to produce the pseudohaplotype assembly output (--style=pseudohap). We improved the initial supernova assembly by performing iterative scaffolding using all of the 10X Genomics raw data (364 million reads; Table S1) following the procedure set out in Mathers, Wouters, et al. (2020). Briefly, we performed two rounds of Scaff10x (https://github.com/wtsi-hpag/Scaff10X) with the parameters “-longread 1 -edge 50000 -block 50000”, followed by misassembly detection and correction with tigmint (Jackman et al., 2018). These steps were followed by a final round of scaffolding with Assembly Round-up by Chromium Scaffolding (arcs) (Yeo et al., 2018). We aligned the Hi-C reads to the 10X Genomics assembly using the juicer pipeline (Durand et al., 2016). The assembly was then scaffolded with Hi-C data (Table S1) into chromosome-level organization using the 3d-dna pipeline (Dudchenko et al., 2017), followed by manual correction using Juicebox Assembly Tools (jbat) (Dudchenko et al., 2018). The assembly was polished after jbat review using the 3d-dna seal module to reintegrate genomic content removed from super-scaffolds through false-positive manual editing to create a final scaffolded assembly.

We checked the Hi-C assembly for contamination using the blobtools pipeline version 0.9.19 (Kumar et al., 2013; Laetsch & Blaxter, 2017) by generating taxon annotated GC content-coverage plots (known as “BlobPlots”). Each scaffold was annotated with taxonomy information based on blastn (Basic Local Alignment Search Tool) version 2.2.31 (Camacho et al., 2009) searches against the National Center for Biotechnology Information (NCBI) nucleotide database (nt, downloaded October 13, 2017) with the options “-outfmt '6 qseqid staxids bitscore std sscinames sskingdoms stitle' -culling_limit 5 -evalue 1e-25”. To calculate average coverage per scaffold, we mapped the 10X Genomics raw reads, after barcode removal using proc10xg (https://github.com/ucdavis-bioinformatics/proc10xG), to the assembly using bwa-mem version 0.7.7 (Burrows-Wheeler Aligner) (Li, 2013) with default parameters. The resulting BAM file was sorted with samtools version 1.3 (Li et al., 2009) and passed to blobtools along with the table of blastn results. The mitochondrial genome was identified and removed based on alignment to the WAA mitochondrial genome (NCBI accession no. NC_033352.1) with nucmer version 4.0.0.beta2 (Marçais et al., 2018), and patterns of coverage and GC content obtained from blobtools. A frozen release was generated for the final assembly with scaffolds renamed and ordered by size with seqkit version 0.9.1 (Shen et al., 2016). We assessed the quality of the genome assembly by searching for conserved, single copy, arthropod genes (n = 1,066) with Benchmarking Universal Single-Copy Orthologs (busco) version 3.0 (Waterhouse et al., 2018) and by analysis of k-mer spectra with kat version 2.3.2 (K-mer Analysis Toolkit) (Mapleson et al., 2017) using the default k-mer size (k = 27). To generate a k-mer spectrum we compared k-mer content of the raw sequencing reads to the k-mer content of the assembly using kat comp. Using this spectrum we also estimated the WAA genome size, heterozygosity level and genome assembly completeness.
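
The logic of comparing the k-mer content of reads with that of an assembly can be illustrated with a minimal Python sketch. This is not the kat implementation: the function names are hypothetical, the approach holds all k-mers in memory and is only suitable for toy inputs, and a small odd k is assumed in the example. For each distinct canonical k-mer in the reads, the sketch records how often it occurs in the reads and how many copies of it are present in the assembly.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(sequences, k):
    """Count canonical k-mers across sequences, skipping k-mers with non-ACGT bases."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[canonical(kmer)] += 1
    return counts


def compare_spectra(reads, assembly, k):
    """Map (read multiplicity, assembly copy number) to the number of distinct k-mers."""
    read_counts = count_kmers(reads, k)
    assembly_counts = count_kmers(assembly, k)
    matrix = Counter()
    for kmer, n_in_reads in read_counts.items():
        matrix[(n_in_reads, assembly_counts.get(kmer, 0))] += 1
    return dict(matrix)
```

In such a table, k-mers with an assembly copy number of zero correspond to content present in the reads but missing from the assembly, and k-mers with a copy number above one at single-copy read multiplicity point to duplicated content.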

2.4 RNA extraction and sequencing

Total RNA was extracted from two pools of ~25 aphids, one containing adults and one containing midinstar nymphs, collected from the same colony. Each sample was ground under liquid nitrogen in a 1.5-ml Eppendorf tube using a plastic pestle. RNA was extracted using Trizol (Sigma) according to the manufacturer's protocol. RNA was further purified using RNeasy with on-column DNAse treatment (Qiagen) according to the manufacturer's protocol and eluted in 100 μl of nuclease-free water. RNA quality was assessed by electrophoresis of 5 μl denatured in formamide on a 1% agarose gel. Purity was assessed using a Nanodrop spectrophotometer (ThermoFisher) to measure the A₂₆₀/A₂₈₀ and A₂₆₀/A₂₃₀ ratios. Concentration of RNA was measured, and the presence of contaminating DNA was assessed using a Qubit (Lifetech).

Quality control and trimming for adapters and low-quality bases (quality score <30) of the RNA-seq raw reads were performed using fastqc version 0.11.8 (Andrews, 2010) and trim_galore version 0.5.0 (http://www.bioinformatics.babraham.ac.uk/projects/trim_galore) respectively.

2.5 Gene prediction

Before running the gene prediction, we identified repeats and transposable elements (TEs) with repeatmasker version 4.0.7 (Tarailo-Graovac & Chen, 2009) using the RepBase Insecta repeat library (Bao et al., 2015) with the parameters “-e ncbi -species insecta -a -xsmall -gff” (Jurka et al., 2005).

We mapped the quality- and adapter-trimmed RNA-seq reads from the two pools of adults and midinstar nymphs (Table S2) to the soft-masked assembly with hisat2 version 2.0.5 (Kim et al., 2015) with the following parameters: “--max-intronlen 25000 --dta-cufflinks”, followed by sorting and indexing with samtools version 1.3 (Li et al., 2009). Strand-specific RNA-seq alignments were split by forward and reverse strands and passed to braker2 as separate BAM files. We then ran braker2 version 2.1.2 (Hoff et al., 2016, 2019) to train augustus (Lomsadze et al., 2014; Stanke et al., 2008) and predict protein-coding genes, incorporating evidence from the RNA-seq alignments and alignment of busco genes with the following parameters: “--softmasking --gff3 --prg=gth --gth2traingenes”. After gene prediction, completeness of the gene set was checked with busco using the longest transcript of each gene as the representative transcript.

2.6 Functional annotation

All the unique transcripts were converted to peptide sequence using cufflinks version 2.2.1 (Trapnell et al., 2010). Sequences were searched against the nonredundant NCBI protein database using blastp version 2.6.0 with an E-value cut-off of ≤1 × 10⁻⁵. blast2go version 5.0 (Conesa et al., 2005) and interproscan version 2.5.0 (Quevillon et al., 2005) were used to assign Gene Ontology (GO) terms. Protein domains were annotated by searching against the InterPro version 32.0 (Hunter et al., 2012) and Pfam version 27.0 (Punta et al., 2012) databases, using interproscan version 2.5.0 (Quevillon et al., 2005) and hmmer version 3.1 (Finn et al., 2011), respectively. The pathways in which the genes might be involved were assigned by protein–protein blast against the Kyoto Encyclopedia of Genes and Genomes (KEGG) databases (release 53), with an E-value cut-off of 1 × 10⁻⁵.

2.7 Phylogeny and comparative genomics

Orthologous groups in Aphididae genomes were identified from the predicted protein sequences of WAA and nine other aphid genomes already published (Table S3): Myzus cerasi (Mathers, Mugford, et al., 2020; Thorpe et al., 2018), Myzus persicae (Mathers, Wouters, et al., 2020), Diuraphis noxia (Nicholson et al., 2015), Acyrthosiphon pisum (Mathers, Wouters, et al., 2020), Pentalonia nigronervosa (Mathers, Mugford, et al., 2020), Aphis glycines (Mathers, 2020), Rhopalosiphum maidis (Chen et al., 2019), Rhopalosiphum padi (Thorpe et al., 2018) and Cinara cedri (Julca et al., 2020). As an outgroup, we included the genome of the silverleaf whitefly Bemisia tabaci (Chen et al., 2016). We used the longest transcript to represent the gene model when several transcripts of a gene were annotated. orthofinder version 2.2.3 (Emms & Kelly, 2015, 2019) with diamond version 0.9.14 (Buchfink et al., 2015), Multiple Alignment using Fast Fourier Transform (mafft) version 7.305 (Katoh & Standley, 2013) and fasttree version 2.1.7 (Price, Dehal, & Arkin, 2009, 2010) were used to cluster proteins into orthogroups, reconstruct gene trees and estimate the species tree. The orthofinder species tree was automatically rooted by orthofinder based on informative gene duplications with stride (Emms & Kelly, 2017).
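
Selecting the longest transcript as the representative of each gene model can be sketched as follows. This is a minimal illustration under stated assumptions rather than the script used in this study: the function name and the input format (tuples of gene identifier, transcript identifier and protein sequence) are hypothetical, and ties are resolved by keeping the first transcript encountered.

```python
def longest_transcript_per_gene(transcripts):
    """Pick one representative protein per gene.

    transcripts: iterable of (gene_id, transcript_id, protein_sequence).
    Returns a dict mapping gene_id to (transcript_id, protein_sequence)
    for the longest protein; the first one seen wins a tie.
    """
    best = {}
    for gene_id, transcript_id, sequence in transcripts:
        current = best.get(gene_id)
        if current is None or len(sequence) > len(current[1]):
            best[gene_id] = (transcript_id, sequence)
    return best
```

Reducing each gene to a single protein in this way prevents alternative isoforms of one gene from being counted as separate members of an orthogroup.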

GO analysis of lineage-specific gene families and genes that have undergone lineage-specific duplication was carried out using the bioconductor (Gentleman et al., 2004) package topGO (Alexa & Rahnenführer, 2009). We used a Fisher exact test to identify overrepresented GO terms.
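
The classic one-sided Fisher exact test for over-representation of a single GO term is equivalent to an upper-tail hypergeometric probability and can be sketched in a few lines of Python. This sketch covers only the basic test: the function name and argument names are hypothetical, and any additional handling of the GO graph structure or multiple-testing correction that topGO may apply is not reproduced.

```python
from math import comb


def fisher_enrichment_pvalue(k, n, K, N):
    """One-sided Fisher exact test P-value for over-representation.

    N: genes in the background universe
    K: background genes annotated with the GO term
    n: genes in the test set (e.g., lineage-specific genes)
    k: test-set genes annotated with the GO term
    Returns P(X >= k) under the hypergeometric distribution.
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the probabilities of observing k, k + 1, ..., upper annotated genes.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total
```

A small P-value indicates that the term is annotated to more genes in the test set than expected from its frequency in the background.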

2.8 Synteny analysis

Syntenic blocks of genes were identified between the chromosome-level genome assemblies of WAA, M. persicae, A. pisum and R. maidis (see Table S3 for details of assembly and annotation versions used) using mcscanx version 1.1 (Wang et al., 2012). For each comparison, we carried out an all versus all blast search of annotated protein sequences using blastall version 2.2.22 (Altschul et al., 1990) with the options “-p blastp -e 1e-10 -b 5 -v 5 -m 8” and ran mcscanx with the parameters “-s 5 -b 2”, requiring synteny blocks to contain at least five consecutive genes and to have a gap of no more than 25 genes. mcscanx results were visualized with synvisio (https://synvisio.github.io/#/).
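
The two block criteria (a minimum of five genes and a maximum gap of 25 genes) can be illustrated with a deliberately simplified Python sketch. This is not the mcscanx algorithm, which scores chains of homologous pairs by dynamic programming; the sketch is a greedy scan that handles only forward-oriented blocks on a single chromosome pair, and the function name and input format (pairs of gene-order indices) are hypothetical.

```python
def collinear_blocks(anchors, min_size=5, max_gap=25):
    """Group homologous gene pairs into forward-oriented collinear blocks.

    anchors: iterable of (index_in_genome_A, index_in_genome_B), where each
    index is the position of a gene in the gene order of one chromosome.
    A block is extended while the next anchor lies downstream in both genomes
    with at most max_gap intervening genes in each; blocks with fewer than
    min_size anchors are discarded.
    """
    blocks = []
    current = []
    for a, b in sorted(set(anchors)):
        if current:
            prev_a, prev_b = current[-1]
            if 0 <= a - prev_a - 1 <= max_gap and 0 <= b - prev_b - 1 <= max_gap:
                current.append((a, b))
                continue
            # The chain is broken: keep it only if it is long enough.
            if len(current) >= min_size:
                blocks.append(current)
        current = [(a, b)]
    if len(current) >= min_size:
        blocks.append(current)
    return blocks
```

Because the scan is greedy, a single spurious anchor can split a block that a scoring-based method would keep intact, so the sketch is useful for understanding the parameters rather than for analysis.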

3 RESULTS AND DISCUSSION

3.1 Genome sequencing and assembly

We generated a high-quality chromosome-level genome assembly of WAA using a combination of 10X Genomics linked-reads and in vivo Hi-C data (Figure 1a). In total, we generated 54.72 Gb of 10X Genomics linked-reads and 71.95 Gb of Hi-C reads, corresponding to 167× and 220× coverage of the WAA genome, respectively. Initial de novo assembly of the 10X Genomics linked-reads with supernova produced a contiguous assembly totalling 330 Mb (Table 1; scaffold N50 = 4.16 Mb). We further improved the supernova assembly by iterative scaffolding and misjoin correction with scaff10x (two rounds) and tigmint (one round), increasing the scaffold N50 of the assembly to 4.22 Mb and reducing the number of scaffolds from 8,967 to 7,929, with the longest scaffold spanning 12.58 Mb (Table 1). To generate chromosome length super-scaffolds, we scaffolded the 10X assembly using in vivo Hi-C data. After manual curation, the final assembly comprised 327 Mb, with 91% of the assembly anchored into six chromosomes (Figure 1a), consistent with the WAA 2n karyotype being 12 chromosomes (Gautam & Verma, 1982, 1983; Kulkarni, 1984; Robinson & Chen, 1969). The lengths of the six chromosomes ranged from 29.68 to 71.23 Mb.
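
For readers less familiar with the summary statistics used here, the following Python sketch shows how N50 and fold coverage are conventionally calculated. The function names are hypothetical and the sketch is not taken from the assembly pipeline; the coverage check uses the sequencing yields and the 327-Mb assembly size reported above.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    contain at least half of the total assembled bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no sequence lengths supplied")


def fold_coverage(sequenced_bases, genome_size):
    """Return the mean sequencing depth (total bases divided by genome size)."""
    return sequenced_bases / genome_size


# Yields reported in the text, relative to the 327-Mb final assembly.
linked_read_depth = fold_coverage(54.72e9, 327e6)  # approximately 167x
hic_depth = fold_coverage(71.95e9, 327e6)          # approximately 220x
```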

FIGURE 1. Chromosome-level genome assembly of the woolly apple aphid (WAA). (a) Heatmap showing the frequency of Hi-C contacts along the WAA genome assembly. Blue lines indicate super scaffolds and green lines show contigs. Genome scaffolds are ordered from longest to shortest with the x- and y-axis showing cumulative length in millions of base pairs (Mb). (b) KAT k-mer plots comparing k-mer content of 10X Genomics raw reads (barcodes removed) with Hi-C assembly. The black area of the graphs represents the distribution of k-mers present in the reads but not in the assembly and the red area represents the distribution of k-mers present in the reads and once in the assembly. Other colours show k-mers found multiple times in the genome assembly. (c) Genome assembly completeness assessed by the recovery of universal single-copy genes (BUSCOs) using the Arthropoda gene set (n = 1,066). busco assessment result for Eriosoma lanigerum (in bold) compared with aphid genomes available from the National Center for Biotechnology Information (NCBI). The species are coloured by aphid tribe (see Figure 2).

Table 1. Genome summary statistics for each step of the assembly of WAA

Assembly	SP	SP + SC + TG	SP + SC + TG + HC
Base pairs (Mb)	330	330	327
Number of contigs	12,566	12,703	12,065
Contig N50 (Mb)	0.158	0.158	0.158
Number of scaffolds	8,967	7,929	7,146
Scaffold N50 (Mb)	4.164	4.222	62.861
Longest scaffold (Mb)	16.081	12.584	71.231
Percentage of assembly in chromosome length scaffolds	0	0	91

Abbreviations: HC, Hi-C; SC, scaff10x; SP, supernova; TG, tigmint.

The WAA genome assembly is accurate, complete and free from contamination. Our k-mer analysis comparing genomic content of the 10X reads (after barcode removal) with the WAA genome assembly reveals little missing single-copy genome content and very low levels of duplicated content caused by the assembly of haplotigs (Figure 1b). Furthermore, our 327-Mb WAA genome assembly is close to the genome size estimate based on k-mer analysis with kat (363 Mb) (Figure 1b; Table S4) and the genome size of Eriosoma americanum (330 Mb) (Finston et al., 1995). This analysis is further supported by high representation of conserved arthropod genes (n = 1,066) in the assembly, with 97% (n = 1,032) found as complete single copies. Indeed, the WAA assembly contains the highest number of conserved single-copy Arthropoda genes of any published aphid genome (Figure 1c). A taxon-annotated GC content-coverage plot (known as a “BlobPlot”; Kumar et al., 2013) revealed the co-assembly of the obligate aphid bacterial symbiont Buchnera aphidicola (Baumann et al., 1995; Douglas, 1998; Hansen & Moran, 2011; Shigenobu & Wilson, 2011) and a secondary symbiont, Serratia symbiotica (Burke & Moran, 2011; Manzano-Marín & Latorre, 2016; Moran et al., 2005) (Figure S1). These bacterial scaffolds were filtered from the final assembly, along with scaffolds showing atypical GC content and read coverage, leaving the final assembly free from obvious contamination (Figure S2). The B. aphidicola genome was assembled into 99 scaffolds 444 kb in length (Table S5). The S. symbiotica genome was fragmented and incomplete (165 kb total length, 30 scaffolds, N50 = 9 kb).

3.2 Genome annotation

We generated 44 Gb of strand-specific RNA-seq data to aid genome annotation. A total of 28,186 protein-coding genes (28,297 transcripts) were predicted in the WAA genome assembly, of which 83.8% (23,627 genes) were located on the six chromosomes (Table S6). Mapping of our RNA-seq data to the annotation revealed a relatively low number of expressed genes: 8,220 (29.1%) gene models were supported by an estimated count of at least 10 reads and 4,576 (16.2%) had estimated expression of at least 1 transcript per million (TPM) (Table S7). This is probably due to degradation of our RNA samples. Nonetheless, our annotation contains 97.3% of the busco Arthropoda gene set as complete copies (95.6% complete and single copy; Figure S3) and 18,835 genes (66.6%) have an orthologue in at least one other sequenced aphid species (Figure 2). Additionally, 55.7% (15,477) of the predicted transcripts were functionally annotated with at least one GO term and/or protein domain. Taken together, these analyses indicate that our gene set is complete and accurate. In the future, WAA gene models will be further refined with additional RNA-seq data sets and community-led manual curation.
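
The TPM threshold used above can be made concrete with a short Python sketch of the standard TPM calculation. The quantification software is not described in this section, so the sketch should be read as the textbook definition only; the function names, the input dictionaries and the use of a single length per gene are assumptions made for illustration.

```python
def tpm(counts, lengths):
    """Convert read counts to transcripts per million (TPM).

    counts: dict mapping gene to estimated read count
    lengths: dict mapping gene to (effective) transcript length in bp
    Each count is divided by its length, and the resulting rates are
    rescaled so that they sum to one million.
    """
    rates = {gene: counts[gene] / lengths[gene] for gene in counts}
    scale = sum(rates.values())
    if scale == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: rate / scale * 1e6 for gene, rate in rates.items()}


def n_expressed(tpm_values, threshold=1.0):
    """Count genes with TPM at or above the threshold."""
    return sum(1 for value in tpm_values.values() if value >= threshold)
```

Because TPM values are relative, the number of genes passing a fixed threshold depends on how reads are distributed among all genes in the sample.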

3.3 Phylogeny and comparative genomics

WAA is the first member of the aphid subfamily Eriosomatinae to have its genome sequenced. To place this new genome assembly in a phylogenetic context and to investigate gene family evolution across aphids, we compared the WAA proteome (the complete set of annotated protein-coding genes) to the proteomes of nine other aphid species that have fully sequenced genomes (Table S3) and to the whitefly, Bemisia tabaci MEAM1 (Chen et al., 2016). In total, we clustered 240,702 proteins into 23,294 orthogroups (gene families) and 26,969 singleton genes (Table S8). Maximum likelihood phylogenetic analysis based on a concatenated alignment of 3,079 conserved single-copy genes produced a fully resolved species tree with 100% support at all nodes (Figure 2). Eriosomatini (represented by WAA) is recovered as a sister group to Aphidini + Macrosiphini (Aphidinae), with Lachnini (represented by C. cedri) placed as an outgroup to all other sequenced aphid species (Figure 2).

The number of predicted genes in WAA (28,186) is within the range of other aphid genomes (16,992–31,001) but among the highest (Figure 2). However, predicted gene numbers can vary depending on both the quality of the genome assembly and the different pipelines used for predicting genes (Denton et al., 2014; Yandell & Ence, 2012). Of the 28,186 predicted genes in the WAA genome, 18,738 (66%) have an orthologue in at least one other aphid species and 13,278 (47%) are conserved in the majority (at least 9/10) of aphid species (Figure 2). The high number of genes with orthologues in other aphid species despite the absence of close WAA relatives in our analysis probably reflects the early emergence of many aphid gene families (Julca et al., 2020).

Aphid genomes are also subject to high levels of ongoing gene duplication (Fernández et al., 2020; International Aphid Genomics Consortium, 2010; Julca et al., 2020; Mathers et al., 2017; Thorpe et al., 2018). WAA is no exception and we detect a large number of lineage-specific gene families (689 orthogroups corresponding to 3,954 genes; Figure 2) and identify 9,936 genes that have undergone lineage-specific duplication. These genes are enriched for a diverse set of functions including sensory perception and metabolic process (Tables S9 and S10). As additional Eriosomatini genomes become available, the diversification of these genes and gene families will be investigated in greater detail.

3.4 X chromosome fragmentation in WAA

It has previously been shown that the autosomes of aphids within the tribes Macrosiphini and Aphidini have undergone extensive rearrangement over the last ~30 million years while the aphid sex (X) chromosome (which is haploid in males) has been conserved (Li et al., 2020; Mathers, Wouters, et al., 2020). Given that we now have a chromosome-level genome assembly of an aphid from a third, more divergent tribe, we used our WAA assembly to investigate the evolution of aphid genome structure. We identified syntenic blocks of genes between our WAA assembly and the genomes of Myzus persicae (Figure 3a) and Acyrthosiphon pisum (Figure 3b) from Macrosiphini (Mathers, Wouters, et al., 2020), and Rhopalosiphum maidis (Figure 3c) from Aphidini (Chen et al., 2019). All comparisons reveal high levels of genome rearrangement between WAA chromosomes 1–4 (EL1–EL4) and the autosomes of M. persicae, A. pisum and R. maidis. Surprisingly, however, we find that two WAA chromosomes (EL5 and EL6) map to the conserved Macrosiphini and Aphidini X chromosome, suggesting either fragmentation of the X chromosome in WAA or that the large Aphidinae (Macrosiphini + Aphidini) X chromosome was the result of an ancient fusion event. Additional chromosome-level assemblies of diverse aphid species will be required to test these two competing hypotheses. However, the lack of rearrangements between either EL5 or EL6 and the autosomes suggests that recent X chromosome fragmentation in WAA is more likely. Additionally, it remains to be determined if either EL5 or EL6 (or both) behave as an X chromosome in WAA (i.e., they are found in a haploid state in males). Although populations such as the UK one used here for genome sequencing are thought to be entirely anholocyclic (asexual) on apple (Dumbleton & Jeffreys, 1938; Eastop, 1966), sexual males and females have been observed (Blackman & Eastop, 2000), suggesting the presence of a functional X chromosome. In the future, whole genome resequencing of these rare males may be used to confirm the identity of the WAA X chromosome based on patterns of sequence coverage, as has been carried out for M. persicae and A. pisum (Li et al., 2019; Mathers, Wouters, et al., 2020).
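
The coverage-based approach relies on the expectation that a chromosome present in a single copy in males shows roughly half the normalized read depth seen in females. A minimal Python sketch of this reasoning is given below. It is not the procedure used in the cited studies: the function names and the 0.75 ratio threshold are assumptions, normalization by the median ratio presumes that most chromosomes are autosomal, and the depths in the example are invented for illustration and are not results.

```python
from statistics import median


def normalised_male_female_ratio(male_depth, female_depth):
    """Return the male/female depth ratio per chromosome, scaled so that
    the median chromosome (assumed autosomal) has a ratio of 1."""
    ratios = {chrom: male_depth[chrom] / female_depth[chrom] for chrom in male_depth}
    centre = median(ratios.values())
    return {chrom: ratio / centre for chrom, ratio in ratios.items()}


def candidate_x_chromosomes(male_depth, female_depth, threshold=0.75):
    """List chromosomes whose normalised ratio falls below the threshold,
    as expected for a chromosome that is haploid in males."""
    ratios = normalised_male_female_ratio(male_depth, female_depth)
    return sorted(chrom for chrom, ratio in ratios.items() if ratio < threshold)


# Invented depths for illustration only.
example_male = {"EL1": 30, "EL2": 30, "EL3": 30, "EL4": 30, "EL5": 15, "EL6": 15}
example_female = {chrom: 30 for chrom in example_male}
example_candidates = candidate_x_chromosomes(example_male, example_female)
```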

4 CONCLUSIONS

WAA is a widespread pest of apple trees that is particularly critical to the economics of the apple industry in most parts of the world. This study provides the first chromosome-level genome of WAA and the genome sequencing of the first representative of the whole subfamily Eriosomatinae. The WAA genome will be useful as a reference for investigation of genetic differences among wild WAA populations. The high quality of the genome will also allow molecular marker development and detection using large-scale genome resequencing. Population resequencing data can be used to investigate genomic regions and genes that show genetic variability and to analyse demographic history events in WAA populations.

Finally, as the only genome available for the subfamily Eriosomatinae and as an additional outgroup to other sequenced aphids from the subfamily Aphidinae, the WAA genome will allow more extensive comparative genomics analysis of aphids.

ACKNOWLEDGEMENTS

T.C.M. is funded by a BBSRC Future Leader Fellowship (BB/R01227X/1). Additional support was received from the BBSRC Institute Strategy Programme (BB/P012574/1) and the John Innes Foundation. C.J.G. acknowledges receipt of a studentship funded by the BBSRC, AHDB and an industry consortium (http://www.ctp-fcr.org/industry-partners/).

AUTHOR CONTRIBUTIONS

G.P., S.A.H. and T.C.M. conceived the study. G.P., F.F.F. and C.J.G. provided samples. R.B. and S.T.M. extracted DNA and RNA. R.B., A.S. and T.C.M. performed the genome assembly, gene model prediction, gene annotation and comparative analyses. R.B., C.J.G., G.P., S.A.H. and T.C.M. wrote the manuscript with input from all authors. All authors reviewed the manuscript.

DATA AVAILABILITY STATEMENT

The genome raw reads, the RNA sequencing data and the genome assembly are available at the National Center for Biotechnology Information (NCBI) with the BioProject no. PRJNA623270. The genome assembly and annotation and orthogroup clustering results are available for download from Zenodo (https://doi.org/10.5281/zenodo.3797131). The WAA genome assembly and annotation are also available from AphidBase (https://bipaa.genouest.org/sp/eriosoma_lanigerum/).

REFERENCES

Alexa, A., & Rahnenführer, J. (2009). Gene set enrichment analysis with topGO. Retrieved from https://bioconductor.riken.jp/packages/3.2/bioc/vignettes/topGO/inst/doc/topGO.pdf
Altschul, S. F., Gish, W., Miller, W., Myers, E. W., & Lipman, D. J. (1990). Basic local alignment search tool. Journal of Molecular Biology, 215(3), 403–410. https://doi.org/10.1016/S0022-2836(05)80360-2
Andrews, S. (2010). FastQC: A quality control tool for high throughput sequence data. Retrieved from http://www.bioinformatics.babraham.ac.uk/projects/fastqc/
Baker, A. C. (1915). The woolly apple aphis. U.S. Dep. Agric. Rep. 101.
Bao, W., Kojima, K. K., & Kohany, O. (2015). Repbase Update, a database of repetitive elements in eukaryotic genomes. Mobile DNA, 6(1), 11. https://doi.org/10.1186/s13100-015-0041-9
Barbagallo, S., Cravedi, P., Pasqualini, E., Patti, I., & Stroyan, H. L. (1997). Aphids of the principal fruit-bearing crops. Milano Bayer, pp. 117.
Baumann, P., Baumann, L., Lai, C.-Y., Rouhbakhsh, D., Moran, N. A., & Clark, M. A. (1995). Genetics, physiology, and evolutionary relationships of the genus Buchnera: Intracellular symbionts of aphids. Annual Review of Microbiology, 49(1), 55–94. https://doi.org/10.1146/annurev.mi.49.100195.000415
Blackman, R. L., & Eastop, V. F. (2000). Aphids on the world’s crops: An identification and information guide. John Wiley & Sons Ltd.
Blackman, R. L., & Eastop, V. F. (2017). Taxonomic issues. In H. F. van Emden, & R. Harrington (Eds.), Aphids as crop pests (2nd ed.). CAB International. https://doi.org/10.1079/9781780647098.0001
Brown, M., Schmitt, J., Ranger, S., & Hogmire, H. (1995). Yield reduction in apple by edaphic woolly apple aphid (Homoptera: Aphididae) populations. Journal of Economic Entomology, 88(1), 127–133. https://doi.org/10.1093/jee/88.1.127
Buchfink, B., Xie, C., & Huson, D. H. (2015). Fast and sensitive protein alignment using DIAMOND. Nature Methods, 12(1), 59. https://doi.org/10.1038/nmeth.3176
Burke, G. R., & Moran, N. A. (2011). Massive genomic decay in Serratia symbiotica, a recently evolved symbiont of aphids. Genome Biology and Evolution, 3, 195–208. https://doi.org/10.1093/gbe/evr002
Bus, V., Ranatunga, C., Gardiner, S., Bassett, H., & Rikkerink, E. (2000). Marker assisted selection for pest and disease resistance in the New Zealand apple breeding programme. Acta Horticulturae, 538, 541–547. https://doi.org/10.17660/ActaHortic.2000.538.95
Bus, V. G. M., Chagné, D., Bassett, H. C. M., Bowatte, D., Calenge, F., Celton, J.-M., Durel, C.-E., Malone, M. T., Patocchi, A., Ranatunga, A. C., Rikkerink, E. H. A., Tustin, D. S., Zhou, J., & Gardiner, S. E. (2008). Genome mapping of three major resistance genes to woolly apple aphid (Eriosoma lanigerum Hausm.). Tree Genetics and Genomes, 4(2), 223–236. https://doi.org/10.1007/s11295-007-0103-3
Bus, V. G. M., Bassett, H. C. M., Bowatte, D., Chagné, D., Ranatunga, C. A., Ulluwishewa, D., Wiedow, C., & Gardiner, S. E. (2010). Genome mapping of an apple scab, a powdery mildew and a woolly apple aphid resistance gene from open-pollinated Mildew Immune Selection. Tree Genetics and Genomes, 6(3), 477–487. https://doi.org/10.1007/s11295-009-0265-2
Camacho, C., Coulouris, G., Avagyan, V., Ma, N., Papadopoulos, J., Bealer, K., & Madden, T. L. (2009). BLAST+: Architecture and applications. BMC Bioinformatics, 10(1), 421. https://doi.org/10.1186/1471-2105-10-421
Chen, W., Hasegawa, D. K., Kaur, N., Kliot, A., Pinheiro, P. V., Luan, J., Stensmyr, M. C., Zheng, Y., Liu, W., Sun, H., Xu, Y., Luo, Y., Kruse, A., Yang, X., Kontsedalov, S., Lebedev, G., Fisher, T. W., Nelson, D. R., Hunter, W. B., … Fei, Z. (2016). The draft genome of whitefly Bemisia tabaci MEAM1, a global crop pest, provides novel insights into virus transmission, host adaptation, and insecticide resistance. BMC Biology, 14(1), 110. https://doi.org/10.1186/s12915-016-0321-y
Chen, W., Shakir, S., Bigham, M., Richter, A., Fei, Z., & Jander, G. (2019). Genome sequence of the corn leaf aphid (Rhopalosiphum maidis Fitch). GigaScience, 8(4), giz033. https://doi.org/10.1093/gigascience/giz033
Childs, L. (1929). The relation of woolly apple aphis to perennial canker infection with other notes on the disease. Station Bulletin, 243. Agricultural Experiment Station Oregon State Agricultural College.
Conesa, A., Götz, S., García-Gómez, J. M., Terol, J., Talón, M., & Robles, M. (2005). Blast2GO: A universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics, 21(18), 3674–3676. https://doi.org/10.1093/bioinformatics/bti610
Cummins, J. N., & Aldwinckle, H. S. (1983). Breeding apple rootstocks. In J. Janick (Ed.), Plant breeding reviews (pp. 294–394). Boston, MA: Springer. https://doi.org/10.1007/978-1-4684-8896-8_10
Delmotte, F., Leterme, N., Gauthier, J., Rispe, C., & Simon, J. (2002). Genetic architecture of sexual and asexual populations of the aphid Rhopalosiphum padi based on allozyme and microsatellite markers. Molecular Ecology, 11(4), 711–723. https://doi.org/10.1046/j.1365-294X.2002.01478.x
Denton, J. F., Lugo-Martinez, J., Tucker, A. E., Schrider, D. R., Warren, W. C., & Hahn, M. W. (2014). Extensive error in the number of genes inferred from draft genome assemblies. PLoS Computational Biology, 10(12), e1003998. https://doi.org/10.1371/journal.pcbi.1003998
Dixon, A. F. G. (1977). Aphid ecology: Life cycles, polymorphism, and population regulation. Annual Review of Ecology, Evolution and Systematics, 8, 329–353. https://doi.org/10.1146/annurev.es.08.110177.001553
Douglas, A. (1998). Nutritional interactions in insect-microbial symbioses: Aphids and their symbiotic bacteria Buchnera. Annual Review of Entomology, 43(1), 17–37. https://doi.org/10.1146/annurev.ento.43.1.17
Dudchenko, O., Batra, S. S., Omer, A. D., Nyquist, S. K., Hoeger, M., Durand, N. C., Shamim, M. S., Machol, I., Lander, E. S., Aiden, A. P., & Aiden, E. L. (2017). De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science, 356(6333), 92–95. https://doi.org/10.1126/science.aal3327
Dudchenko, O., Shamim, M. S., Batra, S., Durand, N. C., Musial, N. T., Mostofa, R., Pham, M., St Hilaire, B. G., Yao, W., Stamenova, E., Hoeger, M., Nyquist, S. K., Korchina, V., Pletch, K., Flanagan, J. P., Tomaszewicz, A., McAloose, D., Pérez Estrada, C., Novak, B. J., … Aiden, E. L. (2018). The Juicebox Assembly Tools module facilitates de novo assembly of mammalian genomes with chromosome-length scaffolds for under $1000. BioRxiv, 254797.
Dumbleton, L. J., & Jeffreys, F. (1938). The control of the woolly aphis by Aphelinus mali. Department of Scientific and Industrial Research.
Durand, N. C., Shamim, M. S., Machol, I., Rao, S. S., Huntley, M. H., Lander, E. S., & Aiden, E. L. (2016). Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Systems, 3(1), 95–98. https://doi.org/10.1016/j.cels.2016.07.002
Eastop, V. F. (1966). A taxonomic study of Australian Aphidoidea (Homoptera). Australian Journal of Zoology, 14(3), 399–592. https://doi.org/10.1071/ZO9660399
Emms, D. M., & Kelly, S. (2015). OrthoFinder: Solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biology, 16(1), 157. https://doi.org/10.1186/s13059-015-0721-2
Emms, D. M., & Kelly, S. (2017). STRIDE: Species tree root inference from gene duplication events. Molecular Biology and Evolution, 34(12), 3267–3278. https://doi.org/10.1093/molbev/msx259
Emms, D. M., & Kelly, S. (2019). OrthoFinder: Phylogenetic orthology inference for comparative genomics. Genome Biology, 20(1), 1–14. https://doi.org/10.1186/s13059-019-1832-y
Fernández, R., Marcet-Houben, M., Legeai, F., Richard, G., Robin, S., Wucher, V., Pegueroles, C., Gabaldón, T., & Tagu, D. (2020). Selection following gene duplication shapes recent genome evolution in the pea aphid Acyrthosiphon pisum. Molecular Biology and Evolution, 37(9), 2601–2615. https://doi.org/10.1093/molbev/msaa110
Finn, R. D., Clements, J., & Eddy, S. R. (2011). HMMER web server: Interactive sequence similarity searching. Nucleic Acids Research, 39(suppl_2), W29–W37. https://doi.org/10.1093/nar/gkr367
Finston, T. L., Hebert, P. D., & Foottit, R. B. (1995). Genome size variation in aphids. Insect Biochemistry and Molecular Biology, 25(2), 189–196. https://doi.org/10.1016/0965-1748(94)00050-R
Gautam, D., & Verma, L. (1982). Karyotype analysis and mitotic cycle of woolly apple aphid (Eriosoma lanigerum Hausmann). Entomon, 7, 167–171.
Gautam, D., & Verma, L. (1983). Chromosome numbers in different morphs of the woolly apple aphid, Eriosoma lanigerum Hausmann. Chromosome Information Service, 35, 9–10.
Gentleman, R. C., Carey, V. J., Bates, D. M., Bolstad, B., Dettling, M., Dudoit, S., Ellis, B., Gautier, L., Ge, Y., Gentry, J., Hornik, K., Hothorn, T., Huber, W., Iacus, S., Irizarry, R., Leisch, F., Li, C., Maechler, M., Rossini, A. J., … Zhang, J. (2004). Bioconductor: Open software development for computational biology and bioinformatics. Genome Biology, 5(10), R80. https://doi.org/10.1186/gb-2004-5-10-r80
Hansen, A. K., & Moran, N. A. (2011). Aphid genome expression reveals host–symbiont cooperation in the production of amino acids. Proceedings of the National Academy of Sciences of the United States of America, 108(7), 2849–2854. https://doi.org/10.1073/pnas.1013465108
Hoff, K. J., Lange, S., Lomsadze, A., Borodovsky, M., & Stanke, M. (2016). BRAKER1: Unsupervised RNA-Seq-based genome annotation with GeneMark-ET and AUGUSTUS. Bioinformatics, 32(5), 767–769. https://doi.org/10.1093/bioinformatics/btv661
Hoff, K. J., Lomsadze, A., Borodovsky, M., & Stanke, M. (2019). Whole-genome annotation with BRAKER. In M. Kollmar (Ed.), Gene prediction (pp. 65–95). New York, NY: Springer. https://doi.org/10.1007/978-1-4939-9173-0_5
Hunter, S., Jones, P., Mitchell, A., Apweiler, R., Attwood, T. K., Bateman, A., Bernard, T., Binns, D., Bork, P., Burge, S., de Castro, E., Coggill, P., Corbett, M., Das, U., Daugherty, L., Duquenne, L., Finn, R. D., Fraser, M., Gough, J., … Yong, S.-Y. (2012). InterPro in 2011: New developments in the family and domain prediction database. Nucleic Acids Research, 40(D1), D306–D312. https://doi.org/10.1093/nar/gkr948
International Aphid Genomics Consortium (2010). Genome sequence of the pea aphid Acyrthosiphon pisum. PLoS Biology, 8(2), e1000313. https://doi.org/10.1371/journal.pbio.1000313
Jackman, S. D., Coombe, L., Chu, J., Warren, R. L., Vandervalk, B. P., Yeo, S., Xue, Z., Mohamadi, H., Bohlmann, J., Jones, S. J. M., & Birol, I. (2018). Tigmint: Correcting assembly errors using linked reads from large molecules. BMC Bioinformatics, 19(1), 1–10. https://doi.org/10.1186/s12859-018-2425-6
Julca, I., Marcet-Houben, M., Cruz, F., Vargas-Chavez, C., Johnston, J. S., Gómez-Garrido, J., Frias, L., Corvelo, A., Loska, D., Cámara, F., Gut, M., Alioto, T., Latorre, A., & Gabaldón, T. (2020). Phylogenomics identifies an ancestral burst of gene duplications predating the diversification of Aphidomorpha. Molecular Biology and Evolution, 37(3), 730–756. https://doi.org/10.1093/molbev/msz261
Jurka, J., Kapitonov, V. V., Pavlicek, A., Klonowski, P., Kohany, O., & Walichiewicz, J. (2005). Repbase Update, a database of eukaryotic repetitive elements. Cytogenetic and Genome Research, 110(1–4), 462–467. https://doi.org/10.1159/000084979
Katoh, K., & Standley, D. M. (2013). MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Molecular Biology and Evolution, 30(4), 772–780. https://doi.org/10.1093/molbev/mst010
Kim, D., Langmead, B., & Salzberg, S. L. (2015). HISAT: A fast spliced aligner with low memory requirements. Nature Methods, 12(4), 357–360. https://doi.org/10.1038/nmeth.3317
King, G. J., Alston, F., Battle, I., Chevreau, E., Gessler, C., Janse, J., Lindhout, P., Manganaris, A. G., Sansavini, S., Schmidt, H., & Tobutt, K. (1991). The ‘European Apple Genome Mapping Project’-developing a strategy for mapping genes coding for agronomic characters in tree species. Euphytica, 56(1), 89–94.
Kulkarni, P. (1984). Chromosomes of seven species of aphids (Homoptera: Aphididae). Bulletin of the Zoological Survey of India, 6, 267–270.
Kumar, S., Jones, M., Koutsovoulos, G., Clarke, M., & Blaxter, M. (2013). Blobology: Exploring raw genome data for contaminants, symbionts and parasites using taxon-annotated GC-coverage plots. Frontiers in Genetics, 4, 237. https://doi.org/10.3389/fgene.2013.00237
Laetsch, D. R., & Blaxter, M. L. (2017). BlobTools: Interrogation of genome assemblies. F1000Research, 6(1287), 1287. https://doi.org/10.12688/f1000research.12232.1
Li, H. (2013). Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. ArXiv Preprint, ArXiv:1303.3997.
Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., & Durbin, R. (2009). The sequence alignment/map format and SAMtools. Bioinformatics, 25(16), 2078–2079. https://doi.org/10.1093/bioinformatics/btp352
Li, X., Jiang, L., & Qiao, G. (2014). Is the subfamily Eriosomatinae (Hemiptera: Aphididae) monophyletic? Turkish Journal of Zoology, 38(3), 285–297. https://doi.org/10.3906/zoo-1303-15
Li, Y., Zhang, B., & Moran, N. A. (2020). The Aphid X chromosome is a dangerous place for functionally important genes: Diverse evolution of hemipteran genomes based on chromosome-level assemblies. Molecular Biology and Evolution, 37(8), 2357–2368. https://doi.org/10.1093/molbev/msaa095
Li, Y., Park, H., Smith, T. E., & Moran, N. A. (2019). Gene family evolution in the pea aphid based on chromosome-level genome assembly. Molecular Biology and Evolution, 36(10), 2143–2156. https://doi.org/10.1093/molbev/msz138
Lieberman-Aiden, E., van Berkum, N. L., Williams, L., Imakaev, M., Ragoczy, T., Telling, A., Amit, I., Lajoie, B. R., Sabo, P. J., Dorschner, M. O., Sandstrom, R., Bernstein, B., Bender, M. A., Groudine, M., Gnirke, A., Stamatoyannopoulos, J., Mirny, L. A., Lander, E. S., & Dekker, J. (2009). Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science, 326(5950), 289–293. https://doi.org/10.1126/science.1181369
Lomsadze, A., Burns, P. D., & Borodovsky, M. (2014). Integration of mapped RNA-Seq reads into automatic training of eukaryotic gene finding algorithm. Nucleic Acids Research, 42(15), e119. https://doi.org/10.1093/nar/gku557
Manzano-Marín, A., & Latorre, A. (2016). Snapshots of a shrinking partner: Genome reduction in Serratia symbiotica. Scientific Reports, 6(1), 32590. https://doi.org/10.1038/srep32590
Mapleson, D., Garcia Accinelli, G., Kettleborough, G., Wright, J., & Clavijo, B. J. (2017). KAT: A K-mer analysis toolkit to quality control NGS datasets and genome assemblies. Bioinformatics, 33(4), 574–576. https://doi.org/10.1093/bioinformatics/btw663
Marçais, G., Delcher, A. L., Phillippy, A. M., Coston, R., Salzberg, S. L., & Zimin, A. (2018). MUMmer4: A fast and versatile genome alignment system. PLoS Computational Biology, 14(1), e1005944. https://doi.org/10.1371/journal.pcbi.1005944
Mathers, T. C. (2020). Improved Genome Assembly and Annotation of the Soybean Aphid (Aphis glycines Matsumura). G3: Genes, Genomes, Genetics, 10(3), g3.400954.2019. https://doi.org/10.1534/g3.119.400954
Mathers, T. C., Chen, Y., Kaithakottil, G., Legeai, F., Mugford, S. T., Baa-Puyoulet, P., Bretaudeau, A., Clavijo, B., Colella, S., Collin, O., Dalmay, T., Derrien, T., Feng, H., Gabaldón, T., Jordan, A., Julca, I., Kettles, G. J., Kowitwanich, K., Lavenier, D., … Hogenhout, S. A. (2017). Rapid transcriptional plasticity of duplicated gene clusters enables a clonally reproducing aphid to colonise diverse plant species. Genome Biology, 18(1), 27. https://doi.org/10.1186/s13059-016-1145-3
Mathers, T. C., Mugford, S. T., Hogenhout, S. A. T., & Tripathi, L. (2020). Genome sequence of the banana aphid, Pentalonia nigronervosa Coquerel (Hemiptera: Aphididae) and its symbionts. BioRxiv, https://doi.org/10.1101/2020.04.25.060517
Mathers, T. C., Wouters, R. H. M., Mugford, S. T., Swarbreck, D., Van Oosterhout, C., & Hogenhout, S. A. (2020). Chromosome-scale genome assemblies of aphids reveal extensively rearranged autosomes and long-term conservation of the X chromosome. BioRxiv, https://doi.org/10.1101/2020.03.24.006411
Moran, N. A., Russell, J. A., Koga, R., & Fukatsu, T. (2005). Evolutionary relationships of three new species of Enterobacteriaceae living as symbionts of aphids and other insects. Applied and Environmental Microbiology, 71(6), 3302–3310. https://doi.org/10.1128/AEM.71.6.3302-3310.2005
Nicholson, S. J., Nickerson, M. L., Dean, M., Song, Y., Hoyt, P. R., Rhee, H., Kim, C., & Puterka, G. J. (2015). The genome of Diuraphis noxia, a global aphid pest of small grains. BMC Genomics, 16(1), 1–16. https://doi.org/10.1186/s12864-015-1525-1
Nováková, E., Hypša, V., Klein, J., Foottit, R. G., von Dohlen, C. D., & Moran, N. A. (2013). Reconstructing the phylogeny of aphids (Hemiptera: Aphididae) using DNA of the obligate symbiont Buchnera aphidicola. Molecular Phylogenetics and Evolution, 68(1), 42–54. https://doi.org/10.1016/j.ympev.2013.03.016
Ortiz-Rivas, B., & Martínez-Torres, D. (2010). Combination of molecular data support the existence of three main lineages in the phylogeny of aphids (Hemiptera: Aphididae) and the basal position of the subfamily Lachninae. Molecular Phylogenetics and Evolution, 55(1), 305–317. https://doi.org/10.1016/j.ympev.2009.12.005
Price, M. N., Dehal, P. S., & Arkin, A. P. (2009). FastTree: Computing large minimum evolution trees with profiles instead of a distance matrix. Molecular Biology and Evolution, 26(7), 1641–1650. https://doi.org/10.1093/molbev/msp077
Price, M. N., Dehal, P. S., & Arkin, A. P. (2010). FastTree 2 – Approximately maximum-likelihood trees for large alignments. PLoS One, 5(3), e9490. https://doi.org/10.1371/journal.pone.0009490
Punta, M., Coggill, P. C., Eberhardt, R. Y., Mistry, J., Tate, J., Boursnell, C., Pang, N., Forslund, K., Ceric, G., Clements, J., Heger, A., Holm, L., Sonnhammer, E. L. L., Eddy, S. R., Bateman, A., & Finn, R. D. (2012). The Pfam protein families database. Nucleic Acids Research, 40(Database issue), D290–D301. https://doi.org/10.1093/nar/gkr1065
Quevillon, E., Silventoinen, V., Pillai, S., Harte, N., Mulder, N., Apweiler, R., & Lopez, R. (2005). InterProScan: Protein domains identifier. Nucleic Acids Research, 33(suppl_2), W116–W120. https://doi.org/10.1093/nar/gki442
Robinson, A., & Chen, Y.-H. (1969). Cytotaxonomy of Aphididae. Canadian Journal of Zoology, 47(4), 511–516. https://doi.org/10.1139/z69-090
Rock, G. C., & Zeiger, D. C. (1974). Woolly apple aphid infests Malling and Malling-Merton rootstocks in propagation beds in North Carolina. Journal of Economic Entomology, 67(1), 137–138. https://doi.org/10.1093/jee/67.1.137a
Sandanayaka, W. R. M., Bus, V. G. M., Connolly, P., & Newcomb, R. (2003). Characteristics associated with woolly apple aphid Eriosoma lanigerum, resistance of three apple rootstocks. Entomologia Experimentalis et Applicata, 109(1), 63–72. https://doi.org/10.1046/j.1570-7458.2003.00095.x
Shen, W., Le, S., Li, Y., & Hu, F. (2016). SeqKit: A cross-platform and ultrafast toolkit for FASTA/Q file manipulation. PLoS One, 11(10), e0163962. https://doi.org/10.1371/journal.pone.0163962
Shigenobu, S., & Wilson, A. C. (2011). Genomic revelations of a mutualism: The pea aphid and its obligate bacterial symbiont. Cellular and Molecular Life Sciences, 68(8), 1297–1309. https://doi.org/10.1007/s00018-011-0645-2
Staniland, L. (1924). The immunity of apple stocks from attacks of woolly aphis (Eriosoma lanigerum, Hausmann). Part II. The causes of the relative resistance of the stocks. Bulletin of Entomological Research, 15(2), 157–170. https://doi.org/10.1017/S0007485300031527
Stanke, M., Diekhans, M., Baertsch, R., & Haussler, D. (2008). Using native and syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics, 24(5), 637–644. https://doi.org/10.1093/bioinformatics/btn013
Tarailo-Graovac, M., & Chen, N. (2009). Using RepeatMasker to identify repetitive elements in genomic sequences. Current Protocols in Bioinformatics, 25(1), 4–10. https://doi.org/10.1002/0471250953.bi0410s25
Theobald, F. V. (1920). The woolly aphid of the apple and elm. Journal of Pomology and Horticultural Science, 2(2), 73–92. https://doi.org/10.1080/03683621.1920.11513236
Thorpe, P., Escudero-Martinez, C. M., Cock, P. J. A., Eves-van den Akker, S., & Bos, J. I. B. (2018). Shared transcriptional control and disparate gain and loss of aphid parasitism genes. Genome Biology and Evolution, 10(10), 2716–2733. https://doi.org/10.1093/gbe/evy183
Trapnell, C., Williams, B. A., Pertea, G., Mortazavi, A., Kwan, G., van Baren, M. J., Salzberg, S. L., Wold, B. J., & Pachter, L. (2010). Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nature Biotechnology, 28(5), 511–515. https://doi.org/10.1038/nbt.1621
Wang, Y., Tang, H., DeBarry, J. D., Tan, X., Li, J., Wang, X., Lee, T.-H., Jin, H., Marler, B., Guo, H., Kissinger, J. C., & Paterson, A. H. (2012). MCScanX: A toolkit for detection and evolutionary analysis of gene synteny and collinearity. Nucleic Acids Research, 40(7), e49. https://doi.org/10.1093/nar/gkr1293
Waring, J. (1920). The probable value of trunk circumference as an adjunct to fruit yield in interpreting apple orchard experiments. American Society of Horticultural Science, 17, 179–185.
Waterhouse, R. M., Seppey, M., Simão, F. A., Manni, M., Ioannidis, P., Klioutchnikov, G., Kriventseva, E. V., & Zdobnov, E. M. (2018). BUSCO applications from quality assessments to gene prediction and phylogenomics. Molecular Biology and Evolution, 35(3), 543–548. https://doi.org/10.1093/molbev/msx319
Weisenfeld, N. I., Kumar, V., Shah, P., Church, D. M., & Jaffe, D. B. (2017). Direct determination of diploid genome sequences. Genome Research, 27(5), 757–767. https://doi.org/10.1101/gr.214874.116
Wenger, J. A., Cassone, B. J., Legeai, F., Johnston, J. S., Bansal, R., Yates, A. D., Coates, B. S., Pavinato, V. A. C., & Michel, A. (2016). Whole genome sequence of the soybean aphid, Aphis glycines. Insect Biochemistry and Molecular Biology, 123, 102917. https://doi.org/10.1016/j.ibmb.2017.01.005
Yandell, M., & Ence, D. (2012). A beginner’s guide to eukaryotic genome annotation. Nature Reviews Genetics, 13(5), 329–342. https://doi.org/10.1038/nrg3174
Yeo, S., Coombe, L., Warren, R. L., Chu, J., & Birol, I. (2018). ARCS: Scaffolding genome drafts with linked reads. Bioinformatics, 34(5), 725–731. https://doi.org/10.1093/bioinformatics/btx675

Volume 21, Issue 1, January 2021, Pages 316-326

Filename	Description
men13258-sup-0001-FigS1-S3.docx (Word document, 23.6 MB)	Fig S1-S3
men13258-sup-0002-TableS1-S10.xlsx (Excel spreadsheet, 1.6 MB)	Table S1-S10

A chromosome-level genome assembly of the woolly apple aphid, Eriosoma lanigerum Hausmann (Hemiptera: Aphididae)
