Volume 21, Issue 8 pp. 1642-1658
Research Article
Open Access

Genomic analyses provide insights into the polyploidization-driven herbicide adaptation in Leptochloa weeds

Ke Chen

State Key Laboratory of Hybrid Rice, Hunan Academy of Agricultural Sciences, Changsha, China

Key Laboratory of Indica Rice Genetics and Breeding in the Middle and Lower Reaches of Yangtze River Valley, Ministry of Agriculture and Rural Affairs, Hunan Rice Research Institute, Hunan Academy of Agricultural Sciences, Changsha, China

Huangpu Research Institute of Longping Agricultural Science and Technology, Guangzhou, China

Longping Branch, College of Biology, Hunan University, Changsha, China

Hunan Weed Science Key Laboratory, Hunan Agricultural Biotechnology Research Institute, Hunan Academy of Agricultural Sciences, Changsha, China

These authors contributed equally to this work.

Haona Yang

State Key Laboratory of Hybrid Rice, Hunan Academy of Agricultural Sciences, Changsha, China

Huangpu Research Institute of Longping Agricultural Science and Technology, Guangzhou, China

Hunan Weed Science Key Laboratory, Hunan Agricultural Biotechnology Research Institute, Hunan Academy of Agricultural Sciences, Changsha, China

These authors contributed equally to this work.

Yajun Peng

State Key Laboratory of Hybrid Rice, Hunan Academy of Agricultural Sciences, Changsha, China

Hunan Weed Science Key Laboratory, Hunan Agricultural Biotechnology Research Institute, Hunan Academy of Agricultural Sciences, Changsha, China

Ducai Liu

Hunan Weed Science Key Laboratory, Hunan Agricultural Biotechnology Research Institute, Hunan Academy of Agricultural Sciences, Changsha, China

Jingyuan Zhang

Qingdao Kingagroot Compounds Co. Ltd, Qingdao, China

Zhenghong Zhao

State Key Laboratory of Hybrid Rice, Hunan Academy of Agricultural Sciences, Changsha, China

Key Laboratory of Indica Rice Genetics and Breeding in the Middle and Lower Reaches of Yangtze River Valley, Ministry of Agriculture and Rural Affairs, Hunan Rice Research Institute, Hunan Academy of Agricultural Sciences, Changsha, China

Huangpu Research Institute of Longping Agricultural Science and Technology, Guangzhou, China

Longping Branch, College of Biology, Hunan University, Changsha, China

Lamei Wu

Hunan Weed Science Key Laboratory, Hunan Agricultural Biotechnology Research Institute, Hunan Academy of Agricultural Sciences, Changsha, China

Tao Lin

State Key Laboratory of Agrobiotechnology, Beijing Key Laboratory of Growth and Developmental Regulation for Protected Vegetable Crops, College of Horticulture, China Agricultural University, Beijing, China

Lianyang Bai

Corresponding Author

State Key Laboratory of Hybrid Rice, Hunan Academy of Agricultural Sciences, Changsha, China

Key Laboratory of Indica Rice Genetics and Breeding in the Middle and Lower Reaches of Yangtze River Valley, Ministry of Agriculture and Rural Affairs, Hunan Rice Research Institute, Hunan Academy of Agricultural Sciences, Changsha, China

Huangpu Research Institute of Longping Agricultural Science and Technology, Guangzhou, China

Longping Branch, College of Biology, Hunan University, Changsha, China

Hunan Weed Science Key Laboratory, Hunan Agricultural Biotechnology Research Institute, Hunan Academy of Agricultural Sciences, Changsha, China

Correspondence (Tel +86 13908478453; Fax 0731-84691624; email [email protected] (LB) and Tel +86 13574848933; Fax 0731-84691284; email [email protected] (LW))

Lifeng Wang

Corresponding Author

State Key Laboratory of Hybrid Rice, Hunan Academy of Agricultural Sciences, Changsha, China

Key Laboratory of Indica Rice Genetics and Breeding in the Middle and Lower Reaches of Yangtze River Valley, Ministry of Agriculture and Rural Affairs, Hunan Rice Research Institute, Hunan Academy of Agricultural Sciences, Changsha, China

Huangpu Research Institute of Longping Agricultural Science and Technology, Guangzhou, China

Longping Branch, College of Biology, Hunan University, Changsha, China

Hunan Weed Science Key Laboratory, Hunan Agricultural Biotechnology Research Institute, Hunan Academy of Agricultural Sciences, Changsha, China

First published: 08 May 2023

Summary

Polyploidy confers a selective advantage under stress conditions; however, whether polyploidization mediates enhanced herbicide adaptation remains largely unknown. Tetraploid Leptochloa chinensis is a notorious weed in the rice ecosystem, causing severe yield loss in rice. In China, L. chinensis has only one sister species, the diploid L. panicea, whose damage is rarely reported. To gain insights into the effects of polyploidization on herbicide adaptation, we first assembled a high-quality genome of L. panicea and identified genomic structural variations relative to L. chinensis. Moreover, we identified herbicide-resistance genes specifically expanded in L. chinensis, which may confer greater herbicide adaptability in L. chinensis. Analysis of gene retention and loss showed that five herbicide target-site genes and several herbicide nontarget-site resistance gene families were retained during polyploidization. Notably, we identified three pairs of polyploidization-retained genes including LcABCC8, LcCYP76C1 and LcCYP76C4 that may enhance herbicide resistance. More importantly, we found that both copies of LcCYP76C4 were under herbicide selection during the spread of L. chinensis in China. Furthermore, we identified another gene potentially involved in herbicide resistance, LcCYP709B2, which was also retained during polyploidization and is under selection. This study provides insights into the genomic basis of the enhanced herbicide adaptability of Leptochloa weeds during polyploidization and provides guidance for the precise and efficient control of polyploid weeds.

Introduction

Polyploidy is commonly found in plants. In addition to many important crops such as wheat (Triticum aestivum) (Brenchley et al., 2012), cotton (Gossypium hirsutum) (Zhang et al., 2015) and rapeseed (Brassica napus) (Chalhoub et al., 2014), some important weeds such as Leptochloa chinensis (Wang et al., 2022), barnyard grass (Echinochloa crus-galli) (Guo et al., 2017) and Shepherd's purse (Capsella bursa-pastoris) (Kasianov et al., 2017) are also polyploid. Polyploidy not only plays an important role in plant genome evolution and species diversification, but also increases the adaptive plasticity of plants to extreme environments because of their increased genetic variation and the buffering effect of their duplicated genes (Adams and Wendel, 2005; Chao et al., 2013; Doyle and Coate, 2019; Freeling et al., 2015; Guo et al., 2017; Meimberg et al., 2009; Te Beest et al., 2012; Van de Peer et al., 2009, 2017; Wendel, 2015). It has been shown that plant polyploidy has effects on both biotic and abiotic stress responses. Previous studies found that tetraploid garden impatiens (Impatiens walleriana) showed improved resistance to downy mildew (Plasmopara obducens) relative to its diploid counterparts (Wang et al., 2018), and tetraploid Livingstone potato (Plectranthus esculentus) was more resistant to root-knot nematodes than diploids (Hannweg et al., 2016). In terms of abiotic stresses, it has been shown that tetraploid Arabidopsis plants exhibit increased salt tolerance compared with diploids (Chao et al., 2013). Moreover, both tetraploid rice (Oryza sativa) and citrange (Citrus sinensis L. Osb. × Poncirus trifoliata L. Raf.) have an increased tolerance to salt and drought stresses as a result of whole-genome duplication (Ruiz et al., 2016; Yang et al., 2014). However, whether polyploidy mediates enhanced herbicide adaptation remains largely unknown.

Weeds are among the most resilient survivors on the planet: they can effectively evade human control and adapt readily to their environment (Sharma et al., 2021). Leptochloa chinensis is one of the most notorious weeds in rice ecosystems and has recently become the major weed in direct-seeded rice fields in China, which account for ~21% of the total rice production area (Chakraborty et al., 2017). Leptochloa chinensis has strong environmental adaptability and adaptive plasticity, with many biotypes evolving tolerance to herbicides including cyhalofop-butyl and metamifop, two commonly used herbicides for the control of L. chinensis in rice fields (Chen et al., 2021; Peng et al., 2020; Yu et al., 2017; Zhang et al., 2021). However, because suitable closely related plant models are lacking, few studies have addressed the origin of its herbicide-adaptive characteristics. The genus Leptochloa contains around 29 species (http://www.plantsoftheworldonline.org/taxon/urn:lsid:ipni.org:names:18378-1), but only two have been found in China, L. chinensis and L. panicea. Leptochloa chinensis is a tetraploid (2n = 4× = 40), while Leptochloa panicea is a diploid species (2n = 2× = 20). Leptochloa chinensis and L. panicea display very similar morphological characteristics (Figure S1). Both species also closely resemble rice at the seedling stage and have a life cycle similar to that of rice. Leptochloa chinensis has strong environmental adaptability, especially herbicide adaptation, in paddy fields and can cause great harm to rice production, while damage caused by L. panicea is rarely reported.

Compared with crops, genome studies of weeds still lag far behind. In recent years, with advances in sequencing technologies and the increasingly serious harm caused by weeds, the genomes of nearly 20 weed species have been assembled (Sharma et al., 2021), but only a few have reached the chromosome level, which leaves many of the genetic mechanisms associated with weed adaptation unresolved. In this study, we assembled the chromosome-level genome of L. panicea, the sister diploid species of L. chinensis, to examine the mechanisms of herbicide adaptation and patterns of genome evolution during polyploidization. We performed a comparative genomic study to characterize the origin and evolutionary history of Leptochloa weeds, identified subgenome-shared and subgenome-specific genomic structural variants (SVs), and found a shared inversion that may mediate plant defence responses. Next, we performed gene family expansion and contraction analysis, surveyed the gene synteny retention pattern during polyploidization, and found three pairs of polyploidization-retained genes, LcCYP76C1, LcCYP76C4 and LcABCC8, that may confer herbicide resistance. Moreover, we found that LcCYP76C4 was under herbicide selection during the malignant spread of L. chinensis in China and accordingly screened for another polyploidization-retained gene, LcCYP709B2, that was also under selection and could potentially confer herbicide resistance in the plant. This study reveals the genomic basis of the polyploidization-driven herbicide adaptation and provides a solid foundation for future development of novel or improved strategies for efficient management of L. chinensis and other polyploid weeds.

Results

Genome assembly and annotation of L. panicea

Based on k-mer frequency analysis, the L. panicea genome had an estimated size of about 286.51 Mb and a heterozygosity level of 0.047% (Figure S2). The estimated genome size was close to the 280 Mb determined using flow cytometry. We generated a total of 32.73 Gb (133×) PacBio HiFi sequences with a read N50 length of 16.60 kb (Table S1, Figure S3). These HiFi reads were de novo assembled into 348 contigs with an N50 size of 21.96 Mb (Table S2). Using Hi-C data of approximately 97× coverage (Table S1), a total of 235.12 Mb of assembled contigs were clustered into 10 pseudomolecules with sizes ranging from 15.55 Mb to 32.52 Mb (Figures 1a, S4), of which 7 and 3 had telomeric sequences (5′-TTTAGGG-3′) at both and single ends, respectively (Table S3). Evaluation using BUSCO (Simão et al., 2015) indicated that 98.7% of plant conserved orthologs were fully captured by the L. panicea assembly (Table S4). Illumina paired-end reads were mapped back to the assembly, resulting in an overall alignment rate of 99.23%. The LAI score of the assembly was 17.02 (Figure S5). Together, these results suggested the high quality of the L. panicea genome assembly (Table 1).
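The contig N50 reported above is the contig length at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly length. A minimal sketch of that calculation follows; the contig lengths are toy values chosen for illustration and are not taken from the L. panicea assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are sorted from longest to shortest and accumulated until
    the running total reaches at least half of the summed length.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("lengths must be non-empty")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (arbitrary units), for illustration only.
toy_contigs = [10, 8, 6, 4, 2]
print(n50(toy_contigs))
```

In practice the lengths would be read from the assembly FASTA index rather than typed in by hand.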

Figure 1. Comparison of the L. panicea, L. chinensis and O. thomaeum genomes. (a) Circos plot of the L. panicea genome. The outermost layer of blocks is a circular representation of the 10 pseudochromosomes, followed by line plots of GC content, gene density, repeat content, LTR content, INDEL density (with At) and INDEL density (with Bt) in 100-kb non-overlapping windows. (b) Genome collinearity between L. panicea, L. chinensis and O. thomaeum. (c, d) Macrosynteny between L. panicea and L. chinensis (c) and between L. panicea and O. thomaeum (d).
Table 1. Statistics of genome assembly

Assembly feature
  Contig N50                              21.96 Mb
  Contig number                           348
  Assembled genome size                   235.12 Mb
  Chromosome number                       10
  BUSCO coverage                          98.70%
  LAI assembly index                      17.02
Gene models
  Number of gene models                   33 481
  Mean coding sequence length             1270 bp
  Mean number of exons per transcript     5.2
  Mean exon length                        320 bp
  Mean intron length                      388 bp
Non-protein-coding genes
  Number of miRNA genes                   114
  Number of tRNA genes                    2751
  Number of rRNA genes                    8114
  Number of snoRNA genes                  362
  Number of snRNA genes                   95

A total of 33 481 gene models were predicted in the L. panicea genome, of which about 99.36% (33 267) were assigned to the 10 chromosomes (Table S5). In addition, 2751 tRNA genes, 8114 rRNA genes, 114 microRNA genes (miRNAs), 95 small nuclear RNA genes (snRNAs) and 362 small nucleolar RNA genes (snoRNAs) were predicted in the L. panicea genome (Table 1). BUSCO assessment indicated that 99.2% of plant conserved orthologs were completely covered by the predicted genes (Table S4).

Among the 33 481 predicted proteins, 30 842 (92.12%) were annotated by GenBank nr, 28 957 (86.49%) by eggNOG mapper, 20 638 (61.64%) by Swissprot, 11 023 (32.95%) by the KEGG database and 13 780 (41.16%) were assigned with Gene Ontology (GO) terms. The average GC content of the CDSs of L. panicea (56.06%) was similar to that of L. chinensis (56.90%), O. sativa (55.19%), Sorghum bicolor (55.51%) and Setaria viridis (56.01%), but higher than that of Arabidopsis thaliana (44.17%) (Table S6). The GC content and GC3s (GC of silent 3rd codon position) of CDSs showed a bimodal distribution in L. panicea, consistent with those found in other grasses (Figures S6, S7).
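To make the two composition measures above concrete, the sketch below computes the overall GC content of a coding sequence and its GC3s. It assumes the widely used definition of GC3s (G or C at the third position of synonymously variable codons, i.e. excluding Met, Trp and stop codons); the tool and exact settings used for the reported values are not stated in this section, and the example sequence is invented.

```python
# Codons with no synonymous alternative (Met, Trp) plus the three stop codons.
NON_SYNONYMOUS_VARIABLE = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def gc_content(cds):
    """Fraction of G or C bases over the whole coding sequence."""
    cds = cds.upper()
    if not cds:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in cds) / len(cds)


def gc3s(cds):
    """Fraction of G or C at the third position of synonymous codons."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    informative = [c for c in codons if c not in NON_SYNONYMOUS_VARIABLE]
    if not informative:
        raise ValueError("no synonymously variable codons")
    return sum(c[2] in "GC" for c in informative) / len(informative)


# Invented toy CDS: ATG GCC AAA TGG TAA
toy_cds = "ATGGCCAAATGGTAA"
print(gc_content(toy_cds), gc3s(toy_cds))
```

Applying both functions to every CDS and plotting the resulting histograms is one way to inspect a distribution for bimodality.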

In the L. panicea genome assembly, we identified 66.24 Mb (28.21%) repetitive sequences (Table S7). The long-terminal repeat retrotransposon (LTR-RT) was the most abundant type of repetitive sequences in L. panicea, spanning 21.11 Mb (8.98%) of the genome. A total of 226 intact LTR-RTs were classified, including 129 Gypsy-type and 74 Copia-type LTRs. The largest LTR-RT superfamily Gypsy, comprising ~6.34% of the genome, was concentrated near the putative centromeres (Figure S8A). The other superfamily of LTR-RT, Copia, comprised ~1.37% of the genome (Figure S8B). Other interspersed repeats such as LINEs (Long Interspersed Nuclear Elements) and SINEs (Short Interspersed Nuclear Elements) occupied 1.86% and 0.08% of the genome, respectively (Figure S8C,D).

In addition, centromere regions were identified in nine of the 10 L. panicea chromosomes (except chromosome 6) using Tandem Repeat Finder (Benson, 1999) (Figure S9). Furthermore, we detected the top eight TE subfamilies, including four LTR/Gypsy, two unknown, one LINE/L1 and one LTR/cassandra type subfamilies, which together comprised over 10.65% of the L. panicea genome (Figure S10). Density of these TE subfamilies along the 10 chromosomes showed that only the longest unknown-type rnd-3_family-356 subfamily (6.44% of the genome) was enriched near centromeres but absent from the rest of the genome. These results are consistent with results from Tandem Repeat Finder and complement the identification of the centromere position on chromosome 6 (Figure S9). Overall, we predicted potential centromeric regions on all 10 chromosomes.

Comparison between the L. panicea and L. chinensis genomes

The high-quality genome of L. panicea provides an opportunity to compare the two subgenomes (At and Bt) of L. chinensis (Wang et al., 2022) with the L. panicea genome and thereby evaluate the effect of polyploidization in Leptochloa weeds. A total of 174.9 Mb of syntenic regions between L. panicea and At, and 165.85 Mb between L. panicea and Bt, were identified (Figure 1b, Table S8). Every chromosome of L. panicea was largely collinear with two homologous chromosomes of L. chinensis (Figure 1c), consistent with the diploid and tetraploid nature of L. panicea and L. chinensis, respectively. Likewise, each chromosome of L. panicea was largely collinear with one chromosome of O. thomaeum, a diploid species in Chloridoideae (Figure 1d).

SVs in gene body and promoter regions can impact gene functions and expression. We found that 6402 predicted genes in L. panicea and 5909 in At had at least one indel in their gene coding sequences or promoter regions, with 3792 in L. panicea and 3333 in At having indels in their coding sequences (Figure S11). In addition, 6144 predicted genes in L. panicea and 5662 in Bt had at least one indel in their gene body or promoter regions, with 3648 in L. panicea and 3271 in Bt having indels in their coding sequences (Figure S12). To explore whether structural variants (SVs) show a subgenome bias, we compared the two SV sets and found that 175 033 SVs existed only between L. panicea and At, 162 162 SVs existed only between L. panicea and Bt, and 53 046 SVs were shared by both At and Bt (Figure 2a). Moreover, we found that 9.09% of the SV sequences between L. panicea and At and 6.99% of the SV sequences between L. panicea and Bt were gypsy-like retrotransposons, compared to 6.34% of the entire genome, and 2.10% of the SV sequences between L. panicea and At and 1.74% of the SV sequences between L. panicea and Bt were copia-like retrotransposons, compared to 1.37% of the entire genome. The contents of other types of transposable elements were similar between the SV regions and the entire genome (Figure 2b), suggesting that SVs occurred more frequently in genome regions occupied by gypsy-like and copia-like retrotransposons.

Figure 2. Genomic structural variations between L. panicea and L. chinensis. (a) Venn diagrams showing the number of subgenome-specific and shared SVs. (b) Contents of different categories of transposable elements in SV regions and the entire genome of L. panicea. (c) Inversions on chromosome 2 (upper) and chromosome 9 (bottom). Horizontal lines indicate L. panicea (blue), At of L. chinensis (green) and Bt of L. chinensis (purple) chromosomes. Yellow and grey lines indicate the inverted and syntenic regions, respectively. The inversions are supported by Hi-C heatmaps generated from L. chinensis Hi-C reads aligned to the L. panicea genome (middle), while no corresponding inversions are found in the L. panicea Hi-C reads aligned to the L. panicea genome (right). (d) Gene ontology (GO) term enrichment of genes surrounding breakpoint regions of the large inversions on chromosomes 2 (red) and 9 (black).

Chromosomal rearrangements such as inversions and translocations have long been thought to play a critical role in adaptation and speciation (Dvorak et al., 2018). We identified 128 genome rearrangements (translocations and inversions) between L. panicea and At and 138 between L. panicea and Bt, distributed across all 10 chromosomes (Figure S13). Among these, 77 inversions ranging from 2.11 kb to 4.94 Mb were identified between L. panicea and At, and 74 inversions ranging from 2.03 kb to 4.94 Mb were identified between L. panicea and Bt. Interestingly, we found two large inversions on chromosomes 2 and 9 shared by both At and Bt (Figures 2c, S13), which were further supported by Hi-C maps and/or Illumina read mapping (Figures 2c, S14–S17). GO term enrichment analysis of the 200 genes surrounding the breakpoint regions of chromosome 2 showed that biological processes such as ‘heterocycle metabolic process’, ‘DNA recombination’, ‘nucleic acid metabolic process’ and ‘DNA integration’ were significantly enriched (Figure 2d). Genes surrounding the breakpoint regions of chromosome 9 were mainly enriched with GO terms related to plant defence, including ‘defense response to fungus’ and ‘defense response to other organism’ (Figure 2d), which suggests that this inversion may mediate the difference in biotic stress tolerance between the two species.

Currently, it is unclear whether L. chinensis is an autotetraploid or an allotetraploid. In our previous study, we found that the two subgenomes of L. chinensis display neither fractionation bias nor overall gene expression dominance, suggesting a possible autopolyploid origin of L. chinensis (Wang et al., 2022). To further explore this issue, we broke the L. panicea genome into 100-mer fragments and mapped these 100-mers to the two subgenomes of L. chinensis. We found that the two subgenomes showed comparable sequence similarity to the L. panicea genome (Figure S18), supporting an autopolyploid origin of L. chinensis.
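The fragmentation step described above can be sketched as follows. This is an illustration only: it assumes non-overlapping windows and drops a trailing partial fragment, neither of which is specified in this section, and the function names and record naming scheme are hypothetical. The resulting FASTA records would then be mapped to each subgenome with an aligner of choice.

```python
def fragment_sequence(sequence, size=100, step=100):
    """Yield (start, fragment) pairs for every full-length window.

    With step == size the windows do not overlap (an assumption here);
    a trailing piece shorter than `size` is discarded.
    """
    for start in range(0, len(sequence) - size + 1, step):
        yield start, sequence[start:start + size]


def to_fasta_records(chrom_name, sequence, size=100, step=100):
    """Format fragments as FASTA text, naming each by chromosome and offset."""
    lines = []
    for start, fragment in fragment_sequence(sequence, size, step):
        lines.append(f">{chrom_name}:{start}-{start + size}")
        lines.append(fragment)
    return "\n".join(lines)


# Invented 240-bp toy sequence standing in for a chromosome.
toy_chromosome = "ACGT" * 60
fragments = list(fragment_sequence(toy_chromosome))
print(len(fragments), [start for start, _ in fragments])
```

Comparing the fraction of fragments that map to At with the fraction that map to Bt gives a simple per-subgenome similarity summary.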

Genome evolution of Leptochloa weeds

To investigate the genome evolutionary history of Leptochloa weeds, we first calculated Ks values for homologous gene pairs to identify whole-genome duplication (WGD) events. According to the Ks peak, we dated the divergence of L. panicea and L. chinensis to 11.6 million years ago (mya), and the two subgenomes of L. chinensis separated from L. panicea simultaneously (Figure 3a). We found that L. panicea experienced only one ancient WGD event (ρ), shared by members of the grass family, while L. chinensis experienced an additional, more recent WGD event (the tetraploidization event) dating back to ~9.88 mya. These results indicate that the L. chinensis tetraploidization occurred very close to the divergence of L. panicea and L. chinensis, and it is possible that this tetraploidization event led to the speciation of L. chinensis.
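The dating step above converts a Ks peak into a divergence time. A minimal sketch of the commonly used relation T = Ks / (2r) follows. The substitution rate r and the example Ks value are illustrative placeholders, since neither is stated in this section, so the output is not expected to reproduce the 11.6 or 9.88 mya estimates.

```python
def ks_to_mya(ks_peak, subs_per_site_per_year):
    """Convert a Ks peak to a divergence time in million years.

    Uses T = Ks / (2 * r): the two lineages each accumulate synonymous
    substitutions at rate r, so the pairwise distance grows at 2 * r.
    """
    if subs_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    years = ks_peak / (2.0 * subs_per_site_per_year)
    return years / 1e6


# Placeholder inputs, NOT values from the study.
ASSUMED_RATE = 6.5e-9   # synonymous substitutions per site per year
EXAMPLE_KS = 0.13       # hypothetical Ks peak
print(ks_to_mya(EXAMPLE_KS, ASSUMED_RATE))
```

Because the estimate scales inversely with r, the choice of substitution rate matters as much as the position of the Ks peak.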

Figure 3. Whole-genome duplications and gene family expansions and contractions in Leptochloa weeds. (a) Frequency distributions of synonymous substitution rates (Ks) of homologous gene pairs located in the collinearity blocks of L. panicea versus L. panicea, L. chinensis versus L. chinensis and L. panicea versus L. chinensis. Numbers in parentheses indicate peak values of recent L. chinensis WGD and the grass lineage shared ρ WGD. (b) Inferred phylogenetic tree constructed using 549 single-copy orthologous genes shared by the 10 species. Divergence times are indicated in square brackets at the internodes with 95% highest posterior density (HPD). Expanded gene families are indicated in pink pies, while contracted gene families are indicated in green pies. (c) Venn diagram showing shared and unique orthologous groups between L. panicea and L. chinensis. (d) Number of expressed homologous and specific genes in various tissues. (e, f) GO terms in the biological process category enriched in the 3432 expanded gene families of L. chinensis (e) and 3489 expanded gene families of L. panicea (f). Font size correlates to −log10 (P-value). P-values are at bottom right for scale. (g) Sizes of expanded gene families related to stress tolerance in different plant species. Lc-Bt, subgenome B of L. chinensis. Lc-At, subgenome A of L. chinensis. LP, L. panicea. Ot, Oropetium thomaeum. Si, Setaria italica. Sv, Setaria viridis. Zm, Zea mays. Sb, Sorghum bicolor. Os, Oryza sativa. Bd, Brachypodium distachyon. At, Arabidopsis thaliana.

To further investigate the evolutionary relationships between Leptochloa weeds and other grasses, gene family clustering was carried out using Leptochloa weeds, seven other gramineous plants and A. thaliana. A total of 549 single-copy orthologs shared by these plants were identified and used for phylogenetic reconstruction and species divergence time estimation, which showed that O. thomaeum was the closest relative to the Leptochloa weeds, and the divergence of Leptochloa weeds and O. sativa was estimated to occur ~49.3 mya (Figure 3b).

In addition, we identified 21 086 gene families shared by the two Leptochloa weeds. A total of 901 gene families containing 2383 genes were unique to L. panicea, and 1479 gene families containing 2528 genes were unique to L. chinensis (Figure 3c). In L. chinensis, for both shared and specific genes, the highest number were expressed in seeds, followed by roots, stems and leaves. In L. panicea, by contrast, the highest number of genes were expressed in seeds and stems, followed by roots and leaves (Figure 3d).

To detect the expansion and contraction of gene families, we used protein sequences of the aforementioned 10 species and identified 30 737 orthologous groups (gene families). By comparing gene families among these 10 species, 3432 gene families were found to be significantly expanded in L. chinensis and 3489 to be significantly expanded in L. panicea (Figure 3b). Functional analysis of the 3432 expanded gene families in L. chinensis revealed that a large number of them were involved in regulating various metabolic pathways such as ‘regulation of metabolic process’ (Figure 3e). Notably, metabolism is one of the main mechanisms of herbicide nontarget-site resistance (Gaines et al., 2020; Powles and Yu, 2010). However, functional analysis of the expanded gene families of L. panicea revealed no enrichment of such functions (Figure 3f). Next, we performed domain annotation of the 3432 gene families expanded in L. chinensis and identified 35 cytochrome P450, 25 ABC transporter, three glutathione S-transferase, 13 AP2 domain and six GRAS domain-containing gene families that are known to be involved in the regulation of plant abiotic stress responses, in particular herbicide resistance (Figure 3g). These results indicate that expansion of these gene families may have provided a foundation for the adaptation of L. chinensis to the field, especially to fields managed with herbicide applications, whereas L. panicea appears to lack such a basis for adaptation.

Gene loss and gain during the polyploidization process

To gain insights into the effects of polyploidization on herbicide adaptation, we first calculated gene family sizes by identifying protein domains in L. panicea, L. chinensis and S. italica. The results indicated that the sizes of the majority of gene families in L. chinensis were almost twofold those in L. panicea (Figure 4a,c), whereas those in L. panicea and S. italica were almost the same (Figure 4b,c). Furthermore, we calculated the sizes of several stress tolerance-related gene families in eight grasses and found that all of these families were larger in L. chinensis than in L. panicea and O. thomaeum, and that most of them, except NB-ARC and AP2, were larger in L. chinensis than in most other grasses (Figure 4d).

Figure 4. Gene loss and retention during Leptochloa polyploidization. (a, b) Scatter plot of gene family sizes in L. chinensis compared with L. panicea (a) and in S. italica compared with L. panicea (b). (c) Distribution of fold changes of gene family sizes in L. chinensis (left), S. italica (right) compared with L. panicea. (d) Comparison of sizes of abiotic and biotic stress-related gene families (including herbicide-resistance and detoxification related gene families) among Leptochloa species and other grasses. CYP450: cytochrome P450; GST: Glutathione S-transferase; ABC trans: ATP-binding cassette transporter; Glyco: glycosyl transferase. (e) Heatmap of expression profiles of five herbicide target-site resistance genes in various tissues of Leptochloa weeds. Gene expression levels were transformed into z-scores. The blue triangle on the right side of the heatmap represents L. panicea, and the red and yellow pentagrams represent At and Bt subgenomes of L. chinensis, respectively. (f) Synteny retention ratio of nine gene families. Across the genome, 58.18% (indicated by the blue line) of genes fit the 1:(1:1) synteny retention ratio. (g) An example of the loss of an NB-ARC domain-encoding gene and the retention of an herbicide resistance ABCC8 gene during L. chinensis polyploidization. Shading between two segments represents synteny. (h, i) Synteny retention of herbicide resistance genes CYP76C1 (h) and CYP76C4 (i) during L. chinensis polyploidization. Shading between two segments represents synteny. (j–l) Phylogenetic trees of ABCC8 (j), CYP76C1 (k), CYP76C4 (l) and their homologues in Oryza sativa (light green points), Arabidopsis thaliana (purple points), Solanum lycopersicum (dark green points), Zea mays (yellow points), Sorghum bicolor (orange points) and Setaria italica (brown points). (m) Heatmap of expression profiles of herbicide resistance genes in various tissues of Leptochloa weeds. Gene expression levels were transformed into z-scores. The blue triangle on the right side of the heatmap represents L. panicea, and the red and yellow pentagrams represent the At and Bt subgenomes of L. chinensis, respectively.

To identify gene loss and gain during L. chinensis polyploidization, we calculated synteny retention ratios of diploid L. panicea genes in tetraploid L. chinensis (i.e. within a gene family, the percentage of gene members with a 1:(1:1) syntenic relationship among L. panicea and the At and Bt subgenomes of the tetraploid L. chinensis). Across the genome, 58.18% of syntenic genes fit the 1:(1:1) retention ratio, whereas 19.15% and 22.67% fit the 1:(1:0) and 1:(0:1) retention ratios, respectively (Figure S19, Table S9). Herbicide resistance can be divided into target-site and nontarget-site resistance (Powles and Yu, 2010). It has been reported that amplification of herbicide target genes in weed genomes leads to increased herbicide tolerance (Gaines et al., 2010). To investigate whether herbicide targets were amplified and retained during L. chinensis polyploidization, we identified five herbicide target enzymes, including acetyl-CoA carboxylase (ACCase), acetolactate synthase (ALS), 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS), phytoene desaturase (PDS) and protoporphyrinogen oxidase (PPO), in L. panicea and L. chinensis. We found that these five herbicide target enzymes all conformed to the 1:(1:1) synteny retention pattern (Table S10). Further expression analysis indicated that genes encoding these target enzymes in both subgenomes of L. chinensis were expressed at a medium to high level, suggesting that these genes are functional after polyploidization-mediated amplification. The amplification of these target genes during polyploidization may have enhanced the tolerance of L. chinensis to herbicides (Figure 4e, Table S11).
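The 1:(1:1), 1:(1:0) and 1:(0:1) retention classes described above can be tallied with a few lines of code. The sketch below is illustrative: it assumes a hypothetical input that records, for each L. panicea gene, whether a syntenic copy was found in At and in Bt, and it leaves out genes with no syntenic copy in either subgenome, so that the three classes sum to 100% as they do in the reported figures (58.18 + 19.15 + 22.67). The gene identifiers are invented.

```python
from collections import Counter

# Map (syntenic copy in At, syntenic copy in Bt) to a retention class label.
CLASS_LABELS = {
    (True, True): "1:(1:1)",
    (True, False): "1:(1:0)",
    (False, True): "1:(0:1)",
}


def retention_ratios(presence):
    """Return the percentage of genes in each synteny retention class.

    `presence` maps an L. panicea gene ID to a pair of booleans
    (in_At, in_Bt). Genes absent from both subgenomes are skipped.
    """
    counts = Counter()
    for in_at, in_bt in presence.values():
        label = CLASS_LABELS.get((bool(in_at), bool(in_bt)))
        if label is not None:
            counts[label] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * counts[label] / total
            for label in CLASS_LABELS.values()}


# Invented toy input: four L. panicea genes and their syntenic status.
toy_presence = {
    "Lp_gene1": (True, True),
    "Lp_gene2": (True, True),
    "Lp_gene3": (True, False),
    "Lp_gene4": (False, True),
}
print(retention_ratios(toy_presence))
```

Restricting `presence` to the members of one gene family gives a family-level ratio of the kind compared across families.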

Furthermore, we found that synteny retention ratios for most abiotic stress tolerance gene families were higher than 50%, such as ABC transporter (76.14%), GRAS (77.5%), glycosyl transferase (65.52%) and CYP450 (55.63%). However, GSTs reached only 39.53% and AP2 reached 49.15%. Among biotic stress-related gene families, WRKY reached 64.56% and legume lectin reached 55.56%, but NB-ARC reached only 28.30%, indicating an obvious loss of NB-ARC genes (Figure 4f). An example of an NB-ARC gene that deviated from the 1:(1:1) synteny retention ratio is illustrated in Figure 4g. Relative to L. panicea, only one homeologous copy of this NB-ARC gene was retained in L. chinensis (within Bt). Interestingly, we found that the herbicide resistance gene ABCC8 (Pan et al., 2021), a member of the ABC transporter family that can transport glyphosate molecules into vesicles to protect cells from toxicity (Figure 4j), lies within the same genome region as the NB-ARC gene and conformed to a 1:(1:1) synteny retention ratio (Figure 4g,j). In addition, we found that CYP76C1 and CYP76C4 (Höfer et al., 2014), members of the cytochrome P450 family known to confer metabolic resistance to herbicides in plants (Figure 4k,l), also fit this conserved pattern during polyploidization (Figure 4h,i). We next investigated the expression patterns of these three genes. Surprisingly, we found that LpABCC8 and LpCYP76C4 were expressed at the highest level in L. panicea seeds, and LpCYP76C1 was expressed at the highest level in stems, whereas their L. chinensis counterparts were all expressed at the highest level in leaves (Figure 4m). This shift in the primary tissue of expression may contribute to resistance to metamifop and cyhalofop-butyl, common stem and leaf spray herbicides for L. chinensis weed control. Furthermore, we analysed transcriptome data of cyhalofop-butyl-resistant and cyhalofop-butyl-sensitive L. chinensis lines (Chen et al., 2021; Zhang et al., 2022b) and found that all five genes, except LcABCC8-2, were expressed at significantly higher levels in the resistant L. chinensis than in the sensitive L. chinensis (Table S12).

Improvement of herbicide adaptation during L. chinensis polyploidization

To reveal the mechanism by which L. chinensis polyploidization confers greater herbicide adaptability, we conducted homology 3D reconstruction and molecular docking experiments on the above three pairs of polyploidization-retained proteins with cyhalofop acid (the active molecule of the herbicide cyhalofop-butyl) and metamifop, which represent two common herbicides used to control Leptochloa weeds in paddy fields, to investigate whether these genes play a role under herbicide exposure. For the LcCYP76C4 gene pair (LcCYP76C4–9 and LcCYP76C4–10), we found that metamifop and cyhalofop acid molecules could bind to the binding pocket of LcCYP76C4–9 with binding energies of −8.48 kcal/mol and −7.07 kcal/mol, respectively. Metamifop molecules formed strong hydrogen bonds with ILE-208 (bond length: 2.9 Å) and ARG-232 (bond length: 2.1 Å) of LcCYP76C4–9 in the binding pocket (Figure 5a), and cyhalofop acid molecules formed strong hydrogen bonds with TRP-114 (bond length: 2.4 Å), SER-99 (bond length: 2.2 Å), PRO-370 (bond length: 3.3 Å), LEU-367 (bond length: 1.9 Å) and PRO-366 (bond length: 3.2 Å) in the pocket (Figure 5a). Both molecules could also bind to the binding pocket of LcCYP76C4–10 with binding energies of −8.76 kcal/mol and −7.2 kcal/mol, respectively. Metamifop molecules formed strong hydrogen bond interactions with ALA-295 (bond length: 2.8 Å) and ASN-102 (bond length: 2.1 Å and 3.5 Å) of LcCYP76C4–10 in the binding pocket (Figure 5b), and cyhalofop acid molecules formed strong hydrogen bond interactions with ALA-295 (bond length: 2.8 Å) and ARG-384 (bond length: 3.5 Å and 2.1 Å) in the pocket (Figure 5b). The LcCYP76C1 gene pair (LcCYP76C1–5 and LcCYP76C1–6) showed similar binding behaviour. We found that metamifop and cyhalofop acid molecules could bind to the binding pocket of LcCYP76C1–5 with binding energies of −7.39 kcal/mol and −6.18 kcal/mol, respectively.
Metamifop molecules formed strong hydrogen bond interactions with PRO-343 (bond length: 2.3 Å) and ILE-345 (bond length: 2.0 Å) of LcCYP76C1–5 in the binding pocket (Figure S20A), and cyhalofop acid molecules formed strong hydrogen bond interactions with ARG-394 (bond length: 1.7 Å) and ARG-106 (bond length: 2.1 Å) in the pocket (Figure S20A). Metamifop and cyhalofop acid molecules could bind to the binding pocket of LcCYP76C1–6 with binding energies of −12.02 kcal/mol and −7.63 kcal/mol, respectively. Metamifop molecules formed strong hydrogen bond interactions with ARG-33 (bond length: 2.5 Å) and THR-186 (bond length: 2.1 Å) of LcCYP76C1–6 in the binding pocket (Figure S20B), and cyhalofop acid molecules formed strong hydrogen bond interactions with ARG-33 (bond length: 2.0 Å) in the pocket (Figure S20B). The ABCC8 gene pair (LcABCC8-1 and LcABCC8-2, members of the ABC transporter family) could also be bound by the two herbicide molecules. Metamifop and cyhalofop acid molecules could bind to the binding pocket of LcABCC8-1 with binding energies of −6.69 kcal/mol and −6.26 kcal/mol, respectively. Metamifop molecules formed strong hydrogen bond interactions with HIS-495 (bond length: 2.3 Å) of LcABCC8-1 in the binding pocket (Figure S20C), and cyhalofop acid molecules formed strong hydrogen bond interactions with ARG-494 (bond length: 1.8 Å) and HIS-495 (bond length: 2.3 Å) in the pocket (Figure S20C). Metamifop and cyhalofop acid molecules could bind to the binding pocket of LcABCC8-2 with binding energies of −5.8 kcal/mol and −5.0 kcal/mol, respectively.
Metamifop molecules formed strong hydrogen bond interactions with ARG-130 (bond length: 2.1 Å) and THR-241 (bond length: 3.6 Å) of LcABCC8-2 in the binding pocket (Figure S20D), and cyhalofop acid molecules formed strong hydrogen bond interactions with ARG-364 (bond length: 2.0 Å), MET-368 (bond length: 3.5 Å and 2.4 Å), GLN-375 (bond length: 3.3 Å) and ARG-274 (bond length: 2.7 Å and 1.7 Å) in the pocket (Figure S20D). We also performed homology 3D reconstruction and molecular docking experiments for LpABCC8, LpCYP76C1 and LpCYP76C4 genes from L. panicea and found that LpCYP76C1 and LpCYP76C4 had lower binding energies to metamifop/cyhalofop acid than their homologues in L. chinensis and that LpABCC8 had a similar binding energy compared with LcABCC8-2, but a much lower binding energy than LcABCC8-1 (Table S13). These results suggested that these genes in L. panicea may possess weaker herbicide resistance functions than those in L. chinensis and doubling of these genes in L. chinensis due to polyploidization could result in a much greater herbicide adaptation plasticity and buffering capacity in L. chinensis.

Herbicide selection of polyploidization-retained herbicide resistance genes. (a, b) 3D reconstruction showing structural interactions of LcCYP76C4–9 (a), LcCYP76C4–10 (b) and metamifop (left panel), cyhalofop (right panel). In each dotted red box, binding site amino acids are represented by sticks, and intermolecular contacts are indicated by dotted lines. (c, d) Selection signals on the 20 chromosomes of L. chinensis based on nucleotide diversity ratios (πGroup1/πGroup2) (c) and FST (d). Horizontal dashed lines indicate the genome-wide threshold of selection signals. Candidate genes (red) that overlapped with selective sweeps are marked. Group1, L. chinensis accessions from the southern border of China (not yet subject to herbicide selection); Group2, L. chinensis accessions from lower reaches of the Yangtze River (subject to herbicide selection). (e, f) Distribution of nucleotide diversity (π) of the Group1 (blue) and Group2 (orange) accessions in genomic regions harbouring LcCYP76C4–10 (e) and LcCYP76C4–9 (f). (g) Venn diagram showing genes under selection detected using π ratios and FST and genes that were retained during polyploidization. (h) Number of abiotic stress-related gene family members shared with 2738 overlapped genes in (g). (i) Heatmap of expression profiles of candidate herbicide resistance genes in various tissues of Leptochloa weeds. Gene expression levels were transformed into z-scores. (j) Local collinearity in the genomic region of Lc_Chr6.g19050 between L. panicea and the At and Bt of L. chinensis. (k, l) Distribution of nucleotide diversity (π) of the Group1 (blue) and Group2 (orange) accessions in genomic regions harbouring Lc_Chr5.g15716 (k) and Lc_Chr6.g19050 (l). (m, n) 3D reconstruction showing structural interactions of Lc_Chr6.g19050 and metamifop (m), cyhalofop (n).

Polyploidization-retained herbicide resistance genes under herbicide selection

In a previous study, we demonstrated that during its spread in China from the southern/southwestern provinces to the middle and lower reaches of the Yangtze River, L. chinensis developed significantly increased herbicide resistance, accompanied by the selection of numerous genes involved in herbicide resistance (Wang et al., 2022). Interestingly, we found that both copies of LcCYP76C4 were under selection during the malignant spread of L. chinensis in China (Figure 5c,d), where strong herbicide selection has occurred. Their nucleotide diversity in L. chinensis accessions from the southern border of China (not yet subject to herbicide selection) was significantly higher than that in accessions from the middle and lower reaches of the Yangtze River (subject to herbicide selection) (Figure 5e,f). These results indicate that the amplification of LcCYP76C4 driven by polyploidization may have mediated the adaptive evolution of L. chinensis to herbicides in China. We further identified 2738 genes that were not only under selection, but were also retained during the polyploidization process (Figure 5g), including 22 CYP450, seven GST, 11 AP2, 13 ABC transporter and four GRAS genes from families known to be involved in herbicide tolerance (Figure 5h).

We selected four pairs of herbicide resistance family-related genes whose two copies were both under herbicide selection and analysed their expression patterns. Interestingly, the LcCYP709B2 (Lc_Chr5.g15716) gene was highly expressed in leaves (Figure 5i), suggesting that this gene may contribute to the resistance to metamifop and cyhalofop-butyl, common stem and leaf spray herbicides for weed control. We further confirmed that the syntenic relationship of this gene was conserved during polyploidization (Figure 5j) and found that LcCYP709B2–5 may be under stronger selection than LcCYP709B2–6 (Figure 5k,l).

To test whether this polyploidization-retained LcCYP709B2 gene has the potential to help L. chinensis degrade the commonly used herbicides metamifop and cyhalofop-butyl, we conducted homology 3D reconstruction and molecular docking experiments on this protein. We found that the metamifop molecule could bind to the binding pocket of LcCYP709B2–5 with a binding energy of −10.51 kcal/mol and formed strong hydrogen bond interactions with TYR-338 (bond length: 3.1 Å), PHE-340 (bond length: 2.4 Å) and GLN-341 (bond length: 1.8 Å) in this binding pocket (Figure 5m). The cyhalofop acid molecule could also bind to the binding pocket of LcCYP709B2–5 with a binding energy of −7.14 kcal/mol and formed strong hydrogen bond interactions with ASN-78 (bond length: 3.0 Å), TYR-338 (bond length: 3.2 Å and 2.2 Å), PHE-340 (bond length: 2.6 Å and 2.3 Å) and GLN-341 (bond length: 1.9 Å) in the binding pocket (Figure 5n), indicating that, in addition to LcCYP76C4, LcCYP76C1 and LcABCC8 mentioned above, the polyploidization-retained LcCYP709B2 gene is also likely to degrade herbicides and to play a role in the malignant spread of L. chinensis in China. Taken together, these results suggest that amplification and retention of herbicide resistance genes mediated by polyploidization in L. chinensis may be the genomic basis that allows L. chinensis to adapt rapidly to field environments with herbicide stress (Figure 6).

Proposed model for the evolution of Leptochloa weeds and polyploidization-driven amplification of herbicide resistance genes that confer greater herbicide adaptability in L. chinensis. Leptochloa chinensis diverged from L. panicea about 11.6 mya and underwent polyploidization. This polyploidization event drove the amplification of herbicide resistance genes such as CYP76C1, CYP76C4, ABCC8 and CYP709B2 that confer greater herbicide adaptability in L. chinensis.

Discussion

Polyploid organisms or polyploid populations are often considered more resilient to extreme environments because of their increased genetic variation and the buffering effect of their duplicated genes (Doyle and Coate, 2019; Van de Peer et al., 2009, 2017), and duplicated genes resulting from polyploidization appear to have been key to crop domestication and the evolution of stress resistance (Renny-Byfield and Wendel, 2014). Polyploids can arise via either autopolyploidy, formed through genome duplication within a single diploid species, or allopolyploidy, formed through hybridization and genome duplication involving two or more distinct species (Alger and Edger, 2020). To return to a more diploid-like state, polyploid genomes undergo gene loss and chromosome number reduction (Lysak et al., 2006; Wendel, 2015), and sometimes show a dominant subgenome (Schnable et al., 2011). Subgenome dominance is generally absent in autopolyploids but present in allopolyploids (Garsmeur et al., 2014; Zhao et al., 2017). However, recent studies found that some allopolyploid plants also do not display subgenome dominance (Sun et al., 2017; VanBuren et al., 2020). To further determine the polyploid origin of L. chinensis, we split the genome of L. panicea into k-mers and mapped these k-mers to the two subgenomes of L. chinensis. The mapping coverage further supported the lack of subgenome dominance in L. chinensis (Figure S18). All extant angiosperms have at least one ancient WGD in their ancestry, with some lineages having experienced several additional rounds of genome doubling or tripling over time. However, the establishment or long-term survival of many of these WGDs is not random, but instead coincides with major periods of global climatic/geologic change and/or periods of mass extinction (Cai et al., 2019; Koenen et al., 2021; Novikova et al., 2018; Van de Peer et al., 2017; Wu et al., 2020).
All grasses in Gramineae have undergone an ancient ρ WGD, after which the number of chromosomes in each species changed from the seven protochromosomes (Murat et al., 2017). We found that there is only one WGD event (corresponding to the ancient ρ) in L. panicea, whereas L. chinensis has undergone a recent WGD in addition to this ρ event. Moreover, the divergence time between L. panicea and L. chinensis was very close to the time of the recent WGD in L. chinensis (Figure 3a), suggesting that tetraploidization of L. chinensis likely accompanied its separation from L. panicea. In addition, the two subgenomes diverged from L. panicea almost simultaneously, which differs from the results of other studies in allopolyploids (Ye et al., 2020). Taken together, our results indicate that L. chinensis is highly likely to be an autopolyploid.

Chromosomal rearrangements such as inversions and translocations have long been thought to play critical roles in adaptation and speciation (Dvorak et al., 2018). A previous study found that a chromosomal inversion polymorphism is geographically widespread and could contribute to local adaptation, life-history shift and multiple reproductive isolating barriers in monkeyflower (Lowry and Willis, 2010), and another study found an inversion that might be responsible for the difference of adaptability and plant architecture between golden buckwheat and Tartary buckwheat (He et al., 2022). In this study, we identified a number of chromosomal rearrangements between L. panicea and L. chinensis, some of which were shared between the two subgenomes of L. chinensis, indicating that they may have existed prior to tetraploidization. Furthermore, we found that the number of genomic SVs specifically present in one of the two subgenomes did not differ significantly, suggesting that there may be no subgenomic bias for structural variation in Leptochloa weeds. This is in contrast to the finding in allotetraploid cotton, in which more structural variation was found in the At subgenome than Dt (Yang et al., 2019). Among the shared SVs, two large inversions were identified on chromosomes 2 (4.62 Mb) and 9 (4.90 Mb) of L. panicea compared with both subgenomes of L. chinensis, containing 615 genes and 822 genes, respectively (Figure 2c). We found genes near the breakpoint regions of chromosome 9 were mainly enriched with GO terms related to plant defence, including ‘defense response to fungus’ and ‘defense response to other organism’ (Figure 2d), which may lead to the differences in the natural tolerance of the two species to biotic stresses.

Herbicide resistance has now been observed in at least 69 countries, with 251 weed species resistant to 162 different herbicides, covering 23 of 26 targets of action. In total, there are at least 469 separate herbicide resistance events, and the problem of herbicide resistance has been growing over time (http://weedscience.org). Herbicide resistance can be divided into target-site resistance (TSR; resistance conferred by mutations in target genes that are directly inhibited by herbicides) and nontarget-site resistance (NTSR; resistance conferred by one or more genes involved in processes that protect the plant from herbicide toxicity, such as herbicide uptake, translocation and metabolism) (Kreiner et al., 2018; Powles and Yu, 2010; Sharma et al., 2021). In TSR, in addition to mutations that prevent herbicide-active molecules from acting on their target-site genes (Achary et al., 2020; Murphy and Tranel, 2019; Zhang et al., 2022a), amplification of herbicide target genes has also been found to lead to herbicide resistance in plants, which could modulate rapid glyphosate resistance through genome plasticity and adaptive evolution (Gaines et al., 2010). We found that polyploidization in L. chinensis mediated the amplification of at least five herbicide targets and that such amplification was highly conserved, with no gene loss (Figure 4e). In addition, we found that many abiotic stress resistance gene families were also expanded following polyploidization, especially those associated with herbicide nontarget-site resistance (Figure 4f). Similar findings have been reported in barnyardgrass (Ye et al., 2020), another notorious weed in rice paddy fields. Furthermore, we found a substantial loss of NB-ARC family genes involved in disease resistance in both tetraploid L. chinensis and barnyardgrass.
Interestingly, several studies have shown that genes conferring resistance can lead to growth loss (Bergelson and Purrington, 1996; Brown and Rant, 2013; Van der Plank, 1963). Both growth and resistance are important for plant development, and the loss of the NB-ARC genes in weeds may indicate that natural selection in weeds favours reduced disease resistance in exchange for greater growth and reproductive potential. Herbicide nontarget-site resistance can be divided into enhanced herbicide metabolism and transport of herbicides to the extracellular compartment (Powles and Yu, 2010). Many CYP450 genes (Dimaano et al., 2020; Han et al., 2021; Höfer et al., 2014; Iwakami et al., 2014), GST genes (Cummins et al., 1999) and AKR genes (Pan et al., 2019) have been shown to enhance the metabolism of herbicides in plants. Recently, an ABC transporter protein has been found to transport herbicide active molecules to the vesicles and thus protect plants from herbicide damage (Pan et al., 2021). In this study, we found that three known herbicide-resistance genes including CYP76C1, CYP76C4 (Höfer et al., 2014) and ABCC8 (Pan et al., 2021) were amplified and retained during polyploidization and confirmed their functions in Leptochloa through 3D reconstruction and molecular docking experiments. Most interestingly, we found that both copies of LcCYP76C4 and the LcCYP709B2 gene were under herbicide selection during the malignant spread of L. chinensis in China (Figure 5c–f). This suggests that they may play a role in herbicide resistance during the malignant spread of L. chinensis in rice fields, helping the weed escape herbicide damage. In addition to herbicide target-site gene amplification, polyploidization in L. chinensis may also provide greater herbicide adaptability by mediating the amplification of nontarget-site herbicide resistance genes.

In this study, we revealed the relationship between polyploidization and herbicide adaptation. We reported a high-quality chromosome-level genome of L. panicea, which provides crucial information to gain insights into the effects of polyploidization on environmental adaptation, especially herbicide adaptation. In summary, L. chinensis diverged from L. panicea about 11.6 mya and underwent a polyploidization event, which drove the amplification of herbicide resistance genes such as CYP76C1, CYP76C4, ABCC8 and CYP709B2, providing enhanced buffering capacity under herbicide stress and greater herbicide adaptability in L. chinensis (Figure 6). This study not only contributes to a better understanding of the origin of Leptochloa weeds, but also reveals the polyploidization-driven extreme herbicide adaptability of L. chinensis as a malignant weed in paddy fields. It will provide new ideas and theoretical references for the control of Leptochloa and other polyploid weeds.

Materials and methods

Plant material

The L. panicea plant used in this study was grown in the greenhouse of Hunan Academy of Agricultural Sciences. The root tip was taken for flow cytometry analysis to determine the genome size of the plant.

DNA/RNA extraction, library construction and sequencing

DNA for genome sequencing was extracted from the young leaves of L. panicea using the cetyltrimethyl ammonium bromide (CTAB) method. DNA libraries for single-molecule real-time (SMRT) PacBio genome sequencing were constructed following the standard protocols of the Pacific Biosciences Company and sequenced on the PacBio Sequel II platform using the circular consensus sequencing (CCS) approach. The same DNA was used to construct an Illumina paired-end library with insert sizes of ~400 bp using the NEBNext Ultra DNA Library Prep Kit following the manufacturer's instructions, and the library was sequenced on the Illumina NovaSeq 6000 platform. A Hi-C library was constructed using the young fresh leaves of L. panicea following the Proximo Hi-C plant protocol (Phase Genomics) and sequenced on the NovaSeq 6000 platform.

To assist gene predictions, RNA-sequencing (RNA-Seq) was performed using tissues from root, stem, leaf and seed. Total RNA was extracted from each tissue using the TRIzol reagent based on the recommended protocol (Invitrogen, Carlsbad, California, USA). Strand-specific RNA-Seq libraries were constructed using the Illumina TruSeq RNA Sample Prep Kit and sequenced on the NovaSeq 6000 platform. In addition, RNA from all samples was equally mixed, and the mixed RNA was used to construct one PacBio Iso-Seq library. Briefly, cDNA was first synthesized using the Clontech SMARTer® cDNA Synthesis Kit, and then purified using the AMPure PB beads. The Iso-Seq SMRTbell library was then constructed from the purified cDNA with the SMRTbell Express Template Prep kit 2.0 and sequenced on the PacBio Sequel II platform.

Estimation of genome size and heterozygosity

Genome size of L. panicea was estimated using k-mer frequency distribution derived from Illumina short reads with the program Jellyfish (Marçais and Kingsford, 2011) (version 1.1.10). GenomeScope (Vurture et al., 2017) was used to estimate the heterozygosity level of the L. panicea genome.
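
The k-mer approach rests on a simple relation: genome size is approximately the total number of k-mers divided by the depth of the main peak in the k-mer frequency histogram. A minimal Python sketch of that arithmetic is given below. The function name, the `min_depth` cut-off used to discard likely error k-mers and the toy histogram are assumptions for illustration, not the settings used in this study.

```python
from __future__ import annotations


def estimate_genome_size(histogram: dict[int, int], min_depth: int = 5) -> tuple[int, float]:
    """Return (peak_depth, estimated_genome_size_bp) from a k-mer histogram.

    `histogram` maps k-mer multiplicity to the number of distinct k-mers
    observed with that multiplicity. Multiplicities below `min_depth` are
    treated as sequencing errors and ignored (an assumed, tunable cut-off).
    """
    solid = {depth: n for depth, n in histogram.items() if depth >= min_depth}
    if not solid:
        raise ValueError("no k-mers at or above min_depth")
    # Depth of the main peak: the multiplicity with the most distinct k-mers.
    peak_depth = max(solid, key=solid.get)
    # Total k-mer occurrences contributed by the retained multiplicities.
    total_kmers = sum(depth * n for depth, n in solid.items())
    return peak_depth, total_kmers / peak_depth


if __name__ == "__main__":
    # Made-up histogram: a low-depth error tail and a main peak near depth 20.
    toy_histogram = {1: 1000, 2: 200, 10: 5, 20: 50, 21: 40, 40: 3}
    peak, size = estimate_genome_size(toy_histogram)
    print(f"peak depth = {peak}, estimated size = {size:.1f} bp")
```

For a heterozygous genome the histogram has a second peak at about half the main depth, so picking the tallest peak may not be appropriate; model-based tools such as GenomeScope handle that case.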

Genome assembly

Hifiasm (Cheng et al., 2021) was used to assemble HiFi reads into contigs with default parameters. Redundant contigs/sequences in the assembly were removed using Purge Haplotigs with parameters ‘-l 0 -m 40 -h 175’ (Roach et al., 2018). Hi-C data were then used to scaffold the final assembled contigs into pseudomolecules using the Juicer pipeline (Durand et al., 2016). Potential misassemblies were manually checked and corrected based on Hi-C map, genome synteny and read mapping information.

Genome assembly quality evaluation

BUSCO (Simão et al., 2015) was used to assess the completeness of genome assembly and predicted protein-coding genes. Illumina short reads were mapped to the genome assembly using BWA-MEM (version 0.7.17) (Li and Durbin, 2009), and the mapping rate was calculated. LTR Assembly Index (LAI) (Ou et al., 2018) was also used to evaluate the assembly quality.

Repetitive sequence analysis

The repeat library of the L. panicea genome was constructed ab initio using RepeatModeler (version 2.0.1) (http://www.repeatmasker.org/RepeatModeler). The consensus TE sequences generated by RepeatModeler were combined with RepBase and used as the repeat library in RepeatMasker (version 4.1.0) (http://www.repeatmasker.org) for repetitive element identification. A preliminary list of candidate LTR-RTs was generated using LTR_FINDER (Xu and Wang, 2007) and LTRharvest (Ellinghaus et al., 2008). The identification of high-quality intact LTR-RTs and the calculation of insertion age for intact LTR-RTs were carried out using LTR_retriever (Ou and Jiang, 2018) with default parameters. Tandem repeats were detected using Tandem Repeats Finder (Benson, 1999). Locations of centromeres and telomeres were inferred from the output generated by Tandem Repeats Finder.

Genome annotation

Protein-coding genes were predicted from the L. panicea genome assembly using an integrated approach. RNA-Seq reads from different tissues were aligned to the genome assembly using HISAT2 (v2.0.4) (Kim et al., 2019) and then assembled into transcripts using Cufflinks (v2.2.1) (Trapnell et al., 2012). Open reading frames (ORFs) in the Iso-Seq transcripts were predicted using PASA (v2.0.1) (Haas et al., 2003), and potential full-length cDNA sequences were then extracted and used as the training dataset for ab initio gene predictors, including AUGUSTUS (v3.03) (Stanke et al., 2006), SNAP (Korf, 2004), GlimmerHMM (Majoros et al., 2004) and GeneMark-ET (Lomsadze et al., 2014). Protein sequences from Leptochloa chinensis, Oryza indica, Setaria italica and Oropetium thomaeum were aligned to the L. panicea assembly for homology-based gene prediction using GeMoMa (v1.4.2) (Keilwagen et al., 2016). Finally, ab initio, homology- and transcript-based predictions were integrated using EVidenceModeler (Haas et al., 2008) to generate a consensus model for each gene. Functional annotations of the predicted genes were performed by comparing their protein sequences against the GenBank non-redundant (nr), InterPro, KEGG and eggNOG databases.

Genome comparison

Genome comparison between L. panicea and the two subgenomes of L. chinensis (Wang et al., 2022; available at the Genome Warehouse of the BIG Data Center under accession number GWHBJVB00000000) was performed via whole-genome alignment using the MUMmer package (Kurtz et al., 2004). MCScan (Wang et al., 2012) (Python version) was used for pairwise synteny region search.

Phylogenetic analysis and divergence time estimation

To investigate the evolutionary history of genus Leptochloa, two subgenomes of Leptochloa chinensis, nine grass subfamilies, Brachypodium distachyon, Oryza sativa, Setaria viridis, Setaria italica, Sorghum bicolor, Zea mays, Oropetium thomaeum, Leptochloa panicea and one dicot plant Arabidopsis thaliana were used for gene family construction using OrthoFinder (Emms and Kelly, 2015) (version 2.3.12) with default parameters. Protein sequences of 549 single-copy orthologs from the 10 species were concatenated for the species tree construction. Protein sequences of each single-copy orthologous group were aligned with ClustalW2 (Larkin et al., 2007). Maximum likelihood tree was constructed using FastTree (Price et al., 2010) with 1000 bootstrap replicates. The divergence time was estimated using MCMCTree (Yang and Rannala, 2006) with branch lengths estimated by BASEML in the PAML package (Yang, 2007) and the independent rate model for time estimation.

To detect whole-genome duplication events in the genus Leptochloa, we calculated synonymous substitution (Ks) values for syntenic homeologous gene pairs using WGDI (Sun et al., 2022). We used the nucleotide substitution rate of 6.5 × 10⁻⁹ mutations per bp per generation as a molecular clock (Molina et al., 2011).
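
For a pair of duplicated genes, a Ks value can be converted to an approximate age with T = Ks / (2r), because substitutions accumulate along both lineages after the split. The sketch below applies this relation with the rate given above. The one-year generation time and the example Ks value are assumptions for illustration only and are not results of this study.

```python
def ks_to_mya(ks: float, rate: float = 6.5e-9, years_per_generation: float = 1.0) -> float:
    """Convert a Ks value to an approximate age in million years.

    `rate` is the substitution rate per site per generation; the default
    generation time of one year is an assumption suited to an annual plant.
    """
    if ks < 0 or rate <= 0:
        raise ValueError("ks must be non-negative and rate must be positive")
    # Both copies diverge from the ancestor, hence the factor of two.
    generations = ks / (2.0 * rate)
    return generations * years_per_generation / 1e6


if __name__ == "__main__":
    example_ks = 0.13  # hypothetical Ks peak, not a value from this study
    print(f"Ks = {example_ks} -> about {ks_to_mya(example_ks):.1f} mya")
```

Because the result scales linearly with the assumed rate and generation time, dates obtained this way are best read as rough estimates alongside the MCMCTree divergence times.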

Gene family expansion and contraction analysis

Gene-family expansion and contraction in Leptochloa genomes were determined using CAFE (version 4.2.1) (De Bie et al., 2006). The gene family size for each species used in CAFE was calculated with OrthoFinder (Emms and Kelly, 2015) (version 2.3.12). The gene birth and death rate was estimated with orthologous groups that were conserved in all species. To better understand the potential functional category of each gene family, we used KinFin (Laetsch and Blaxter, 2017) (v1.0), along with gene functional annotations assigned by InterProScan and gene ontology, to derive rich annotations for gene families.

RNA sequencing data analysis

RNA-Seq reads were processed to remove adapters and to trim low-quality bases using fastp (Chen et al., 2018) (version 0.21.0). Cleaned reads were mapped to the L. panicea and L. chinensis genomes using HISAT2 (Kim et al., 2019) (version 2.1.0) with default parameters. StringTie (Pertea et al., 2015) was then used to calculate the TPM (transcripts per million) values of genes.
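
For readers unfamiliar with the unit, TPM normalizes read counts first by transcript length and then by the sample-wide total, so that values within a sample sum to one million. The sketch below shows the textbook definition computed from raw counts. It is an illustrative assumption about inputs (per-gene read counts and lengths) and does not reproduce how StringTie derives TPM internally.

```python
from __future__ import annotations


def tpm(read_counts: dict[str, float], lengths_bp: dict[str, float]) -> dict[str, float]:
    """Compute transcripts per million from read counts and gene lengths.

    Step 1: divide each count by the gene length in kilobases.
    Step 2: rescale so that the values across all genes sum to 1e6.
    """
    per_kb = {gene: read_counts[gene] / (lengths_bp[gene] / 1000.0) for gene in read_counts}
    scale = sum(per_kb.values())
    if scale == 0:
        # No reads at all: report zero expression for every gene.
        return {gene: 0.0 for gene in per_kb}
    return {gene: 1e6 * value / scale for gene, value in per_kb.items()}


if __name__ == "__main__":
    # Made-up counts and lengths for two hypothetical genes.
    counts = {"geneA": 100, "geneB": 300}
    lengths = {"geneA": 1000, "geneB": 3000}
    for gene, value in tpm(counts, lengths).items():
        print(gene, f"{value:.1f}")
```

In the toy example the two genes receive equal TPM values despite a threefold difference in raw counts, because the longer gene is expected to collect proportionally more reads.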

Diversity analysis

The nucleotide diversity (π) and population differentiation (FST) values were calculated using VCFtools (version 0.1.13) (Danecek et al., 2011) based on the high-confidence SNPs identified from 89 L. chinensis accessions reported in our previous study (Wang et al., 2022). The π value for each SNP was calculated, and the nucleotide diversity level was measured using a 100-kb window with a step size of 10 kb for each L. chinensis population.
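
The sliding-window scheme described above can be sketched as follows: per-site π values are summed within each 100-kb window, divided by the window length, and the window is then advanced by 10 kb. The π ratio between two groups is the quotient of their window values. Function names and the input layout are hypothetical, and averaging over window length is an assumed simplification rather than a reproduction of the VCFtools calculation.

```python
from __future__ import annotations

import math


def windowed_pi(
    site_pi: list[tuple[int, float]],
    chrom_len: int,
    window: int = 100_000,
    step: int = 10_000,
) -> list[tuple[int, int, float]]:
    """Return (start, end, mean_pi) for overlapping windows along one chromosome.

    `site_pi` holds (1-based position, per-site pi) pairs for variable sites;
    invariant sites contribute zero and are therefore not listed.
    """
    windows = []
    for start in range(1, chrom_len + 1, step):
        end = min(start + window - 1, chrom_len)
        total = sum(pi for pos, pi in site_pi if start <= pos <= end)
        windows.append((start, end, total / (end - start + 1)))
    return windows


def pi_ratio(
    group1: list[tuple[int, int, float]],
    group2: list[tuple[int, int, float]],
) -> list[tuple[int, int, float]]:
    """Return the per-window ratio pi(group1) / pi(group2); NaN where group2 is zero."""
    ratios = []
    for (s1, e1, p1), (s2, e2, p2) in zip(group1, group2):
        if (s1, e1) != (s2, e2):
            raise ValueError("windows must have identical coordinates")
        ratios.append((s1, e1, p1 / p2 if p2 > 0 else math.nan))
    return ratios


if __name__ == "__main__":
    # Tiny made-up example with a 30-bp chromosome, 20-bp windows and 10-bp steps.
    sites = [(5, 0.5), (15, 0.3), (25, 0.2)]
    for start, end, value in windowed_pi(sites, chrom_len=30, window=20, step=10):
        print(start, end, round(value, 4))
```

Windows in which the unselected group has much higher diversity than the selected group give large ratios and are candidates for selective sweeps once a genome-wide threshold is applied.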

Homology modelling

The NCBI BLAST server was used to select the three PDB proteins with the highest protein similarity as templates. The amino acid sequences of the target and template proteins were then aligned. Homology modelling was carried out using Modeller (Webb and Sali, 2016), and the heme group was subsequently assigned (in the case of cytochrome P450 proteins).

Molecular docking

Molecular docking experiments were performed to investigate the binding mode between the proteins and cyhalofop acid/metamifop using AutoDock4 (Morris et al., 2009). The 2D structures of the ligands were drawn in ChemBioDraw Ultra and optimized by the MM2 method in ChemBio3D Ultra to obtain the 3D structures. The AutoDockTools 1.5.7 package (Morris et al., 2009) was employed to generate the docking input files. Default docking parameters were used unless otherwise stated. The best-scoring pose as judged by the AD4 docking score was chosen and visually analysed using PyMOL 1.7.6 (www.pymol.org).
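
A docking score expressed as a binding free energy can be related to an estimated inhibition constant through Ki = exp(ΔG / RT). The following sketch performs that conversion for a few of the binding energies reported in the Results. The temperature of 298.15 K is an assumption, and the resulting values are rough estimates derived from a scoring function rather than measured affinities.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def estimated_ki(delta_g_kcal: float, temperature_k: float = 298.15) -> float:
    """Convert a binding free energy (kcal/mol) to an estimated Ki in mol/L.

    More negative energies give smaller Ki values, i.e. tighter predicted binding.
    """
    if temperature_k <= 0:
        raise ValueError("temperature must be positive")
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


if __name__ == "__main__":
    # Binding energies (kcal/mol) quoted in the text for metamifop.
    energies = {
        "LcCYP76C4-9": -8.48,
        "LcCYP76C1-6": -12.02,
        "LcABCC8-2": -5.8,
    }
    for protein, delta_g in energies.items():
        print(f"{protein}: Ki ~ {estimated_ki(delta_g):.2e} M")
```

Because the relation is exponential, a difference of about 1.4 kcal/mol corresponds to roughly a tenfold change in the estimated Ki at room temperature, which helps put the differences between homeologous copies in perspective.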

Author contributions

L.B., L.Wang. and K.C. designed and managed the project. H.Y., D.L., Y.P. and Z.Z. contributed to sample collection. J.Z. and L.Wu. performed DNA/RNA extraction. K.C. and T.L. performed genome assembly, annotation and analysis. K.C. and H.Y. contributed to RNA-seq and the associated data analysis. K.C. and H.Y. wrote the manuscript. K.C., L.Wang. and L.B. revised the manuscript.

Acknowledgements

This research was supported by grants from the National Natural Science Foundation of China (No. 32272564), the National Key R&D Program of China (No. 2021YFD1700101), the Science and Technology Innovation Program of Hunan Province (Nos. 2023JJ10025 and 2022RC1017), the Training Program for Excellent Young Innovators of Changsha (kq2106079) and the China Agriculture Research System of MOF and MARA (CARS-16-E19).

Conflict of interest

The authors declare no competing interests.

Data availability statement

The genomic sequencing reads, RNA-seq data and genome assembly have been deposited into the BIG data centre (https://bigd.big.ac.cn/) under accession number PRJCA010143.
