Freshwater Biology

Volume 69, Issue 1 pp. 74-83
ORIGINAL ARTICLE
Open Access

Development of genomic resources for cattails (Typha), a globally important macrophyte genus

Alberto Aleman (Corresponding Author)

Environmental and Life Sciences Graduate Program, Trent University, Peterborough, Ontario, Canada

Marcel E. Dorken

Environmental and Life Sciences Graduate Program, Trent University, Peterborough, Ontario, Canada

Department of Biology, Trent University, Peterborough, Ontario, Canada

Aaron B. A. Shafer

Environmental and Life Sciences Graduate Program, Trent University, Peterborough, Ontario, Canada

Department of Forensic Sciences, Trent University, Peterborough, Ontario, Canada

Tulsi Patel

Department of Biology, Trent University, Peterborough, Ontario, Canada

Polina A. Volkova

Papanin Institute for Biology of Inland Waters, Russian Academy of Sciences, Borok, Nekouz District, Yaroslavl Region, Russia

Joanna R. Freeland (Corresponding Author)

Environmental and Life Sciences Graduate Program, Trent University, Peterborough, Ontario, Canada

Department of Biology, Trent University, Peterborough, Ontario, Canada

Correspondence

Alberto Aleman and Joanna R. Freeland, Environmental and Life Sciences Graduate Program, Trent University, Peterborough, Ontario, Canada.

Email: [email protected] and [email protected]

First published: 25 October 2023
Citations: 2

Abstract

  1. A critical knowledge gap in freshwater plant research is the lack of genetic tools necessary to answer fundamental questions about demographic histories, adaptation and phylogenetic relationships. One example is Typha, a global genus of freshwater plants that is foundational to wetlands and is also becoming an increasingly problematic biological invader in numerous regions worldwide. Although important insights have been gained for this genus, existing markers are insufficient to resolve these questions, to identify introduced and hybrid lineages, or to examine patterns of hybridisation and introgression.
  2. We optimised a library preparation and data processing protocol to develop genome-wide nuclear and plastid resources for studying the evolutionary history, genetic structure and diversity, hybridisation, local adaptation, invasiveness, and geographical expansion dynamics of Typha.
  3. We sequenced 140 Typha samples and identified ~120 K nuclear single nucleotide polymorphisms (SNPs) that differentiate T. angustifolia, T. domingensis and T. latifolia, and retrieved their plastome sequences. We observed genetic introgression among the three species.
  4. Following a fast, straightforward and cost-efficient genomic library preparation protocol, we produced a suite of genome-wide resources to facilitate investigations into the taxonomy and population genetics of Typha and to advance the genomic understanding of wetland plants.
  5. The protocol described, the updated chromosome-level genome assembly of T. latifolia, the catalogue of species-specific SNPs, and the chloroplast sequences produced in this study comprise permanent resources that can be applied to study the genetic composition of multiple populations and hybrid zones, and will be incorporated into future studies of Typha, an ecologically important and globally invasive macrophyte.

1 INTRODUCTION

Freshwater plants are essential to aquatic ecosystems, shaping their habitats' structure and ecological functions (Chambers et al., 2008; Christie et al., 2009; Rejmankova, 2011). Although freshwater plants have been increasingly incorporated into applications that include habitat restoration and invasive species management, they remain highly understudied compared to terrestrial plants (Evangelista et al., 2014; Iversen et al., 2022). One key knowledge gap in freshwater plant research is the lack of genetic tools necessary to answer fundamental questions about their demographic histories, adaptation and phylogenetic relationships (Fay et al., 2019; Maréchal, 2019; O'Hare et al., 2018; Yannelli et al., 2022).

Genetic characterisation of freshwater plants has been hampered by biological and technical challenges, plus biases in scientific research (Matheson & McGaughran, 2022; Troudet et al., 2017). Hundreds to thousands of molecular markers are often required for genomic-based research on topics such as gene flow and adaptation of non-model organisms (da Fonseca et al., 2016; Stapley et al., 2010), but the de novo development of genomic resources can be both time-consuming and expensive (Hu et al., 2020; Ortega et al., 2020; Prieto et al., 2021). Overcoming these challenges is now feasible using novel, rapid and cost-effective methods that capture genome-wide genetic variation, allowing researchers to address questions related to taxonomy, evolution and conservation (Andrews et al., 2016; Goodwin et al., 2016).

Typha L. (cattails) is a global genus of rhizomatous perennial, monoecious, self-compatible, and wind-pollinated freshwater plants foundational to wetlands (reviewed in Bansal et al., 2019). Cattails are a valuable ecosystem resource and play a fundamental ecological role by cycling nutrients, preventing erosion, maintaining stable water levels, and providing food and shelter for wildlife (Andrews & Pratt, 1978; Bonanno & Cirelli, 2017; Dieye et al., 2017; Kimmerer, 2013; Svedarsky et al., 2019). One major challenge in Typha research has been taxonomic identification, which cannot be fully accomplished using morphological characters because of their high intraspecific variability and interspecific hybridisation. Consequently, the richness of cattail species, their taxonomy, provenance (i.e., alien or native lineages) and phylogenetic relationships remain unresolved (Ciotir & Freeland, 2016; Volkova & Bobrov, 2022; Zhou et al., 2018). A refined Typha taxonomy along with species-specific genetic markers are necessary to identify introduced and hybrid lineages, which are increasingly documented as invasive, such as T. domingensis Pers. in Central America, T. × glauca Godr. (T. angustifolia L. × T. latifolia L.) in North America, and T. latifolia L. in Oceania and western Europe (GISD, n.d.; Bansal et al., 2019; Govaerts, 2004; Hall, 2009; Maldonado, 2019; Xu et al., 2013).

High-throughput sequencing technologies, novel genomic library preparation methods that are accessible in both cost and time, and the recent assembly of the T. latifolia genome (287.19 Mb) (Goodwin et al., 2016; Rowan et al., 2019; Widanagama et al., 2022) collectively present an opportunity to develop a suite of genomic resources for Typha. In addition to taxonomic resolution, these resources will facilitate investigations of the evolutionary history, genetic structure and diversity, hybridisation and introgression, local adaptation, invasiveness and geographical expansion of this genus. We applied a high-throughput sequencing protocol for enzymatic fragmentation, library preparation, and data processing and produced genome-wide resources for three Typha spp. By optimising the method from Rowan et al. (2019), we generated a catalogue of nuclear single nucleotide polymorphisms (SNPs) that characterise T. angustifolia, T. domingensis and T. latifolia, plus chloroplast genome sequences, in a fast, straightforward and cost-efficient manner.

2 MATERIALS AND METHODS

2.1 Reference genome

We used the reference-based scaffolder Chromosomer 0.1.4a (Tamazian et al., 2016) to align the 1,158 T. latifolia scaffolds from Widanagama et al. (2022) (GenBank accession: JAIOKV000000000.1) with the T. latifolia isolate L0001 (15 chromosomes, GenBank accession: JAAWWQ000000000.1) and produce a local chromosome-level T. latifolia genome. The Widanagama et al. (2022) assembly had a significantly higher mapping success of unrelated re-sequenced Typha spp. compared to the isolate L0001, suggesting that it is a more representative Typha genome assembly. The scaffolds were aligned to the chromosomes using BLAST+ 2.12.0 (Camacho et al., 2009) with the software default settings, and the alignments were anchored in Chromosomer with a gap length of 0 and a ratio threshold of 1.

2.2 Sampling, DNA extraction, and sequencing

Samples were obtained either from previous studies or from collections across Eurasia, and DNA was extracted at Trent University following published protocols (Bhargav et al., 2022; Ciotir et al., 2017; Pieper et al., 2017, 2020; Tangen et al., 2022; Tisshaw et al., 2020). Briefly, leaf tissue was dried in desiccant silica beads and stored at −20°C. Dried leaf material was ground with a Retsch® MM300 mixer mill. DNA was extracted from 25 to 30 mg of semi-fine powder of each sample using the EZNA Plant DNA kit (Omega-Bio Tek) or the Fastpure plant DNA isolation mini kit (Nanjing Vazyme Biotech-China) protocols for dried material, with a final elution of 100 μl. We obtained DNA from 38 T. angustifolia, 25 T. domingensis and 77 T. latifolia samples (n = 140) (Figure 1; Table S1) previously identified to taxon using a combination of genetic analyses of microsatellite loci (Kirk et al., 2011; Snow et al., 2010) and morphological characteristics (Grace & Harrison, 1986; Smith, 1967). Extracted DNA was quantified using a Qubit fluorometer (Thermo Fisher Scientific), with the concentration of each sample calculated as the mean of three independent readings. All samples were either standardised to 2 ng/μl by dilution with nuclease-free water or left undiluted if at concentrations less than 2 ng/μl (0.4–1.9 ng/μl).
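The standardisation step above is a C1V1 = C2V2 dilution, which can be sketched as follows. This is only an illustration: the function names and the 20 μl working volume are assumptions, not values from the protocol; only the 2 ng/μl target and the averaging of three readings come from the text.

```python
def mean_concentration(readings):
    """Mean of replicate fluorometer readings (ng/ul) for one sample."""
    return sum(readings) / len(readings)


def dilution_volumes(stock_ng_per_ul, target_ng_per_ul=2.0, final_volume_ul=20.0):
    """Return (dna_ul, water_ul) needed to reach the target concentration.

    final_volume_ul is a hypothetical working volume chosen for illustration.
    Samples already at or below the target are left undiluted (water_ul = 0).
    """
    if stock_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    if stock_ng_per_ul <= target_ng_per_ul:
        return final_volume_ul, 0.0
    # C1 * V1 = C2 * V2  ->  V1 = C2 * V2 / C1
    dna_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return dna_ul, final_volume_ul - dna_ul


# A sample with three readings averaging 8 ng/ul is diluted fourfold;
# a sample at 1.2 ng/ul is used undiluted.
stock = mean_concentration([7.5, 8.0, 8.5])
diluted = dilution_volumes(stock)
undiluted = dilution_volumes(1.2)
```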

FIGURE 1. Genetic structure results of 12,177,703 nuclear SNPs obtained for three Typha spp. Top: Sampling locations in this study. Black points indicate approximate sampling sites and coloured areas indicate the taxa identified. The number of samples is not shown. Left: principal component analysis for PC1 and PC2. Shapes represent individuals, and colours represent taxa, as in the box. Right: Neighbour-joining tree. Branches represent individuals, and colours indicate species as labelled. Bottom: ADMIXTURE (K = 3). Vertical bars represent individuals, and the admixture proportion is shown with different colours.

Rowan et al. (2019) reported a relatively rapid and cost-effective library preparation technique for genomic sequencing that uses transposases (Nextera XT) for enzymatic fragmentation followed by ligation of short adapter sequences (i.e., tagmentation) and requires relatively low yields of DNA. Our protocol was based on this method with a few modifications. Firstly, each DNA sample was tagmented with the Illumina Tagment DNA enzyme (TD) and buffer kit (small kit, #20034210). As the ratio of TD enzyme to DNA is crucial for the reaction, we initially followed Rowan et al. (2019) and subsequently optimised the reagent volumes for our library preparation as 5.5 μl of 5× TD buffer, 0.5 μl of 1× TD enzyme and 4 μl of DNA (standardised or undiluted), keeping all reagents on ice during the preparation. Samples were incubated at 55°C for 10 min and left at room temperature for 5 min, and 5 μl of each sample were run on an agarose gel to confirm the efficacy of the tagmentation reaction, evidenced by visible smears. Secondly, the tagmented DNA was amplified using unique dual indexing based on combinations from a total of 24 N7 (47 bases) and 8 S5 (51 bases) adapters (Alpha DNA). The PCR cocktail included 0.2 μM of each index, 0.5 U of KAPA HiFi HotStart DNA polymerase (Roche), 12.5 μl of 5× KAPA reagent, 5 μl of tagmented DNA and 6.5 μl of nuclease-free water to a final volume of 25 μl. The PCR programme comprised 72°C (3 min); 95°C (30 s); and 14 cycles of 95°C (10 s), 55°C (30 s) and 72°C (30 s). Once again, visible smears confirmed amplification success after running 5 μl of the PCR product on an agarose gel; then, 10 μl of each sample were pooled, and the remaining PCR products were stored at −20°C. The pooled library was purified with a QIAquick PCR purification kit (QIAGEN) following the manufacturer's protocol, with a final elution in 50 μl of elution buffer. The library was quantified using a D1000 Tapestation assay (Agilent Technologies) and a Qubit fluorometer (Thermo Fisher Scientific). A quality-control paired-end sequencing run was executed on a MiSeq (151 bp) to ensure the genomic library was compiled successfully. Finally, paired-end sequencing was performed on a NovaSeq 6000 (126 bp) at The Centre for Applied Genomics (Toronto, Ontario).

2.3 Raw data processing, filtering and SNP-calling

The quality of the demultiplexed raw sequences was evaluated using FastQC 0.11.9 (Andrews, 2017) and MultiQC 1.14 (Ewels et al., 2016). Read pairing and adapter trimming were carried out with Trimmomatic 0.39 (Bolger et al., 2014), removing any cleaned reads shorter than 100 bp. Paired and remaining unpaired reads were mapped to our chromosome-level T. latifolia nuclear genome plus the T. latifolia plastome reference (GenBank accession no. NC_013823.1) using the mem module of BWA 0.7.17 (Li & Durbin, 2009). Mapped reads from the MiSeq and NovaSeq 6000 sequencers were merged, and mapping statistics were evaluated with the flagstat and coverage modules of SAMtools 1.15.1 (Li et al., 2009).

Genotype-calling was performed with ANGSD 0.93 (Korneliussen et al., 2014) following the SAMtools model, retrieving hard-called SNPs with a SNP p-value threshold of 1e-6 and minimum mapping and sequencing qualities of 20, discarding any indels and triallelic sites, and outputting a binary Variant Call Format file (-doGeno 4 -gl 1 -skipTriallelic 1 -SNP_pval 1e-6 -minMapQ 20 -minQ 20 -doMajorMinor 1 -domaf 1 -doPost 1 -doBcf 1).

For the nuclear analyses, SNPs with more than 50% missing data across all samples and sites mapped to the plastome were removed in VCFtools 0.1.16 (Danecek et al., 2011). We did not apply any additional filters to SNP identification, for three reasons. Our samples represent a broad geographical sampling (Figure 1; Table S1) and thus were not expected to be in Hardy–Weinberg equilibrium. As allele frequencies were unlikely to be representative of regional allele frequencies, we did not apply a minor allele frequency filter. Nor did we prune sites in linkage disequilibrium, as eliminating them is likely to decrease the resolution to detect hybridisation and introgression (Alexander, 2020; Pearman et al., 2022).
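The site filter described above can be illustrated with a minimal sketch. It is not the VCFtools implementation: the function names, the tuple layout, and the use of the plastome accession as the contig name are assumptions made for illustration. Genotypes are coded as alternate-allele counts, with None for a missing call. The same helper also computes a genotyping rate defined as the mean proportion of samples with data per SNP.

```python
def missing_fraction(genotypes):
    """Fraction of samples with no genotype call (None) at one site."""
    return sum(g is None for g in genotypes) / len(genotypes)


def filter_nuclear_sites(sites, max_missing=0.5, plastome="NC_013823.1"):
    """Drop plastome sites and sites with MORE than max_missing missing data.

    sites: list of (chromosome, position, genotypes) tuples (assumed layout).
    """
    return [
        (chrom, pos, genos)
        for chrom, pos, genos in sites
        if chrom != plastome and missing_fraction(genos) <= max_missing
    ]


def genotyping_rate(sites):
    """Mean proportion of samples with data across the given sites."""
    rates = [1 - missing_fraction(genos) for _, _, genos in sites]
    return sum(rates) / len(rates)


sites = [
    ("chr1", 101, [0, 1, None, 2]),       # 25% missing: kept
    ("chr1", 250, [None, None, 1, 0]),    # exactly 50% missing: kept
    ("chr2", 77, [None, None, None, 2]),  # 75% missing: removed
    ("NC_013823.1", 10, [0, 0, 0, 0]),    # plastome site: removed
]
kept = filter_nuclear_sites(sites)
rate = genotyping_rate(kept)
```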

2.4 Genetic structure and diagnostic markers

We used nuclear SNPs to assess the most likely number of genetic clusters across all samples and the membership of each plant to these clusters using three complementary approaches: (i) ADMIXTURE 1.3.0 (Alexander & Lange, 2011), run with K = 1–10 and with the optimal number of clusters chosen via the cross-validation procedure; (ii) a neighbour-joining tree built from the samples' pairwise genetic distance matrix (expressed as allele counts, transformed in the R 4.2.1 package ape 5.7-1 [Paradis et al., 2004; R Core Team, 2022]); and (iii) a principal component analysis (PCA), the latter two performed with Plink 1.90 (Purcell et al., 2007). To avoid over- or under-estimating genetic structure (Janes et al., 2017), we verified that the assignment of samples to genetic clusters (corresponding to three species, see Section 3) was consistent for each approach.

Potential introgression was tested by running ADMIXTURE (K = 1–5) on three datasets, each comprising a combination of two genetic clusters, using only those SNPs that remained variable between the two species being compared. We confirmed that the cross-validation procedure for the runs of each species pair chose the optimal number of clusters as two (K = 2) and used the admixture proportion (Q score) as an index of potential introgression of each sample. Applying the thresholds of Senn and Pemberton (2009) and Smith et al. (2018), individuals whose Q score was 0.05 ≤ Q ≤ 0.95 were considered genetically introgressed.
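The Q-score rule above amounts to a simple threshold test on each sample's K = 2 admixture proportion. The sketch below illustrates it under the assumption that q is the proportion assigned to the first of the two clusters; the function name and the labels are hypothetical.

```python
def classify_q(q, lower=0.05, upper=0.95):
    """Label one sample from its K = 2 admixture proportion for cluster 1.

    Samples with lower <= q <= upper are treated as introgressed;
    the rest are assigned to whichever cluster dominates.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("Q score must lie between 0 and 1")
    if q < lower:
        return "cluster 2"
    if q > upper:
        return "cluster 1"
    return "introgressed"


# Boundary values (0.05 and 0.95) fall inside the introgressed range.
labels = [classify_q(q) for q in (0.01, 0.05, 0.50, 0.95, 0.99)]
```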

To compare the levels of differentiation between clusters, values of Weir and Cockerham's genetic differentiation (FST; Weir & Cockerham, 1984) and genetic divergence (dXY; Nei & Miller, 1990) for every variable site in 10-kbp windows between species pairs were computed with pixy 1.2.7 (Korunes & Samuk, 2021), and the means were calculated. Species-specific SNPs were identified (i) for each species pair and (ii) by running three paired comparisons of one species versus the other two on each run, using DiagnoSNPs 1.0 (Arce-Valdés, 2022). We removed genetically introgressed individuals before estimating levels of genetic differentiation and identifying species-specific SNPs.
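As an illustration of what a species-specific (diagnostic) SNP means here, i.e. a site at which two groups are fixed for opposite alleles, the following sketch scans toy genotype data. It is not the DiagnoSNPs implementation; the function names and data layout are assumptions. Genotypes are alternate-allele counts (0, 1 or 2), with None for missing calls, and the one-versus-two comparison simply pools the other two species into one group.

```python
def fixed_allele(genotypes):
    """Return 'ref' or 'alt' if all called genotypes are homozygous for it."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    if all(g == 0 for g in called):
        return "ref"
    if all(g == 2 for g in called):
        return "alt"
    return None  # polymorphic or heterozygous within the group


def diagnostic_sites(group_a, group_b):
    """Indices of sites where the two groups are fixed for opposite alleles.

    group_a, group_b: per-site genotype lists, in the same site order.
    """
    hits = []
    for i, (a, b) in enumerate(zip(group_a, group_b)):
        fa, fb = fixed_allele(a), fixed_allele(b)
        if fa is not None and fb is not None and fa != fb:
            hits.append(i)
    return hits


# Toy data: four sites, non-introgressed samples only.
angustifolia = [[0, 0, 0], [2, 2, None], [0, 1, 0], [2, 2, 2]]
latifolia = [[2, 2], [0, 0], [2, 2], [2, 2]]
domingensis = [[2, 2], [0, 2], [2, 2], [0, 0]]

pairwise = diagnostic_sites(angustifolia, latifolia)
others = [l + d for l, d in zip(latifolia, domingensis)]
one_vs_two = diagnostic_sites(angustifolia, others)
```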

2.5 Chloroplast genome reconstruction and phylogenetic analysis

We implemented a reference-guided workflow to reconstruct whole-chloroplast-genome sequences. Nucleotide calling was performed individually for each of the 140 samples in ANGSD, using the reads that mapped to the plastome reference, requiring minimum mapping and base qualities of 20, and using Ns for missing data (-dofasta 2 -minMapQ 20 -minQ 20 -doCounts 1). The sequences were aligned to the chloroplast genomes of T. przewalskii Skvortsov, T. lugdunensis P. Chabert, T. orientalis C. Presl, and Sparganium natans (GenBank accession nos: NC_061354.1, NC_061353.1, NC_050678.1 and NC_058577.1), following Smith et al. (2021) by applying MAFFT 7.0 default settings (i.e., with the flag --nwildcard) (Katoh et al., 2019). Snp-sites 2.5.1 (Page et al., 2016) was used to remove regions of the genome with ambiguous positions, gaps and missing data, such that if any of those were found in a sequence, that position was removed for all sequences. Nucleotide diversity (π) was calculated in the R package pegas (Paradis, 2010).
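The column-wise cleaning rule described above (drop a position for all sequences if any sequence has a gap, an N or an ambiguity code there) can be sketched as below, together with a simple count of segregating sites and a mean pairwise-difference estimate of π. This is an illustration only, not the snp-sites or pegas implementation; the function names are assumptions and the π estimator is the plain per-site average of pairwise differences.

```python
from itertools import combinations

UNAMBIGUOUS = set("ACGT")


def clean_columns(seqs):
    """Keep only alignment columns where every sequence has A, C, G or T."""
    seqs = [s.upper() for s in seqs]
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    keep = [i for i in range(length) if all(s[i] in UNAMBIGUOUS for s in seqs)]
    return ["".join(s[i] for i in keep) for s in seqs]


def segregating_sites(seqs):
    """Number of columns with more than one nucleotide state."""
    return sum(len({s[i] for s in seqs}) > 1 for i in range(len(seqs[0])))


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site across all sequence pairs."""
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return diffs / (len(pairs) * len(seqs[0]))


# Toy alignment: columns 3 and 6 contain a gap or an N and are dropped.
alignment = ["ACGTAN", "ACGAA-", "AT-TAC"]
cleaned = clean_columns(alignment)
n_seg = segregating_sites(cleaned)
pi = nucleotide_diversity(cleaned)
```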

The phylogenetic relationships of the chloroplast genome sequences were reconstructed in RAxML-NG 1.1 (Kozlov et al., 2019). Model selection was based on jModelTest 2.1.10 results (Darriba et al., 2012) using the Akaike information criterion. RAxML-NG was run under a TVM + G4 + I model with the automatic and thorough bootstrap options, starting from 100 random trees and employing Sparganium natans as the outgroup. The best-scoring maximum-likelihood (ML) tree was visualised.

3 RESULTS

3.1 Genome scaffolding, mapping statistics, and genotyping

Approximately 99.81% of the scaffold sequences were aligned to the template genome. The scaffolds were anchored to 15 chromosomes, producing a genome of 285.11 Mb (GenBank accession no. JAIOKV000000000.2). The total assembled size was comparable to the T. latifolia genome sizes of Widanagama et al. (2022) (287.19 Mb) and the isolate L0001 (214.13 Mb). Updating the chromosome-level genome assembly simplified our downstream analyses while keeping the highest mapping success of unrelated re-sequenced Typha spp., and facilitating an accurate recombination map for future studies of speciation, hybridisation and the genomic landscape of introgression in Typha.

After quality control, 982 M clean paired-end reads were retained, and ~98% mapped to the reference genome. With minimum mapping and sequencing qualities = 20, the average depth and breadth of coverage for the nuclear sequences were 4× and 42%, respectively. Over 60% of the plastome breadth was covered across all samples (mean depth = 711×), enabling us to use 96,591 bp for the phylogenetic reconstruction. We identified 12,177,703 bi-allelic nuclear SNPs across the 140 Typha samples (7,122,151 with a MAF >0.05). The total genotyping rate (the mean proportion of samples with data for each SNP) was 0.68.

3.2 Genetic structure and diagnostic markers

The admixture analysis, the PCA and the neighbour-joining tree each established the most likely number of genetic clusters as three (K = 3) (Figure 1): in line with previous taxonomic identifications, 38 samples were identified within the T. angustifolia cluster (15 of which had T. latifolia introgression, and five of which had both T. domingensis and T. latifolia introgression); 25 samples were in the T. domingensis cluster (one with T. angustifolia introgression, 12 with T. latifolia introgression, and three with both T. angustifolia and T. latifolia introgression); and 77 samples were in the T. latifolia cluster (one with both T. angustifolia and T. domingensis introgression, and one with T. angustifolia introgression). Using only the 18 T. angustifolia, nine T. domingensis and 75 T. latifolia non-introgressed samples, the mean pairwise interspecific FST and dXY values ranged from 0.25 to 0.49 and 0.28 to 0.35, respectively, with T. latifolia showing the highest differentiation from both T. angustifolia and T. domingensis. We identified 119,324 nuclear species-specific SNPs by pairwise comparisons between the three species and 16,856 SNPs when one species was compared to the other two (Table 1).

TABLE 1. Mean Weir and Cockerham's pairwise genetic differentiation (FST) and genetic divergence (dXY) measured in 10-kbp windows for every variable site, number of SNPs and diagnostic markers (SNPs with fixed opposite alleles) between three Typha spp. When only one taxon is shown, the number of diagnostic markers represents the SNPs found when that species was compared to the other two.

Pairwise comparison                 FST     dXY     SNPs          Diagnostic SNPs
T. angustifolia – T. domingensis    0.25    0.28    10,358,977    33,436
T. angustifolia – T. latifolia      0.44    0.35    8,786,870     30,113
T. domingensis – T. latifolia       0.49    0.34    9,380,607     55,775
T. angustifolia                     –       –       –             10,537
T. domingensis                      –       –       –             3,838
T. latifolia                        –       –       –             3,044

3.3 Chloroplast genome reconstruction and phylogenetic relationships

After removing all ambiguities and missing data from the chloroplast genomes, we were left with an alignment of 96,591 bp across all 140 sequences and the four references, with 4916 segregating sites and π = 0.003. The phylogenetic reconstruction was congruent with the nuclear genetic structure results (i.e., individuals were consistently assigned to the same nuclear and plastid lineages), grouping the 143 Typha samples into three lineages, with T. angustifolia in one clade, T. domingensis and T. orientalis sharing another, and T. latifolia and T. przewalskii in a third (Figure 2). All interspecific nodes were strongly supported.

FIGURE 2. Chloroplast phylogeny of 140 samples from three Typha spp. and four NCBI references, based on 96,591 bp. The tree was produced with Sparganium natans as the outgroup and drawn without it. Branches represent individuals and colours indicate species as labelled. Numbers indicate branch support ≥60. Branch lengths are not shown.

4 DISCUSSION

We aimed to produce a suite of genome-wide resources to facilitate investigations into the taxonomy and population genetics of Typha and to advance the genomic understanding of wetland plants. Following a fast, straightforward and cost-efficient genomic library preparation protocol (Rowan et al., 2019), we sequenced 140 Typha samples, obtaining an average breadth of 42% of the nuclear genome, characterising 119,324 nuclear SNPs that collectively differentiate three Typha spp., and producing chloroplast sequences with a breadth of coverage >60% per sample. With a cost of <US$15 per sample and a processing time of 2 h for the library preparation, our workflow is a rapid and cost-effective protocol that can be applied in population genomic studies for investigating levels of genetic diversity and differentiation, identifying conservation units and alien taxa, and investigating hybrid zones, among other purposes. Additionally, our results reveal the feasibility of reconstructing whole-chloroplast-genome sequences as a by-product of the enzymatic fragmentation for high-throughput sequencing libraries, making plastome research simpler and less expensive for species with an available reference.

Three genetic clusters were identified from both nuclear and chloroplast genomes, corresponding to T. angustifolia, T. domingensis and T. latifolia, and genetic introgression was detected among the three species. There are some reports of hybridisation between T. domingensis and either T. angustifolia or T. latifolia (Ciotir et al., 2017; Govaerts, 2004; Smith, 1967), yet range-wide surveys are lacking. Future research should address the extent to which hybridisation is shaping the genetic differentiation and diversity of these three species. Furthermore, T. domingensis is increasingly invading regions in Nigeria (Ringim et al., 2016), Costa Rica (Trama et al., 2017) and North America, potentially expanding its range across the latter (Spencer & Vincent, 2013; Zhang et al., 2008). However, the taxonomic identity of these plants is unclear: are they hybrids, non-native lineages, native lineages responding to environmental change, or misidentified T. angustifolia (Bansal et al., 2019)? By characterising SNPs that differentiate T. domingensis, we provide a valuable resource to answer this and other questions across the evolutionary and hybridisation history of Typha.

The markers that differentiate T. angustifolia from T. latifolia will have important applications in North America, where the two species interbreed across a large area and produce an invasive interspecific hybrid (T. × glauca) that dominates wetlands, alters nutrient cycling and reduces biodiversity across the Great Lakes Region (Bansal et al., 2019); additionally, this hybrid is expanding throughout the Prairie Pothole Region, causing native plant diversity to decrease in invaded potholes (Jones et al., 2023), and may impact essential habitat for millions of breeding and migratory waterfowl (Tangen et al., 2022). Until now, molecular resources to characterise T. angustifolia, T. latifolia and T. × glauca were limited to sets of relatively few individual markers that have produced important insights: RAPDs, chloroplast DNA sequences and codominant SSR loci have contributed to exposing the sexual fertility of first-generation hybrids (F1s) (Snow et al., 2010); asymmetric hybridisation, with T. angustifolia being mainly the maternal parent (Ball & Freeland, 2013; Kuehn et al., 1999; Pieper et al., 2017); overall comparable levels of sexual and clonal reproduction in parents and F1s (Pieper et al., 2020; Travis et al., 2011); heterosis in F1s (Bunbury-Blanchette et al., 2015; Travis et al., 2010; Zapfe & Freeland, 2015); a high frequency of F1s in natural populations (Kirk et al., 2011; Travis et al., 2010); and the capability of F1s to backcross, plus partial sterility in F1s coupled with hybrid breakdown of F2s and advanced-generation hybrids (Bhargav et al., 2022; Pieper et al., 2017). However, critical inquiries remain unresolved because existing markers are insufficient to expose the prevalence of advanced-generation hybrids and backcrosses in wild populations. The expansive suite of SNPs identified in this study will facilitate investigations into the extent of hybridisation, hybrid breakdown dynamics and adaptive introgression across the T. × glauca hybrid zones, allowing researchers to understand the processes shaping speciation and species boundaries in this genus, and to inform conservation and management strategies.

Fundamental genetic tools are essential for investigating freshwater plants' biology, management and conservation (O'Hare et al., 2018). The protocol described in this paper, the updated chromosome-level genome assembly of T. latifolia, the catalogue of species-specific SNPs, and the chloroplast sequences produced for each sample comprise permanent resources that can be applied to study the genetic composition of multiple populations and hybrid zones. Genome-wide sequencing techniques and reference-based chloroplast genome assemblies are promising tools to clarify the demographic histories, dispersal, adaptation and taxonomy of multiple congeneric macrophyte species (Russello et al., 2015; Straub et al., 2012), and substantial genome-wide research will allow us to tackle these and other knowledge gaps in Typha and other freshwater taxa.

AUTHOR CONTRIBUTIONS

Conceptualisation: Aaron B. A. Shafer, Joanna R. Freeland and Marcel E. Dorken. Developing methods, conducting the research, data interpretation and writing: Alberto Aleman, Aaron B. A. Shafer, Joanna R. Freeland, Marcel E. Dorken, Polina A. Volkova and Tulsi Patel. Data analysis and preparation of figures and tables: Alberto Aleman.

ACKNOWLEDGEMENTS

We acknowledge that the laboratory procedures and data analyses were conducted at Trent University, which is on the traditional territory of the Mississauga Anishinaabeg. The Natural Sciences and Engineering Research Council of Canada (NSERC) financially supported this work, and Alberto Aleman is funded by the Environmental & Life Sciences Graduate Program at Trent University. The work of Polina A. Volkova was supported by the Russian Science Foundation grant no. 23-14-00115. We thank V. Bhargav, N. Tikhomirov and M. Ivanova for providing plant tissue samples; T. Pimenov, M. Aksyonova and the staff of the Dagestansky Nature Reserve, in particular, G. S. Dzhamirzoyev, for their help in the field; AO “IEPI” for organising fieldwork in Krasnodar Region; and SHARCNET and Compute Canada for providing computational resources. Finally, we thank Camille Kessler for her comments on the manuscript and Enrique Ruiz for his work on Figure 1 (Top).

    FUNDING INFORMATION

    The Natural Sciences and Engineering Research Council of Canada (NSERC) financially supported this work, and Alberto Aleman is funded by the Environmental & Life Sciences Graduate Program at Trent University. The work of P.A.V. was supported by the Russian Science Foundation grant no. 23-14-00115.

    CONFLICT OF INTEREST STATEMENT

    The authors declare no conflict of interest.

    DATA AVAILABILITY STATEMENT

    The updated nuclear genome assembly will be submitted to GenBank (JAIOKV000000000.2). Code, species-specific SNP locations (chromosome, position) and chloroplast-genome sequences produced in this study will be available at https://gitlab.com/WiDGeT_TrentU/graduate_theses/-/tree/master/aleman/fwb and https://github.com/al-aleman/totoras_fwb. High-throughput sequencing raw data and Variant Call Format files are available from the authors upon reasonable request.
