ORIGINAL ARTICLE

Open Access

Large-scale validation of 46 invasive species assays using an enhanced in silico framework

Corresponding Author

John A. Kronenberger

orcid.org/0000-0003-3588-7572

National Genomics Center for Wildlife and Fish Conservation, USFS Rocky Mountain Research Station, Missoula, Montana, USA

Correspondence

John A. Kronenberger, National Genomics Center for Wildlife and Fish Conservation, USFS Rocky Mountain Research Station, 800 East Beckwith Avenue, Missoula, MT 59801, USA.

Email: [email protected]

Taylor M. Wilcox

orcid.org/0000-0003-3341-7374

National Genomics Center for Wildlife and Fish Conservation, USFS Rocky Mountain Research Station, Missoula, Montana, USA

Michael K. Young

orcid.org/0000-0002-0191-6112

National Genomics Center for Wildlife and Fish Conservation, USFS Rocky Mountain Research Station, Missoula, Montana, USA

Daniel H. Mason

orcid.org/0000-0002-2976-072X

National Genomics Center for Wildlife and Fish Conservation, USFS Rocky Mountain Research Station, Missoula, Montana, USA

Thomas W. Franklin

orcid.org/0000-0003-1658-4664

National Genomics Center for Wildlife and Fish Conservation, USFS Rocky Mountain Research Station, Missoula, Montana, USA

Michael K. Schwartz

orcid.org/0000-0003-3521-3367

National Genomics Center for Wildlife and Fish Conservation, USFS Rocky Mountain Research Station, Missoula, Montana, USA

First published: 24 April 2024

https://doi.org/10.1002/edn3.548

Abstract

The need for widespread occurrence data to inform species conservation has prompted interest in large, national-scale environmental DNA (eDNA) monitoring strategies. However, targeted eDNA assays are seldom validated for use across broad geographic areas. Here, we validated 46 new and previously published probe-based qPCR assays targeting invasive species throughout the continental United States. We drew upon current taxonomies, range maps, publicly available sequences, and tissue archives to evaluate all potentially sympatric confamilial species and genetically similar extrafamilial taxa. Out of 5276 unique assay-nontarget taxon combinations, we were able to test 4206 (80%). We characterized levels of validation and specificity for each of eight federal geographic regions and provided an online tool with state-level information, as well as detailed assay descriptions in an appendix. Specificity testing benefited from extensive use of eDNAssay, a machine learning classifier trained to predict qPCR cross-amplification, which we found to be 96% accurate in 649 unique tests that underwent paired in silico and in vitro testing. Predictions of assay specificity (the true negative rate) were 98–100% accurate, depending on the classification threshold used. This work both provides an immediate resource for invasive species surveillance and demonstrates an enhanced in silico, geographically subdivided validation framework to aid future large-scale validation efforts.

1 INTRODUCTION

Biomonitoring capacity has grown tremendously with the advent of environmental DNA (eDNA) sampling (Rees et al., 2014; Schenekar, 2023; Takahashi et al., 2023). Local and regional eDNA surveys are now commonplace, reflecting broader recognition of eDNA data integrity (Laschever et al., 2023; Sepulveda et al., 2020) and setting the stage for implementation across much larger geographic scales. Indeed, several nationally and internationally coordinated eDNA sampling initiatives are currently active or in development (Kelly et al., 2023; Lodge, 2022). Depending on project-specific goals and funding, national-scale strategies are likely to rely heavily upon targeted probe-based hydrolysis assays, which most studies suggest are more sensitive than semi-targeted techniques (e.g., metabarcoding; but see McCarthy et al., 2023), cost effective for detecting small sets of predetermined taxa, and amenable to standardization (Langlois et al., 2021; Thalinger et al., 2021). Furthermore, targeted assays developed for one platform (e.g., qPCR) can typically be used on other platforms (e.g., high-throughput qPCR [HT-qPCR; Wilcox et al., 2020] or droplet digital PCR [ddPCR; Nathan et al., 2014]) with little or no modification (Thalinger et al., 2021). There are now over 500 targeted assays to choose from (Takahashi et al., 2023; Thalinger et al., 2021). However, we believe very few assays are ready for nationwide application.

Initial validation ought to have tested assay specificity against all closely related sympatric species (Langlois et al., 2021; Thalinger et al., 2021; Wilcox et al., 2013, 2015), but what constitutes “sympatric” is highly project-dependent. Most assays are validated for use at local or regional scales, with specificity confirmed only against the relatively small suite of nontarget taxa present in the original study area. Spatially and taxonomically broad queries of reference databases (e.g., Primer-BLAST; Ye et al., 2012) are useful screening tools but frequently miss problematic nontargets (So et al., 2020). Therefore, it is important to carefully revisit the validation done for each assay prior to use; additional validation may be necessary if any closely related species are potentially present in the new study area and remain untested. Failure to do so can lead to false negative or false positive detections, missed or unnecessary management actions, and reduced confidence in eDNA as a monitoring tool. Additional validation is particularly important and difficult for assays used in large-scale eDNA surveys, as the number of nontarget taxa increases with geographic area. Comprehensive testing can become prohibitively difficult for biodiverse taxa at national, continental, and global scales.

In particular, in vitro testing can impose validation bottlenecks as developers attempt to obtain tissue samples from highly endemic or rare species or purchase expensive synthetic gene fragments. In silico testing via base-pair mismatch “rules of thumb” and thermodynamic estimates of binding affinity can predict specificity (Noguera et al., 2014), but not accurately enough to replace in vitro testing (Kronenberger et al., 2022; So et al., 2020), instead being used primarily in the assay design process. However, new machine-learning-based PCR models can attain very high accuracy by associating detailed oligonucleotide and DNA template information with empirical results. For example, Kronenberger et al. (2022) trained a random forest classifier to achieve 97% accuracy at predicting the results of 144 unique specificity tests. Using this approach, one can minimize incorrect predictions of assay specificity (false negative model errors) by lowering the classification threshold, above which templates are predicted to cross-amplify, and in vitro testing is recommended. This safeguard can make predictions of specificity (the true negative rate) 100% accurate while maintaining significant reductions in in vitro testing. When PCR model predictions are accurate enough to be relied upon as tests sensu stricto, they not only streamline assay development but also enable developers to leverage public sequence repositories for comprehensive testing of many individuals from many locations per taxon, yielding more-specific assays.

Consequently, fundamental to sound large-scale eDNA sampling initiatives are (1) improved PCR models and (2) fully validated, targeted assays (e.g., Kelly et al., 2023). We addressed these needs here—further testing a machine-learning-based PCR model (eDNAssay; Kronenberger et al., 2022) and using it to validate 46 probe-based qPCR assays targeting invasive species of high priority in the United States, including an amphibian, crustaceans, fishes, mammals, mollusks, plants, and a reptile. Thirty-two assays were newly developed; 12 were previously published; and two were previously published but modified to improve specificity. Levels of assay validation and specificity were characterized for each of eight federal geographic regions, as established by the U.S. Fish and Wildlife Service (USFWS). State-level specificity information is available online at https://nationalgenomicscenter.shinyapps.io/state-level-specificity. Detailed descriptions of each assay are provided in Appendix S1: Assay Documentation. We discuss the merits and shortcomings of this enhanced, in silico, geographically subdivided validation framework. Environmental DNA practitioners may draw upon our methods to develop new assays and revalidate existing assays for use in broad areas.

2 MATERIALS AND METHODS

2.1 Geographic and taxonomic scope

All assays were validated for use throughout the continental United States (Figure 1), including previously published assays that were initially developed for narrower geographic applications. Nontarget taxa were defined as all freshwater and terrestrial confamilial species potentially occurring within this area, along with any sympatric extrafamilial taxa that were identified as potentially cross-amplifying via a Primer-BLAST search of GenBank and also had eDNAssay assignment probabilities ≥0.3 (as described below). We referred to primary literature and online databases for taxonomic assignments, including FishBase (https://www.fishbase.se) and the Integrated Taxonomic Information System (ITIS; https://www.itis.gov). State-level occurrence data were obtained from NatureServe (https://explorer.natureserve.org) and categorized into USFWS regions using a custom script in R (version 4.0.4; R Core Team, 2021). Note that regional borders coincide with state borders everywhere except the Klamath River Basin in southern Oregon and the Sheldon National Wildlife Refuge in northern Nevada; given our approach, these areas were assigned to Regions 1 and 8, respectively. Additional occurrence data were obtained from primary literature and online databases such as the Global Biodiversity Information Facility (https://www.gbif.org) and the Nonindigenous Aquatic Species database (https://nas.er.usgs.gov) as needed for complete documentation.

FIGURE 1. Geographic scale of assay validation, comprising all U.S. Fish and Wildlife Service (USFWS) regions on the North American continent.

2.2 Assay design and optimization

Of the 46 total assays described here, 32 were newly developed, 12 were previously published, and two were previously published but modified to improve specificity (Table 1). Our newly developed assays were designed using one of two approaches. Three newly developed assays (green sunfish, channel catfish, and lake trout; see Table 1 for Latin names) were designed prior to the availability of accurate in silico testing methods and followed a conventional, mismatch-based design process (sensu Wilcox et al., 2015). Specifically: (1) target and nontarget sequences were downloaded from GenBank (Sayers et al., 2019) using a custom script in R with package rentrez (Winter, 2017) and aligned using the ClustalW (Thompson et al., 1994) and MUSCLE (Edgar, 2004) algorithms in MEGA (version 11; Tamura et al., 2021); (2) candidate primer pairs were generated using the R package DECIPHER (Wright et al., 2012); (3) probes were inserted between primers in regions where target sequences were most conserved and divergent from those of nontarget taxa; and (4) sequences were modified and evaluated iteratively using base-pair mismatch counts to maximize generality and specificity. The design of the remaining 29 newly developed assays relied heavily on the eDNAssay classifier (Kronenberger et al., 2022), a machine-learning-based PCR model that accurately predicts qPCR cross-amplification using sequence alignment. All newly developed assays were located within the most commonly sequenced mitochondrial genes (COI and Cytb) to allow for more comprehensive in silico testing. For these assays: (1) sequences were downloaded and aligned as above, (2) alignments were visually inspected to identify areas where target sequences were most conserved and divergent from those of nontarget taxa, and (3) candidate primer and probe sequences were added, modified, and evaluated iteratively using eDNAssay to maximize generality and specificity. The best-performing assays were selected for further testing.
Melting temperatures were determined using Primer Express 3.0.1 (Life Technologies). Primer melting temperatures were kept between 56 and 60°C, and primer pairs differed by <2°C whenever possible. Probes were designed to have melting temperatures ~10°C higher than primers. Secondary structures were evaluated using OligoAnalyzer (Integrated DNA Technologies; IDT), and potentially problematic dimers were avoided whenever possible.
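
The melting-temperature targets above can be written as a simple screening check. The following Python sketch is illustrative only and is not part of the published workflow: the function name, the tolerance allowed around the ~10°C probe offset, and the example values are assumptions, and the melting temperatures themselves are expected to come from an external tool (e.g., Primer Express) rather than being computed here.

```python
def check_tm_design(fwd_tm, rev_tm, probe_tm,
                    primer_range=(56.0, 60.0),
                    max_pair_diff=2.0,
                    probe_offset=10.0,
                    offset_tolerance=2.0):
    """Return a list of design warnings; an empty list means all targets are met.

    offset_tolerance is an assumed allowance around the ~10 degree probe offset;
    the source text only says "~10 degrees higher".
    """
    warnings = []
    low, high = primer_range
    # Each primer should fall inside the target melting-temperature window.
    for name, tm in (("forward", fwd_tm), ("reverse", rev_tm)):
        if not low <= tm <= high:
            warnings.append(f"{name} primer Tm {tm} is outside {low}-{high}")
    # Primer pairs should differ by less than max_pair_diff degrees.
    if abs(fwd_tm - rev_tm) >= max_pair_diff:
        warnings.append(f"primer Tm difference is not below {max_pair_diff}")
    # The probe should sit roughly probe_offset degrees above the primers.
    mean_primer_tm = (fwd_tm + rev_tm) / 2
    if abs((probe_tm - mean_primer_tm) - probe_offset) > offset_tolerance:
        warnings.append(f"probe Tm is not ~{probe_offset} above the primers")
    return warnings


# Hypothetical melting temperatures (degrees C), not taken from any assay in Table 1.
print(check_tm_design(58.5, 59.2, 69.0))
print(check_tm_design(55.0, 59.5, 64.0))
```

Because the source applies these targets "whenever possible", a check like this is better treated as a flagging aid than as a hard filter.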

TABLE 1. Assay information, including the target, oligonucleotide sequences (F, forward primer; R, reverse primer; P, probe), optimized primer concentrations, and developer. All probes are minor-groove binding (MGB) probes.

Target	Oligonucleotide sequence (5′–3′)	F:R (nM)	Developer
AMPHIBIANS
American bullfrog (Rana catesbeiana)	F: TTTTCACTTCATCCTCCCGTTT	900: 900	Strickler et al. (2015)
	R: GGGTTGGATGAGCCAGTTTG
	P: TTATCGCAGCAGCAAGT-MGB
CRUSTACEANS
Virile crayfish A (Faxonius deanae, nais, virilis)	F: TGATTACTTCCTTTYTCTTTAACTTTGTTG	900: 600	NGC
	R: AATACCTAAATCTACTGATGCCCCTG
	P: ACCTACTCCTCTTTCGAC-MGB
Virile crayfish B (Faxonius deanae, nais, virilis)	F: CCGAGTAGAGTTAGGYCAGCCA	900: 900	NGC
	R: AAGGAATTAACCAGTTCCCGAAA
	P1: TCGTCCCCAATCAA-MGB
	P2: TCATCCCCAATTAATCT-MGB
Ringed crayfish A (Faxonius neglectus chaenodactylus clade A4)	F: ACGGGCTGCTGGGATAACTATA	900: 900	NGC
	R: CCAAAACAGGCAAAGATAATAATAATAGC
	P: AACGGTATGCGGTCTAT-MGB
Ringed crayfish B (Faxonius neglectus neglectus)	F: TATTTTAGGGTCAGTTAATTTTATAACAACG	900: 900	NGC
	R: CATACGATCCATAGTTATCCCCAC
	P: AACATGCGTKCTGTGG-MGB
Rusty crayfish (Faxonius rusticus)	F: GGTGGACAGTGTATCCTCCTCTC	900: 600	Dougherty et al. (2016) and NGC
	R: CATTCGATCTATAGTCATTCCCGTAG
	P: TGAGCCAAGAATAGAAGAA-MGB
Red swamp crayfish (Procambarus clarkii)	F: TTTTCTCTACATTTAGCAGGTGTATCTTCT	600: 600	NGC
	R: AAATAACGGYATTCGATCCATG
	P: CGAACAGTAGGGATAACC-MGB
Marbled crayfish (Procambarus fallax, virginalis)	F: ATTATTAACTAGAGGTATAGTTGAGAGGGGA	900: 900	NGC
	R: AATAGAAGATACACCTGCTAAATGCAAG
	P: CACCCAGTTCCTACTC-MGB
Opossum shrimp (Mysis diluviana)	F: CCAGTGTTAGCAGGGGCTAT	600: 600	Carim, Christianson, et al. (2016)
	R: CCCACCTACAGGGTCAAAGA
	P: TTTAACAGACCGTAATTTAA-MGB
FISHES
Green sunfish (Lepomis cyanellus)	F: TCTCATTATGACATCCTCAACTTTTCTAGT	600: 900	NGC
	R: AGACCTCCGAGGGAGAGAAGG
	P: AGTCAACTAACATCAATACC-MGB
Smallmouth, spotted bass (Micropterus dolomieu, punctulatus)	F: CAGCTATTTCCCAGTATCAGACACC	600: 600	Franklin et al. (2018)
	R: TTGAGGTTTCGATCCGTAAGRA
	P: TTATCGCTCCCAGTCCT-MGB
Northern, blotched snakehead (Channa argus, maculata)	F: CAACATAAGCTTCTGACTTCTTCCC	100: 900	NGC
	R: GACTGGTAGTGAGAGAAGCAGRAGG
	P: CTGGGCGCTATTAAT-MGB
Blue tilapia (Oreochromis aureus)	F: CAACATGAGTTTCTGACTCCTCCCT	900: 900	NGC
	R: ATGGGCAAGATTGCCTGCG
	P: CTCATCTGGAGTCGAAGCA-MGB
Mozambique tilapia (Oreochromis mossambicus)	F: CTGGGCCTTCCGTTGACTT	600: 900	NGC
	R: TAGTACTGCGGTAATTAGAACGGATCAT
	P: AAATAGATGACACCCC-MGB
Nile tilapia (Oreochromis niloticus)	F: GCCTCATCTGGAGTCGAAGCA	900: 900	NGC
	R: GTGATAAAATTAATTGCACCTAAAATAGATGAC
	P: CAAGATTGCCTGCGAGC-MGB
Wami tilapia (Oreochromis urolepis)	F: CGGGGTGTCATCCATTTTAGGT	900: 900	NGC
	R: CAGCACTGCGGTAATTAGAACGGAT
	P: AATTTTATCACAACCATTATTAACATA-MGB
Pond loach (Misgurnus anguillicaudatus)	F: GGGCAATTAATTTTATTACTACAACAATTAACATA	900: 900	NGC
	R: CGGGTCAAAAAAGGTGGTGTTTAG
	P: CTGCCGTTCTTCTTTTA-MGB
Goldfish, Prussian carp (Carassius auratus, gibelio)	F: TCTTCCCCCATCATTCCTGT	900: 600	NGC
	R: GAAGTTGATTGCCCCCAGG
	P: CGGAGCTGGCACC-MGB
Crucian carp (Carassius carassius)	F: AAACACCCCTATTTGTTTGATCTGT	900: 900	NGC
	R: AGCTAAAACAGGTAATGATAGGAGAAGA
	P: TTATTACCGCCGTCCTT-MGB
Grass carp (Ctenopharyngodon idella)	F: CACTAATAAAAATCGCCAACGAC	100: 600	NGC
	R: ATCCAAAGTTTCATCATGCAGAG
	P: CTAGTCGATCTTCCCAC-MGB
Common carp, koi (Cyprinus carpio, rubrofuscus)	F: CGCCACAGTAATCACAAACCTC	100: 900	NGC
	R: CACCTCAGATTCATTGGACTAACA
	P: CTGCCGTACCATACAT-MGB
Silver, bighead carp (Hypophthalmichthys molitrix, nobilis)	F: CTTCGCATTCCACTTCCTCCTA	900: 900	NGC
	R: TGATCCTGTTTCGTGTAGAAAGAGR
	P: TCGTCACCGCCGCAA-MGB
Black carp (Mylopharyngodon piceus)	F: CATTCCAAACAAACTAGGAGGAGTA	900: 900	NGC
	R: TTTTGAGGTGTGTAATATTGGCACT
	P: TTGCACTACTATTTTCCATCT-MGB
Northern pike (Esox lucius)	F: CAGCCACAATCCTCCATTTATTATTC	600: 600	Carim et al. (2019)
	R: TGTAGGAGAAGTAGGGATGAAAGGG
	P: CCAGTAGGTATTAACTCTGATG-MGB
Round goby (Neogobius melanostomus)	F: AGGAACCGGGTGGACAGTTT	100: 900	NGC
	R: ATTGTCAAGTCGACGGATGCT
	P: CCTGGCAGGCAACT-MGB
Channel catfish (Ictalurus punctatus)	F: TTAGCCCGCGGAATACAAATC	900: 900	NGC
	R: GGACCAATTAAACAGTGCGGTG
	P: TGGTTCCTCTCTAATCTA-MGB
Sea lamprey (Petromyzon marinus)	F: TTTTTGACTACTTCCGCCCTCT	900: 900	NGC
	R: AGTGTAAGGAAAAGATTGTTAGGTCG
	P: TGTATATCCTCCCTTAGCC-MGB
Western, eastern mosquitofish (Gambusia affinis, holbrooki)	F: GCAGGAACAGGCTGAACTGTC	900: 900	NGC
	R: CCCAGAATAGAGGAGATGCCC
	P: CCATCTTTTCCCTTCACCTAG-MGB
Rainbow trout (Oncorhynchus mykiss)	F: AGTCTCTCCCTGTATATCGTC	300: 600	Wilcox et al. (2015)
	R: GATTTAGTTCATGAAGTTGCGTGAGTA
	P: CCAACAACTCTTTAACCATC-MGB
Atlantic salmon (Salmo salar)	F: TGTTTGAGCTGTATTAGTCACTGCC	900: 900	NGC
	R: GGTATTTAGATTTCGGTCTGTAAGTAGTATG
	P: CTCCCTCCCTGTTCTA-MGB
Brown trout (Salmo trutta)	F: CGCCCGAGGACTCTACTATGGT	600: 300	Carim, Wilcox, et al. (2016)
	R: GGAAGAACGTAGCCCACGAA
	P: CGGAGTCGTACTGCTAC-MGB
Brook trout (Salvelinus fontinalis)	F: CCACAGTGCTTCACCTTCTATTTCTA	900: 900	Wilcox et al. (2013)
	R: GCCAAGTAATATAGCTACAAAACCTAATAGATC
	P: ACTCCGACGCTGACAA-MGB
Lake trout (Salvelinus namaycush)	F: CCGCCATTGACTCTTCCTTAA	300: 900	NGC
	R: GGTTAAGTCCCCCTCATCCC
	P: CTTCTATCARTGCTCGTGGG-MGB
MAMMALS
Nutria (Myocastor coypus)	F: CACTACAACAGCTTTTTCATCAATCAC	600: 600	Akamatsu et al. (2018)^a
	R: TTCCTCGTCCAATGTGGAAGT
	P: TGATTAATCCGTTATATACACGCT-MGB
Feral swine (Sus scrofa)	F: TTCCCTCTTAGGCATCTGCCTAATCT	900: 900	NGC
	R: AACGGTAAATAGTAGGACTACTCCAATGTTT
	P: ACAGCTTTCTCATCAGTTAC-MGB
MOLLUSKS
Asian clam (Corbicula fluminea)	F: TTCCWTTAATGTTAAGGGCTCCTGATAT	600: 900	NGC
	R: GAAGAAATACCCCCTAAATGAAGAGA
	P: AATATTGCTCATTCTGGC-MGB
Quagga mussel (Dreissena bugensis)	F: CTCTTCATATCGGTGGAGCTTC	900: 900	NGC
	R: CAAAGGCACCCGATAAAACTG
	P: AACATGAGGAAATATACGTGCC-MGB
Quagga, zebra mussel (Dreissena bugensis, polymorpha)	F: TGGGGCAGTAAGAAGAAAAAAATAA	900: 900	Gingera et al. (2017)
	R: CATCGAGGTCGCAAACCG
	P: CCGTAGGGATAACAGC-MGB
Zebra mussel (Dreissena polymorpha)	F: CATTTTCTTATACCTTTTATTTTATTAGTGCTTTT	600: 900	NGC
	R: CGGGACAGTTTGAGTAGAAGTATCA
	P: TAGGTTTTCTTCATACTACTGGC-MGB
Big-eared radix (Radix auricularia)	F: GAGTTGGAACTGGYTGAACAGTC	100: 900	NGC
	R: GTAGTAATAAAATTAATAGCTCCTAAAATTCTYGAT
	P: CCTCTTAGRGGGCCAAT-MGB
Golden mussel (Limnoperna fortunei)	F: TTCCATTAATAATAGGGGCAGTAGATTTG	900: 600	NGC
	R: GTTCTATGAGCATCAAAACTAGATAAAGGAGG
	P: CTGGTTGGACAGTTTAT-MGB
New Zealand mud snail (Potamopyrgus antipodarum)	F: TGTTTCAAGTGTGCTGGTTTAYA	600: 900	Goldberg et al. (2013)
	R: CAAATGGRGCTAGTTGATTCTTT
	P: CCTCGACCAATATGTAAAT-MGB
Chinese mystery snail (Cipangopaludina chinensis)	F: CTGGTGGWTCAGTTGATTTAGCT	900: 900	NGC
	R: TAATTACAGTAGTAATAAAATTAACAGCCCC
	P: CTGGTGCRTCTTC-MGB
PLANTS
Brazilian waterweed (Egeria densa)	F: GGTCAATGGCAATTCCTTCTTG	100: 900	Chase et al. (2020) and NGC
	R: GCACCACCCAAATAGAGCAATA
	P: CCATGCCCAATGAGAGT-MGB
Hydrilla (Hydrilla verticillata)	F: TTGCGCGAATATGTAGAACTTGT	900: 600	Matsuhashi et al. (2016)
	R: GCCAAGGTTTTAGCACAGGAAA
	P: ATTATTGTAGTGGATCTTCA-MGB
REPTILES
Burmese python (Python bivittatus)	F: CACCCTAACAACTTCAATACCTCTACTAAT	900: 600	Hunter et al. (2015)
	R: GAGGTTTGTTCAGTGGTTATTTGTTTT
	P: CCAACACTATTATTCCTAGCAAC-MGB

Note: See Appendix S1: Assay Documentation for comprehensive assay-specific information and Appendix S2: Standard Curve Parameters for assay locus, LOD, and LOQ information. Assays specifying NGC (National Genomics Center for Wildlife and Fish Conservation) were newly developed in this study, those with citations were previously published, and those with both NGC and citations were previously published but modified to improve specificity.
^a Additionally validated by Mangan et al. (2023) for use in the United States.

Primer concentrations were optimized independently for each assay. To do so, we amplified target DNA, either extracted from tissue samples or purchased as gBlocks Gene Fragments (IDT), using 16 assay mixes with forward and reverse primer concentrations at all combinations of 100, 300, 600, and 900 nM and TaqMan minor-groove binding probes at 250 nM in the final reaction. Optimal primer concentrations were those that yielded the greatest change in normalized fluorescence (ΔR_n) and the lowest quantification cycle (C_q). Reactions were run on StepOnePlus and QuantStudio 3 Real-Time PCR Systems (Thermo Fisher Scientific) in triplicate. Each well contained 7.5 μL of TaqMan Environmental Master Mix 2.0 (2×; Thermo Fisher Scientific), 4 μL of target DNA, and 0.75 μL of assay mix, diluted to 15 μL with sterile water. The thermal cycling profile was 95°C for 10 min, followed by 45 cycles of 95°C for 15 s and 60°C for 1 min.
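
As a worked illustration of the optimization design above, the Python sketch below enumerates the 16 primer-concentration combinations and computes the water volume implied by the stated reaction components. The variable names and the example results are hypothetical. The rule for choosing among mixes (highest ΔRn first, then lowest Cq) is an assumption, since the text does not state how the two criteria were weighed against each other.

```python
from itertools import product

PRIMER_LEVELS_NM = (100, 300, 600, 900)
PROBE_NM = 250  # probe concentration in the final reaction

# Every forward x reverse combination: 4 x 4 = 16 assay mixes.
assay_mixes = [
    {"forward_nM": fwd, "reverse_nM": rev, "probe_nM": PROBE_NM}
    for fwd, rev in product(PRIMER_LEVELS_NM, repeat=2)
]

# Per-well volumes (microliters) as stated in the text; water makes up the rest.
REACTION_UL = 15.0
components_ul = {"master_mix_2x": 7.5, "target_dna": 4.0, "assay_mix": 0.75}
water_ul = REACTION_UL - sum(components_ul.values())


def pick_optimal(results):
    """Pick one mix from a list of dicts with 'delta_rn' and 'cq' keys.

    Assumed ranking: greatest delta_rn first, ties broken by lowest cq.
    """
    return max(results, key=lambda r: (r["delta_rn"], -r["cq"]))


# Hypothetical results for three of the 16 mixes.
example_results = [
    {"forward_nM": 900, "reverse_nM": 900, "delta_rn": 2.1, "cq": 24.8},
    {"forward_nM": 600, "reverse_nM": 900, "delta_rn": 2.1, "cq": 24.5},
    {"forward_nM": 100, "reverse_nM": 300, "delta_rn": 1.2, "cq": 26.0},
]
print(len(assay_mixes), water_ul, pick_optimal(example_results))
```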

Finally, we estimated each assay's limit of detection (LOD; the lowest copy number an assay can detect in 95% of replicates) and limit of quantification (LOQ; the lowest copy number an assay can quantify with a coefficient of variation below 35%) using the curve-fitting method of Klymus et al. (2020). Standard curves were created using five-fold serial dilutions from either 31,250 to 2 copies per reaction or 15,625 to 1 copy per reaction. Six replicates of each dilution level and negative controls were run using the same reaction chemistry and thermal cycling profile as above. When all or nearly all replicates of all dilution levels were amplified for an assay, the curve-fitting method could not be used to determine LOD, and we instead used the discrete threshold method, which limits estimates to the particular standard curve dilutions tested. For simplicity, we report only discrete threshold estimates in Appendix S1: Assay Documentation. See Appendix S2: Standard Curve Parameters for modeled estimates and raw data.
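
The standard-curve design and the discrete threshold LOD can be sketched as follows. This Python example is a simplified illustration, not the code of Klymus et al. (2020): the function names and detection counts are hypothetical, and requiring every higher standard to also pass the 95% criterion is an assumption made here for robustness.

```python
def dilution_series(top_copies, fold=5, levels=7):
    """Copies per reaction for a serial dilution starting at top_copies."""
    return [top_copies // fold**i for i in range(levels)]


def discrete_lod(detections, min_rate=0.95):
    """Lowest standard with a detection rate >= min_rate.

    detections maps copies per reaction to (n_amplified, n_replicates).
    Assumption: the search walks down from the highest standard and stops
    at the first standard that fails, so the LOD lies in an unbroken run.
    """
    lod = None
    for copies in sorted(detections, reverse=True):
        amplified, total = detections[copies]
        if amplified / total >= min_rate:
            lod = copies
        else:
            break
    return lod


print(dilution_series(31250))  # 31,250 down to 2 copies per reaction
print(dilution_series(15625))  # 15,625 down to 1 copy per reaction

# Hypothetical counts with six replicates per standard.
# With six replicates, a 95% rate is met only when all six amplify.
example = {31250: (6, 6), 6250: (6, 6), 1250: (6, 6), 250: (6, 6),
           50: (6, 6), 10: (6, 6), 2: (4, 6)}
print(discrete_lod(example))
```

This also shows why the discrete estimate is limited to the dilutions actually tested: the returned value can only be one of the standards in the series.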

2.3 In silico testing

We tested assays in silico using the eDNAssay classifier and sequences from as many target individuals as possible and at least one sequence from each nontarget taxon represented on GenBank (Figure 2; Appendix S3: GenBank Accessions). If there were many nontargets for a given assay or there was an overabundance of sequence data for a given nontarget, the number of sequences evaluated was truncated (typically 5–10 sequences) for expediency and to facilitate sequence alignment, curation, and error checking. Sequences, primers, and probes were aligned as described above and input into eDNAssay to produce class assignment probabilities for each template. The default threshold for assigning a template to the “amplify” class is 0.5; assignment probabilities greater than 0.5 predict amplification, and those less than 0.5 predict nonamplification. While this threshold produces accurate classifications, it can be lowered to safeguard against costly false negative model errors (i.e., when a template is predicted to not cross-amplify but it does). We utilized the more conservative threshold of 0.3 as recommended by Kronenberger et al. (2022). If the number of sequences was truncated for initial assessment (as described above) and the nontarget taxon had an assignment probability ≥0.3, it was reassessed using additional sequences (typically all available) to more fully evaluate cross-amplification risk.

Cross-amplification is very unlikely for nontarget taxa with maximum assignment probabilities <0.3 and very likely for nontargets with minimum assignment probabilities ≥0.7 (Kronenberger et al., 2022), so these were not necessarily tested in vitro. Nontargets with assignment probabilities between 0.3 and 0.7—unless they were of particularly low specificity concern (i.e., also a nonindigenous invasive species, potentially extirpated, or highly endemic)—were tested in vitro to confirm model predictions. In certain cases, a small subset of target sequences had minimum assignment probabilities <0.7 (low-to-moderate likelihood of amplifying), but the assay was nonetheless deemed valid because these sequences were rare (99 of 4670 target sequences assessed [2%]) and could have arisen from hybridization, species misidentification, or technical issues leading to erroneous sequences on GenBank.
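
The decision logic described above can be summarized in a few lines. The Python sketch below does not call eDNAssay; it only triages assignment probabilities that are assumed to have been produced already. The function name, the return labels, and the `low_concern` flag (standing in for taxa that are themselves invasive, potentially extirpated, or highly endemic) are hypothetical, and sending every intermediate case to in vitro testing is a simplification of the text.

```python
LOW_THRESHOLD = 0.3   # conservative threshold used for nontarget taxa
HIGH_THRESHOLD = 0.7  # at or above this, amplification is very likely


def triage_nontarget(probabilities, low_concern=False):
    """Classify one nontarget taxon from its per-sequence assignment probabilities."""
    if max(probabilities) < LOW_THRESHOLD:
        return "cross-amplification very unlikely"
    if min(probabilities) >= HIGH_THRESHOLD:
        return "cross-amplification very likely"
    if low_concern:
        return "intermediate, low specificity concern"
    return "test in vitro"


# Hypothetical probabilities for three nontarget taxa.
print(triage_nontarget([0.02, 0.11, 0.25]))
print(triage_nontarget([0.81, 0.93]))
print(triage_nontarget([0.12, 0.44]))
```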

Nontarget taxa without sequence data were tested in vitro whenever possible. Some assays had a substantial number of nontargets for which we lacked both sequences and tissues. To help understand the risk of cross-amplifying these untested nontargets, when there were >10 confamilials with sequence data, we fitted a beta probability density function to the maximum eDNAssay assignment probabilities. We then used the fitted function to predict the probability that an untested confamilial (i.e., one of the nontargets without available sequences) would exceed eDNAssay assignment probability thresholds of 0.3 and 0.5, as described in Wilcox et al. (2024).
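
To make the beta-distribution step concrete, the Python sketch below fits a beta distribution to a set of maximum assignment probabilities and estimates the chance that an untested confamilial exceeds a threshold. It is a stand-in under stated assumptions, not the procedure of Wilcox et al. (2024): the fit uses the method of moments, the tail probability is obtained by simple numerical integration, and the input values are hypothetical.

```python
import math
from statistics import mean, variance


def fit_beta_moments(values):
    """Method-of-moments estimates of the beta shape parameters (a, b)."""
    m, v = mean(values), variance(values)
    if not 0 < v < m * (1 - m):
        raise ValueError("sample moments are incompatible with a beta distribution")
    common = m * (1 - m) / v - 1
    return m * common, (1 - m) * common


def beta_tail(a, b, threshold, steps=20000):
    """P(X >= threshold) for X ~ Beta(a, b), by midpoint-rule integration."""
    log_norm = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    width = (1 - threshold) / steps
    total = 0.0
    for i in range(steps):
        x = threshold + (i + 0.5) * width  # midpoint, strictly inside (0, 1)
        total += math.exp((a - 1) * math.log(x)
                          + (b - 1) * math.log(1 - x) - log_norm)
    return total * width


# Hypothetical maximum assignment probabilities for 11 sequenced confamilials.
max_probs = [0.05, 0.10, 0.15, 0.20, 0.10, 0.05, 0.30, 0.12, 0.08, 0.25, 0.18]
a, b = fit_beta_moments(max_probs)
p_exceed_03 = beta_tail(a, b, 0.3)
p_exceed_05 = beta_tail(a, b, 0.5)
print(f"a={a:.2f}, b={b:.2f}, P(>=0.3)={p_exceed_03:.4f}, P(>=0.5)={p_exceed_05:.4f}")
```

A maximum-likelihood fit from a statistics library would be a reasonable alternative; the method of moments is used here only to keep the sketch dependency-free.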

The eDNAssay classifier is limited to evaluating sequences input by the user. Therefore, we identified more distantly related taxa that primers may amplify via Primer-BLAST, using the nr database with default settings. Sequences from any taxa that were flagged by Primer-BLAST and could potentially occur within the continental United States were input into eDNAssay as described above. Any Primer-BLAST hits with eDNAssay assignment probabilities ≥0.3 were considered to potentially reduce assay specificity and evaluated like confamilial nontarget taxa.

2.4 In vitro testing

We sought to perform in vitro testing on genomic or synthetic DNA from multiple target individuals and at least one representative of each nontarget taxon that could not be tested in silico or that in silico testing identified as potentially cross-amplifying (Figure 2). Nontarget haplotypes with the highest assignment probabilities were tested whenever possible. We prioritized in vitro testing of nontargets with maximum eDNAssay assignment probabilities ≥0.3 but performed broader testing of templates with known sequences to more fully evaluate model performance (Appendix S4: eDNAssay Performance). Nontargets that could not be tested in silico or in vitro were treated as described above—either by gleaning specificity information from sequenced confamilials (sensu Wilcox et al., 2024) or by adding caveats to the assay application. In cases where multiple species had identical sequences where oligonucleotides bind, all were considered to have been tested in vitro as long as one template of a known sequence was tested.

Genomic DNA was extracted from tissues using DNeasy® Blood and Tissue Kits (QIAGEN, Hilden, Germany) and diluted to 0.1 ng/μL, as determined using a Qubit 2.0 Fluorometer (Thermo Fisher Scientific). Tissue samples either already existed in-house or were obtained from other labs or museum collections (see Contributing Partners in Appendix S1: Assay Documentation for full acknowledgements). When tissue samples were difficult to obtain but sequences were available, we tested synthetic DNA (i.e., gBlocks), quantified using a Qubit 2.0 fluorometer or NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific), and diluted to 5000 copies/reaction. Reactions were run in singlicate, included negative and positive controls, and followed the qPCR protocol described above. A template was considered to have been amplified if there was an unambiguous amplification curve, according to expert opinion, regardless of the software-assigned threshold. Any curve that was particularly depressed (with a low ΔRn) was noted to aid in the interpretation of questionable in situ results.

2.5 In situ testing

Assay performance was validated in situ using environmental samples from areas with known or suspected presence and absence of the target taxon. If a previously published assay was already tested in situ, we typically did not perform additional testing. In most cases, environmental samples were collected via the filtration method of Carim, McKelvey, et al. (2016) and either archived from other sampling initiatives or collected expressly for assay validation. Several assays have been extensively tested during routine field usage, and results can be found on the eDNAtlas (Young et al., 2018). These assays did not necessarily undergo formal in situ testing if detections occurred only in locations with reasonable likelihoods of target taxon occurrence (Appendix S1: Assay Documentation).

For new and previously published but modified assays, eDNA was extracted from filters following Franklin et al. (2019) and run in triplicate along with negative and positive controls using the qPCR protocol described above, but replacing 1.8 μL of water with TaqMan Exogenous Internal Positive Control reagents (IPC; Life Technologies; 1.5 μL of 10× IPC assay and 0.3 μL of 50× IPC DNA per reaction). If a sample did not amplify and its IPC was delayed ≥1 C_q from the negative control IPC, the reaction was considered inhibited and the extract was treated using a OneStep™ PCR Inhibitor Removal Kit (Zymo Research) and reanalyzed. A sample was considered positive if an unambiguous amplification curve was produced in at least one well. All eDNA sample handling and analysis was conducted in dedicated laboratory facilities in accordance with the standard operating procedures used by the eDNA program at the National Genomics Center for Wildlife and Fish Conservation.
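
The interpretation rules for environmental samples can be expressed as a short decision function. The Python sketch below is illustrative: the function name and return labels are hypothetical, and it assumes a single summary IPC Cq value per sample and per negative control, which the text does not specify.

```python
def interpret_sample(wells_amplified, sample_ipc_cq, control_ipc_cq, max_delay=1.0):
    """Interpret one eDNA sample run in triplicate with an internal positive control.

    wells_amplified: one boolean per replicate well (True = unambiguous curve).
    sample_ipc_cq / control_ipc_cq: assumed summary IPC Cq values for the sample
    and for the negative control.
    """
    # Positive if at least one well produced an unambiguous amplification curve.
    if any(wells_amplified):
        return "positive"
    # No amplification and IPC delayed by >= max_delay Cq: treat as inhibited.
    if sample_ipc_cq - control_ipc_cq >= max_delay:
        return "inhibited: treat extract and reanalyze"
    return "negative"


# IPC reagents replace the same volume of water (microliters per reaction).
ipc_volume_ul = 1.5 + 0.3

# Hypothetical Cq values.
print(interpret_sample([False, True, False], 28.4, 28.1))
print(interpret_sample([False, False, False], 29.6, 28.1))
print(interpret_sample([False, False, False], 28.3, 28.1))
```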

3 RESULTS

3.1 Trends in sensitivity

All 46 probe-based qPCR assays (Table 1; Appendix S1: Assay Documentation) have high sensitivity. Standard curves demonstrated low LOD values using both methods described by Klymus et al. (2020): discrete threshold (mean = 5, range = 1–10 copies per reaction) and curve-fitting (mean = 3, range = 2–6 copies per reaction), although the curve-fitting method could not be used for 22 assays because all or nearly all replicates of the lowest standard were amplified (Appendix S2: Standard Curve Parameters). Estimates of LOQ were likewise low (discrete threshold mean = 22, range = 1–250; curve-fitting mean = 16, range = 1–190).

3.2 Trends in generality

We assessed generality in silico for 4670 target sequences (mean = 102, range = 3–625 in silico tests of generality per assay; Figure 3b). Generality was high for all assays, as evidenced by high minimum eDNAssay assignment probabilities (≥0.7) in all but several cases involving uncommon or outlier sequences (n = 34 haplotypes across 13 assays; Figure 3b). Along with the developers of previously published assays, we tested 741 target DNA templates in vitro (mean = 16, range = 1–212 in vitro tests of generality per assay). In vitro testing of target DNA produced strong amplifications, except for 2 of the 31 templates tested with one of the two virile crayfish assays (virile crayfish B), which had unusually low ΔRn values. Both samples were from Wyoming (Green River and Pathfinder Reservoir), and the remaining 19 samples from Wyoming amplified strongly.

Along with the developers of previously published assays, we confirmed amplification in situ of environmental samples from areas with known presence of the target taxon for 42 assays (Appendix S1: Assay Documentation). However, for three of these assays (marbled crayfish, sea lamprey, and feral swine), we did not have access to field samples and instead tested samples from aquaria or captive enclosures containing the target species. We were unable to confirm amplification in situ for four assays (northern/blotched snakehead, Wami tilapia, crucian carp, and golden mussel) because we lacked environmental samples from areas with known presence.

3.3 Trends in specificity

Our search for nontarget taxa, defined as all sympatric confamilials and any sympatric extrafamilials that were flagged by Primer-BLAST and had moderate-to-high cross-amplification risk (eDNAssay assignment probabilities ≥0.3), identified 0–405 nontarget taxa per assay (Figure 3a). Few of the extrafamilial taxa flagged by Primer-BLAST were both sympatric and had moderate-to-high cross-amplification risk: zero for 37 assays, ≤3 for eight assays (one of the two virile crayfish assays [virile crayfish A], opossum shrimp, smallmouth/spotted bass, snakehead, silver/bighead carp, black carp, sea lamprey, and feral swine), and 10 for one assay (Chinese mystery snail).
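
The nontarget definition used above amounts to a simple filter, sketched here in Python. The class, field names, and example taxa are hypothetical and exist only to illustrate the rule; in particular, treating an extrafamilial taxon with no available assignment probability as not qualifying is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional

# Moderate-to-high cross-amplification risk cutoff (from the text).
RISK_THRESHOLD = 0.3


@dataclass
class CandidateTaxon:
    name: str
    confamilial: bool               # same family as the target taxon
    sympatric: bool                 # co-occurs with the target
    flagged_by_primer_blast: bool = False
    assignment_probability: Optional[float] = None  # eDNAssay output, if any


def is_nontarget(taxon: CandidateTaxon) -> bool:
    """Apply the nontarget definition to one candidate taxon."""
    if not taxon.sympatric:
        return False
    if taxon.confamilial:
        # All sympatric confamilials count, whatever their predicted risk.
        return True
    # Sympatric extrafamilials count only if flagged by Primer-BLAST
    # and predicted to have moderate-to-high cross-amplification risk.
    return (
        taxon.flagged_by_primer_blast
        and taxon.assignment_probability is not None
        and taxon.assignment_probability >= RISK_THRESHOLD
    )


# Made-up candidates for illustration.
candidates = [
    CandidateTaxon("confamilial A", confamilial=True, sympatric=True),
    CandidateTaxon("confamilial B", confamilial=True, sympatric=False),
    CandidateTaxon("extrafamilial C", False, True, True, 0.41),
    CandidateTaxon("extrafamilial D", False, True, True, 0.12),
]
print([t.name for t in candidates if is_nontarget(t)])
```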

Across assays, there were 1094 nontarget taxa, yielding 5276 unique assay-nontarget taxon combinations. Of these, we tested 4206 (80%), including 3234 (61%) in silico only, and, along with the developers of previously published assays, 972 (18%) in vitro (Figure 4). Paired in silico and in vitro testing was accomplished for 916 assay-nontarget combinations. Specificity was high, with typically low nontarget eDNAssay assignment probabilities (mean of maximum assignment probabilities, 0.16 ± 0.10 SD; Figure 3b). Of 4150 assay-nontarget combinations tested in silico, 292 (7%) had moderate-to-high cross-amplification risk (eDNAssay assignment probabilities ≥0.3; Figure 3b). We tested 236 (81%) in vitro and, of these, 45 (19%) cross-amplified. Fifty-six assay-nontarget combinations were only tested in vitro because we lacked sequence data, and two (4%) cross-amplified. Cross-amplifications were rare overall, observed in 47 (1%) of the 4206 tested assay-nontarget combinations, distributed among 15 assays (Figure 3b,c). Cross-amplifying nontarget taxa were widespread (affecting most USFWS regions) for the American bullfrog, one of two virile crayfish assays (virile crayfish A), tilapia, and crucian carp assays (Figure 3c). The four tilapia assays primarily cross-amplified one another's target species, and the crucian carp assay only cross-amplified goldfish and Prussian carp, species that are also nonindigenous in the United States and may also be of interest to wildlife managers. Many cross-amplifications were “slight” (i.e., with severely depressed ΔRn values), as indicated in Appendix S1: Assay Documentation, and therefore are unlikely to be incorrectly deemed detections in practice.
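
Because the counts in the preceding paragraph partition the same set of assay-nontarget combinations in several ways, a short Python check may help readers see how they fit together. All numbers are taken from the text above; the variable names and the rounding helper are illustrative.

```python
# Counts reported in the text.
total_combinations = 5276
in_silico_only = 3234
in_vitro_total = 972
paired = 916            # tested both in silico and in vitro
high_risk = 292         # in silico probability >= 0.3
high_risk_in_vitro = 236
high_risk_cross_amplified = 45
in_vitro_only_cross_amplified = 2


def pct(part: int, whole: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


tested = in_silico_only + in_vitro_total          # every tested combination
in_silico_total = in_silico_only + paired         # anything with a prediction
in_vitro_only = in_vitro_total - paired           # no sequence data available
cross_amplified = high_risk_cross_amplified + in_vitro_only_cross_amplified

print(tested, pct(tested, total_combinations))
print(in_silico_total, pct(high_risk, in_silico_total))
print(in_vitro_only, pct(in_vitro_only_cross_amplified, in_vitro_only))
print(cross_amplified, pct(cross_amplified, tested))
```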

We focused our in vitro testing efforts on nontarget taxa with moderate-to-high cross-amplification risk (eDNAssay assignment probabilities ≥0.3), but also tested nontargets with low cross-amplification risk (eDNAssay assignment probabilities <0.3) when practical to evaluate performance of the eDNAssay classifier. We only used synthetic gene fragments for this to eliminate sequence uncertainty. In total, we tested 649 unique assay-template combinations (Figure 5; Appendix S4: eDNAssay Performance). In silico predictions from the eDNAssay classifier were 96% accurate, similar to the accuracy reported by Kronenberger et al. (2022). Because Kronenberger et al. (2022) tuned their model to maximize recall and reduce costly false negative errors—that is, cross-amplification of a template predicted to not cross-amplify—accuracy was much higher for predictions of assay specificity (the true negative rate; 611 of 626 correct; 98%) than for predictions of cross-amplification (the true positive rate; 12 of 23 correct; 52%). Importantly, these results were produced using the default classification threshold of 0.5, but our cutoff for in vitro testing was 0.3; under this more conservative threshold, predictions of specificity were 100% accurate.
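
The effect of the classification threshold described above can be illustrated with a small Python sketch. The function and the six example templates are hypothetical, and classifying a template as cross-amplifying when its probability is at or above the threshold is an assumption of this sketch. The final lines recompute the reported percentages from the counts given in the text.

```python
def score_predictions(probabilities, cross_amplified, threshold=0.5):
    """Count correct predictions of specificity and of cross-amplification.

    probabilities: eDNAssay-style assignment probabilities, one per template.
    cross_amplified: observed in vitro outcome for each template (bool).
    Returns {"specific": (correct, total), "cross_amplifying": (correct, total)}.
    """
    spec_correct = spec_total = amp_correct = amp_total = 0
    for prob, observed in zip(probabilities, cross_amplified):
        if prob >= threshold:
            # Predicted to cross-amplify; correct if it did.
            amp_total += 1
            amp_correct += int(observed)
        else:
            # Predicted to be specific; correct if it did not amplify.
            spec_total += 1
            spec_correct += int(not observed)
    return {
        "specific": (spec_correct, spec_total),
        "cross_amplifying": (amp_correct, amp_total),
    }


# Made-up example: one template at 0.42 cross-amplifies in vitro.
probs = [0.05, 0.12, 0.35, 0.42, 0.55, 0.81]
observed = [False, False, False, True, False, True]
print(score_predictions(probs, observed, threshold=0.5))
print(score_predictions(probs, observed, threshold=0.3))

# Percentages recomputed from the counts reported in the text.
overall = round(100 * (611 + 12) / (626 + 23))
print(overall, round(100 * 611 / 626), round(100 * 12 / 23))
```

In the made-up example, lowering the threshold from 0.5 to 0.3 moves the one missed cross-amplification out of the "predicted specific" group, at the cost of more templates needing in vitro follow-up, which mirrors the trade-off described in the text.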

We identified 1173 “potentially problematic” assay-nontarget combinations—those that either were not tested in silico or in vitro (n = 1070; 91%), had eDNAssay assignment probabilities ≥0.3 but were not tested in vitro (n = 66; 6%; mean of maximum assignment probabilities, 0.45 ± 0.16 SD), or cross-amplified in vitro (n = 37; 3%; Figures 3 and 4c). In some cases, we decided not to test nontarget taxa with moderate-to-high assignment probabilities in vitro if they were highly endemic or rare (as detailed in Appendix S1: Assay Documentation) and therefore any specificity issues would be uncommon. Of the 1173 potentially problematic assay-nontarget combinations, 849 (72%) were associated with the seven crayfish assays and 143 (12%) with the six carp assays. There were typically few potentially problematic nontarget taxa per assay per USFWS region (Figure 3c); of 368 total assay-USFWS region combinations, the number of potentially problematic nontarget taxa was zero for 179 (49%), ≤2 for 279 (76%) and ≤10 for 342 (93%). A notable exception is the crayfish assays when applied in Region 4, where many nontargets are potentially problematic (n = 93–116 per assay; Figure 3c) due to very high biodiversity and a lack of sequence information. Excluding the crayfish assays in Region 4, there were 0–32 potentially problematic nontarget taxa per assay per USFWS region.

As described above, a large majority of potentially problematic assay-nontarget combinations (91%) were designated as such because they were not tested in silico or in vitro. Testing was not possible in these cases because we lacked both sequences and DNA. Therefore, for assays with >10 confamilials tested in silico (n = 33), we used available eDNAssay assignment probabilities to estimate the cross-amplification risk posed by confamilials that were not tested. The cross-amplification risk was generally low. If we fit a probability density function for each assay using the beta distribution (sensu Wilcox et al., 2024), the likelihood of an untested confamilial exceeding the 0.3 conservative classification threshold is ≤27% in all cases except for smallmouth/spotted bass (44% chance) and the four tilapia assays (35–55% chance). The likelihood of an untested confamilial exceeding the 0.5 default classification threshold is ≤1% in all cases except one of the two virile crayfish assays (virile crayfish A; 4% chance) and the four tilapia assays (5–22% chance). However, high likelihoods for the tilapia assays stem from congeneric species that sometimes hybridize with the targets and are also nonindigenous. With congeners removed, the likelihood of an untested confamilial exceeding the 0.3 threshold is ≤37%, and the likelihood of exceeding the 0.5 threshold is ~0%. Furthermore, along with the developers of previously published assays, we confirmed nonamplification in situ of environmental samples from areas with suspected absence of the target taxon for all assays.
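
The general idea of estimating exceedance probabilities from a fitted beta distribution can be sketched in Python as follows. This is not the authors' procedure: the fitting method used here (method of moments) is a stand-in chosen for simplicity, the tail probability is approximated by midpoint-rule integration, and the assignment probabilities are made-up values for a hypothetical assay with 12 tested confamilials.

```python
import math
from statistics import mean, variance


def fit_beta_moments(values):
    """Method-of-moments estimates (alpha, beta) for data in (0, 1)."""
    m = mean(values)
    v = variance(values)  # sample variance
    if not 0 < m < 1 or v <= 0 or v >= m * (1 - m):
        raise ValueError("method-of-moments fit is undefined for these values")
    common = m * (1 - m) / v - 1
    return m * common, (1 - m) * common


def beta_tail_probability(alpha, beta, threshold, steps=20000):
    """Approximate P(X >= threshold) for X ~ Beta(alpha, beta).

    Uses the midpoint rule on [threshold, 1], which never evaluates the
    density at the endpoints.
    """
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    width = (1.0 - threshold) / steps
    total = 0.0
    for i in range(steps):
        x = threshold + (i + 0.5) * width
        log_pdf = log_norm + (alpha - 1) * math.log(x) + (beta - 1) * math.log1p(-x)
        total += math.exp(log_pdf)
    return total * width


# Hypothetical assignment probabilities for tested confamilials.
tested = [0.04, 0.07, 0.09, 0.11, 0.12, 0.14, 0.16, 0.19, 0.23, 0.27, 0.31, 0.38]
a, b = fit_beta_moments(tested)
print("P(untested confamilial >= 0.3):", beta_tail_probability(a, b, 0.3))
print("P(untested confamilial >= 0.5):", beta_tail_probability(a, b, 0.5))
```

A maximum-likelihood fit from a statistics library would be a reasonable alternative to the method-of-moments step; the tail-probability calculation would be unchanged.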

3.4 Assay-specific reporting

To enable efficient use of the 46 assays, descriptions in Appendix S1: Assay Documentation follow a structured format that provides detailed information on (1) oligonucleotide sequences, optimal primer concentrations, and LOD; (2) target taxonomy and relevant biology; (3) occurrence information for all target and nontarget taxa within each USFWS region; (4) the results of all in silico, in vitro, and in situ testing; and (5) special use considerations given the identity and geographic distribution of untested or cross-amplifying nontarget taxa. This structured format is meant to clearly articulate the extent of assay validation and facilitate future updates as additional sequence data and DNA templates become available. See How to Use This Document in Appendix S1: Assay Documentation for a hypothetical example and Troubleshooting for more information about dealing with sequence and sample missingness and advice for interpreting questionable amplifications. State-level nontarget specificity information can be found at https://nationalgenomicscenter.shinyapps.io/state-level-specificity.

4 DISCUSSION

For targeted species detection at national, continental, or global scales, eDNA assays must be taxonomically specific at those scales. We demonstrated an enhanced in silico, geographically subdivided validation framework that can increase assay reliability over broad areas and reduce costs. When validating 46 probe-based qPCR assays targeting invasive species throughout the continental United States, we estimate that our ability to accurately predict cross-amplification in silico, rather than relying exclusively on in vitro testing, saved hundreds of thousands of dollars and years of development time. For example, we validated seven assays targeting crayfishes in family Cambaridae. At the time of writing, we estimate this family has 403 species in the continental United States, many of which are legally protected, highly endemic, or newly described (Crandall & De Grave, 2017; Taylor et al., 2007). Obtaining tissues or synthetic DNA for all or even most of these species would be enormously resource- and time-intensive and would not be feasible with the budgets and timeframes available for our program. However, with perfect accuracy of the eDNAssay classifier (Kronenberger et al., 2022) below a conservative classification threshold of 0.3 (Figure 3), we could responsibly forgo in vitro testing of an average of 266 crayfish species per assay (66%). Savings were even greater for our six carp assays due to lower sequence missingness; of 295 species in the family Cyprinidae (including those recategorized into Leuciscidae by Tan & Armbruster, 2018), we no longer needed to test an average of 260 (88%). However, we still tested many of these species to evaluate model performance, so reductions in in vitro testing are likely to be even more pronounced during routine use than they were in this study.

Levels of validation and specificity were high overall (Figure 3; Appendix S1: Assay Documentation). Validation and specificity were complete for most assays targeting species with few or no confamilials in the continental United States (i.e., snakehead, pond loach, northern pike, nutria, feral swine, Asian clam, quagga and zebra mussels, golden mussel, New Zealand mud snail, and Burmese python). For assays targeting biodiverse taxonomic groups (e.g., crayfishes and carps), communicating the levels of validation and specificity independently for each USFWS region (Figure 3c; Appendix S1: Assay Documentation) is meant to facilitate responsible use. For example, 258 (65%) of the 403 crayfish species assigned to family Cambaridae in the continental United States occur exclusively within Region 4 in the southeastern portion of the country. Levels of validation and specificity are lower for these assays in this region as a result (Figure 3c) and their use should proceed with the utmost caution. However, the crayfish assays come with few or no caveats in regions with relatively low family representation (particularly Regions 1, 6, 7, and 8) and may be used with little or no additional testing. Appendix S1: Assay Documentation articulates special use considerations for each assay, and we recommend referring to it as the first step in deciding whether an assay is appropriate for your particular project.

We observed geographically widespread cross-amplification from the American bullfrog, one of two virile crayfish assays (virile crayfish A), tilapia, and crucian carp assays (Figure 3c). Regardless, we believe nationwide use of these assays is feasible with consideration of performance details (Appendix S1). For example, the four congeners cross-amplified by the American bullfrog assay are found exclusively or primarily within the target's native range and therefore would pose few issues in invaded areas where the assay is likely to be applied. We dealt with the low specificity of one virile crayfish assay (virile crayfish A) by developing a second virile crayfish assay (virile crayfish B), which may have lower generality but greater specificity. When used in conjunction (e.g., initially running samples with Assay A and confirming detections with Assay B), the assays are expected to detect all target haplotypes without generating false positive inferences due to cross-amplification of nontarget taxa. For the tilapia and crucian carp assays, cross-amplifying species were almost exclusively congeners that are likewise nonindigenous and may therefore also be of interest to wildlife managers. Species-level detections can be corroborated in these cases by comparing the morphology of sample amplification curves to that of positive controls, either qualitatively (e.g., “by eye”) or quantitatively (e.g., Patrone et al., 2020). Finally, two assays had apparently high specificity but notably limited testing: the big-eared radix and Chinese mystery snail (Figure 3a). However, fitting a probability density function for these assays (sensu Wilcox et al., 2024) indicates that the eDNAssay assignment probability of an untested confamilial has a very low chance (≤6%) of being ≥0.3 and a near-zero chance of being ≥0.5 (Appendix S1: Assay Documentation). We still recommend caution, though, given the possibility of untested outliers with unusual genetic similarity to targets where oligonucleotides bind. Particular caution is warranted for the Chinese mystery snail assay because we identified 10 insect species via Primer-BLAST that are both sympatric and had eDNAssay assignment probabilities ≥0.3. This suggests the assay targets a gene region that is relatively conserved for insects, a taxonomic group that is both speciose and incompletely described. Users might consider Sanger sequencing of questionable amplicons when definitive confirmation is warranted.

Our use of in silico testing in this study is not unique; assay development has long incorporated in silico PCR models rooted in base-pair mismatch characteristics and thermodynamic estimates of binding affinity (Noguera et al., 2014). What is unique is the extent to which we were able to rely upon in silico predictions as tests sensu stricto, not merely as prerequisites for in vitro testing. This was made possible through machine learning, a statistical method able to attain high predictive accuracy by associating sequence data with empirical results to distill complex relationships among suites of features. However, machine-learning-based PCR models come with an important caveat: performance may be lower if the reaction conditions used to generate training data (e.g., reagent concentrations, probe moiety, cycle number) are not mimicked during testing. The extent to which this is an issue is unknown and deserves investigation. That said, many of the assays we used to test eDNAssay have different parameters than the assays used by Kronenberger et al. (2022) to train eDNAssay, including degenerate bases, sense-strand-binding probes, and a wider range of melting temperatures (Appendix S4: eDNAssay Performance), and accuracy remained high. Also, uncertainty around the universality of in silico testing results is not an entirely foreign problem; the reproducibility of in vitro tests is also affected by reaction conditions (Stadhouders et al., 2010), and yet, in our experience, assays are rarely revalidated for use with new protocols. There seems to be an underlying assumption that any variation stemming from reaction conditions is probably inconsequential, perhaps due to safeguards against false positive detections (e.g., fluorescence thresholds). Similar safeguards exist against false negative model errors (e.g., lowered classification thresholds). We recommend liberal use of safeguards with eDNAssay or any machine-learning-based PCR model, particularly when predicting specificity under novel reaction conditions. We also remind readers that, whatever the in silico prediction, in vitro testing remains the “gold standard” for demonstrating assay specificity against a particular template under particular reaction conditions. It continues to be a valuable component of assay validation even if not strictly required under an enhanced in silico testing framework. The decision of how much in vitro testing to complete may be directed by in silico predictions, but also depends on funding, development timelines, and tolerance for false positive eDNA detections, all of which are project-dependent.

Careful attention to the details of an assay's validation is vital for ensuring its readiness (Langlois et al., 2021; Thalinger et al., 2021). When the scale of assay application is geographically large, we argue that one detail is particularly consequential: the location for which an assay was initially validated. Specificity is not guaranteed in different locations. It is the responsibility of assay developers to clearly communicate the testing done and for users to conduct further testing as needed. Biogeography is a critical consideration when ensuring and communicating assay specificity over broad areas.

AUTHOR CONTRIBUTIONS

JAK: methodology, data acquisition, data analysis, results interpretation, writing (original draft), and writing (reviewing and editing). TMW: study conception, funding acquisition, methodology, results interpretation, and writing (reviewing and editing). MKY: methodology, results interpretation, and writing (reviewing and editing). DHM: methodology, data acquisition, data analysis, results interpretation, and writing (reviewing and editing). TWF: methodology, results interpretation, and writing (reviewing and editing). MKS: methodology, results interpretation, and writing (reviewing and editing).

ACKNOWLEDGMENTS

This work was funded by the U.S. Forest Service's Rocky Mountain Research Station and the Department of Defense's Environmental Security Technology Certification Program (ESTCP RC21-5121) and was only possible due to the generous contribution of samples and expertise from dozens of professionals, agencies, and organizations. Caleb Dysthe and Amanda Mast (NGC) developed some of the assays, and Michael Sleeting (NGC) helped with laboratory analysis. Chris Merkes (USGS Upper Midwest Environmental Sciences Center) helped to interpret the modeled LOD and LOQ results. Influential feedback on our methods was provided by Cathy Richter (USGS Columbia Environmental Research Center) and Eric Larson (University of Illinois). See Contributing Partners in Appendix S1: Assay Documentation for full acknowledgments.

CONFLICT OF INTEREST STATEMENT

We have no conflict of interest to report.

Open Research

DATA AVAILABILITY STATEMENT

Assay-specific background, testing, and use information are available in Appendix S1: Assay Documentation; assay efficiency and LOD and LOQ estimates are available in Appendix S2: Standard Curve Parameters; accession numbers for GenBank sequences tested in silico are available in Appendix S3: GenBank Accessions; data used to test the performance of the eDNAssay classifier are available in Appendix S4: eDNAssay Performance; and state-level occurrence specificity information for each assay is available at https://nationalgenomicscenter.shinyapps.io/state-level-specificity.

REFERENCES

Akamatsu, Y., Goto, M., Inui, R., Yamanaka, H., Komuro, T., & Kono, Y. (2018). Monitoring of Myocastor coypus in Yamaguchi prefecture. Ecology and Civil Engineering, 21(1), 1–8. https://doi.org/10.3825/ece.21.1
Carim, K. J., Christianson, K. R., McKelvey, K. M., Pate, W. M., Silver, D. B., Johnson, B. M., Galloway, B. T., Young, M. K., & Schwartz, M. K. (2016). Environmental DNA marker development with sparse biological information: A case study on opossum shrimp (Mysis diluviana). PLoS One, 11(8), e0161664. https://doi.org/10.1371/journal.pone.0161664
Carim, K. J., Dysthe, J. C., McLellan, H., Young, M. K., McKelvey, K. S., & Schwartz, M. K. (2019). Using environmental DNA sampling to monitor the invasion of nonnative Esox lucius (northern pike) in the Columbia River basin, USA. Environmental DNA, 1(3), 215–226. https://doi.org/10.1002/edn3.22
Carim, K. J., McKelvey, K. S., Young, M. K., Wilcox, T. M., & Schwartz, M. K. (2016). A protocol for collecting environmental DNA samples from streams. General Technical Report RMRS-GTR-355. U.S. Department of Agriculture, Forest Service, Rocky Mountain Research Station. https://doi.org/10.2737/RMRS-GTR-355
Carim, K. J., Wilcox, T. M., Anderson, M., Lawrence, D. J., Young, M. K., McKelvey, K. S., & Schwartz, M. K. (2016). An environmental DNA marker for detecting nonnative brown trout (Salmo trutta). Conservation Genetics Resources, 8(3), 259–261. https://doi.org/10.1007/s12686-016-0548-5
Chase, D. M., Kuehne, L. M., Olden, J. D., & Ostberg, C. O. (2020). Development of a quantitative PCR assay for detecting Egeria densa in environmental DNA samples. Conservation Genetics Resources, 12, 545–548. https://doi.org/10.1007/s12686-020-01152-w
Crandall, K. A., & De Grave, S. (2017). An updated classification of the freshwater crayfishes (Decapoda: Astacidea) of the world, with a complete species list. Journal of Crustacean Biology, 37(5), 615–653. https://doi.org/10.1093/jcbiol/rux070
Dougherty, M. M., Larson, E. R., Renshaw, M. A., Gantz, C. A., Egan, S. P., Erickson, D. M., & Lodge, D. M. (2016). Environmental DNA (eDNA) detects the invasive rusty crayfish Orconectes rusticus at low abundances. Journal of Applied Ecology, 53(3), 722–732. https://doi.org/10.1111/1365-2664.12621
Edgar, R. C. (2004). MUSCLE: Multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research, 32, 1792–1797. https://doi.org/10.1093/nar/gkh340
Franklin, T. W., Dysthe, J. C., Rubenson, E. S., Carim, K. J., Olden, J. D., McKelvey, K. S., Young, M. K., & Schwartz, M. K. (2018). A non-invasive sampling method for detecting non-native smallmouth bass (Micropterus dolomieu). Northwest Science, 92(2), 149–157. https://doi.org/10.3955/046.092.0207
Franklin, T. W., McKelvey, K. S., Golding, J. D., Mason, D. H., Dysthe, J. C., Pilgrim, K. L., Squires, J. R., Aubry, K. B., Long, R. A., Greaves, S. E., Raley, C. M., Jackson, S., MacKay, P., Lisbon, J., Sauder, J. D., Pruss, M. T., Heffington, D., & Schwartz, M. K. (2019). Using environmental DNA methods to improve winter surveys for rare carnivores: DNA from snow and improved noninvasive techniques. Biological Conservation, 229, 50–58. https://doi.org/10.1016/j.biocon.2018.11.006
Gingera, T. D., Bajno, R., Docker, M. F., & Reist, J. D. (2017). Environmental DNA as a detection tool for zebra mussels Dreissena polymorpha (Pallas, 1771) at the forefront of an invasion event in Lake Winnipeg, Manitoba, Canada. Management of Biological Invasions, 8(3), 287–300. https://doi.org/10.3391/mbi.2017.8.3.03
Goldberg, C. S., Sepulveda, A., Ray, A., Baumgardt, J., & Waits, L. P. (2013). Environmental DNA as a new method for early detection of New Zealand mudsnails (Potamopyrgus antipodarum). Freshwater Science, 32(3), 792–800. https://doi.org/10.1899/13-046.1
Hunter, M. E., Oyler-McCance, S. J., Dorazio, R. M., Fike, J. A., Smith, B. J., Hunter, C. T., Reed, R. N., & Hart, K. M. (2015). Environmental DNA (eDNA) sampling improves occurrence and detection estimates of invasive Burmese pythons. PLoS One, 10(4), e0121655. https://doi.org/10.1371/journal.pone.0121655
Kelly, R. P., Lodge, D. M., Lee, K. N., Theroux, S., Sepulveda, A. J., Scholin, C. A., Craine, J. M., Allan, E. A., Nichols, K. M., Parsons, K. M., Goodwin, K. D., Gold, Z., Chavez, F. P., Noble, R. T., Abbott, C. L., Baerwald, M. R., Naaum, A. M., Thielen, P. M., Simons, A. L., … Weisberg, S. B. (2023). Toward a national eDNA strategy for the United States. Environmental DNA, 6, 432. https://doi.org/10.1002/edn3.432
Klymus, K. E., Merkes, C. M., Allison, M. J., Goldberg, C. S., Helbing, C. C., Hunter, M. E., Jackson, C. A., Lance, R. F., Mangan, A. M., Monroe, E. M., Piaggio, A. J., Stokdyk, J. P., Wilson, C. C., & Richter, C. A. (2020). Reporting the limits of detection and quantification for environmental DNA assays. Environmental DNA, 2(3), 271–282. https://doi.org/10.1002/edn3.29
Kronenberger, J. A., Wilcox, T. M., Mason, D. H., Franklin, T. W., McKelvey, K. S., Young, M. K., & Schwartz, M. K. (2022). eDNAssay: A machine learning tool that accurately predicts qPCR cross-amplification. Molecular Ecology Resources, 22(8), 2994–3005. https://doi.org/10.1111/1755-0998.13681
Langlois, V. S., Allison, M. J., Bergman, L. C., To, T. A., & Helbing, C. C. (2021). The need for robust qPCR-based eDNA detection assays in environmental monitoring and species inventories. Environmental DNA, 3(3), 519–527. https://doi.org/10.1002/edn3.164
Laschever, E., Kelly, R., Hoge, M., & Lee, K. (2023). The next generation of environmental monitoring: Environmental DNA in agency practice. Columbia Journal of Environmental Law, 48, 260–360. https://doi.org/10.52214/cjel.v48iS.11038
Lodge, D. M. (2022). Policy action needed to unlock eDNA potential. Frontiers in Ecology and the Environment, 20(8), 448–449. https://doi.org/10.1002/fee.2563
Mangan, A. M., Kronenberger, J. A., Plummer, I. H., Wilcox, T. M., & Piaggio, A. J. (2023). Validation of a nutria (Myocastor coypus) environmental DNA assay highlights considerations for sampling methodology. Environmental DNA, 5(3), 391–402. https://doi.org/10.1002/edn3.412
Matsuhashi, S., Doi, H., Fujiwara, A., Watanabe, S., & Minamoto, T. (2016). Evaluation of the environmental DNA method for estimating distribution and biomass of submerged aquatic plants. PLoS One, 11(6), e0156217. https://doi.org/10.1371/journal.pone.0156217
McCarthy, A., Rajabi, H., McClenaghan, B., Fahner, N. A., Porter, E., Singer, G. A., & Hajibabaei, M. (2023). Comparative analysis of fish environmental DNA reveals higher sensitivity achieved through targeted sequence-based metabarcoding. Molecular Ecology Resources, 23(3), 581–591. https://doi.org/10.1111/1755-0998.13732
Nathan, L. M., Simmons, M., Wegleitner, B. J., Jerde, C. L., & Mahon, A. R. (2014). Quantifying environmental DNA signals for aquatic invasive species across multiple detection platforms. Environmental Science & Technology, 48(21), 12800–12806. https://doi.org/10.1021/es5034052
Noguera, D. R., Wright, E. S., Camejo, P., & Yilmaz, L. S. (2014). Mathematical tools to optimize the design of oligonucleotide probes and primers. Applied Microbiology and Biotechnology, 98(23), 9595–9608. https://doi.org/10.1007/s00253-014-6165-x
Patrone, P. N., Romsos, E. L., Cleveland, M. H., Vallone, P. M., & Kearsley, A. J. (2020). Affine analysis for quantitative PCR measurements. Analytical and Bioanalytical Chemistry, 412, 7977–7988. https://doi.org/10.1007/s00216-020-02930-z
R Core Team. (2021). R: A language and environment for statistical computing. R Foundation for Statistical Computing.
Rees, H. C., Maddison, B. C., Middleditch, D. J., Patmore, J. R., & Gough, K. C. (2014). The detection of aquatic animal species using environmental DNA—A review of eDNA as a survey tool in ecology. Journal of Applied Ecology, 51(5), 1450–1459. https://doi.org/10.1111/1365-2664.12306
Sayers, E. W., Cavanaugh, M., Clark, K., Ostell, J., Pruitt, K. D., & Karsch-Mizrachi, I. (2019). GenBank. Nucleic Acids Research, 48, D84–D86. https://doi.org/10.1093/nar/gkz956
Schenekar, T. (2023). The current state of eDNA research in freshwater ecosystems: Are we shifting from the developmental phase to standard application in biomonitoring? Hydrobiologia, 850(6), 1263–1282. https://doi.org/10.1007/s10750-022-04891-z
Sepulveda, A. J., Nelson, N. M., Jerde, C. L., & Luikart, G. (2020). Are environmental DNA methods ready for aquatic invasive species management? Trends in Ecology & Evolution, 35(8), 668–678. https://doi.org/10.1016/j.tree.2020.03.011
So, K. Y. K., Fong, J. J., Lam, I. P. Y., & Dudgeon, D. (2020). Pitfalls during in silico prediction of primer specificity for eDNA surveillance. Ecosphere, 11(7), e03193. https://doi.org/10.1002/ecs2.3193
Stadhouders, R., Pas, S. D., Anber, J., Voermans, J., Mes, T. H., & Schutten, M. (2010). The effect of primer-template mismatches on the detection and quantification of nucleic acids using the 5′ nuclease assay. Journal of Molecular Diagnostics, 12(1), 109–117. https://doi.org/10.2353/jmoldx.2010.090035
Strickler, K. M., Fremier, A. K., & Goldberg, C. S. (2015). Quantifying effects of UV-B, temperature, and pH on eDNA degradation in aquatic microcosms. Biological Conservation, 183, 85–92. https://doi.org/10.1016/j.biocon.2014.11.038
Takahashi, M., Saccò, M., Kestel, J. H., Nester, G., Campbell, M. A., Van Der Heyde, M., Heydenrych, M. J., Juszkiewicz, D. J., Nevill, P., Dawkins, K. L., Bessey, C., Fernandes, K., Miller, H., Power, M., Mousavi-Derazmahalleh, M., Newton, J. P., White, N. E., Richards, Z. T., & Allentoft, M. E. (2023). Aquatic environmental DNA: A review of the macro-organismal biomonitoring revolution. Science of the Total Environment, 873, 162322. https://doi.org/10.1016/j.scitotenv.2023.162322
Tamura, K., Stecher, G., & Kumar, S. (2021). MEGA11: Molecular evolutionary genetics analysis version 11. Molecular Biology and Evolution, 38, 3022–3027. https://doi.org/10.1093/molbev/msab120
Tan, M., & Armbruster, J. W. (2018). Phylogenetic classification of extant genera of fishes of the order Cypriniformes (Teleostei: Ostariophysi). Zootaxa, 4476(1), 6–39. https://doi.org/10.11646/zootaxa.4476.1.4
Taylor, C. A., Schuster, G. A., Cooper, J. E., DiStefano, R. J., Eversole, A. G., Hamr, P., Hobbs, H. H., III, Robison, H. W., Skelton, C. E., & Thoma, R. F. (2007). A reassessment of the conservation status of crayfishes of the United States and Canada after 10+ years of increased awareness. Fisheries, 32(8), 372–389. https://doi.org/10.1577/1548-8446(2007)32[372:AROTCS]2.0.CO;2
Thalinger, B., Deiner, K., Harper, L. R., Rees, H. C., Blackman, R. C., Sint, D., Traugott, M., Goldberg, C. S., & Bruce, K. (2021). A validation scale to determine the readiness of environmental DNA assays for routine species monitoring. Environmental DNA, 3(4), 823–836. https://doi.org/10.1002/edn3.189
Thompson, J. D., Higgins, D. G., & Gibson, T. J. (1994). CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research, 22(22), 4673–4680. https://doi.org/10.1093/nar/22.22.4673
Wilcox, T. M., Carim, K. J., McKelvey, K. S., Young, M. K., & Schwartz, M. K. (2015). The dual challenges of generality and specificity when developing environmental DNA markers for species and subspecies of Oncorhynchus. PLoS One, 10(11), e0142008. https://doi.org/10.1371/journal.pone.0142008
Wilcox, T. M., Kronenberger, J. A., Young, M. K., Mason, D. H., Franklin, T. W., & Schwartz, M. K. (2024). The unknown unknown: A framework for assessing environmental DNA assay specificity against unsampled taxa. Molecular Ecology Resources, 24(4), e13932. https://doi.org/10.1111/1755-0998.13932
Wilcox, T. M., McKelvey, K. S., Young, M. K., Engkjer, C., Lance, R. F., Lahr, A., Eby, L. A., & Schwartz, M. K. (2020). Parallel, targeted analysis of environmental samples via high-throughput quantitative PCR. Environmental DNA, 2(4), 544–553. https://doi.org/10.1002/edn3.80
Wilcox, T. M., McKelvey, K. S., Young, M. K., Jane, S. F., Lowe, W. H., Whiteley, A. R., & Schwartz, M. K. (2013). Robust detection of rare species using environmental DNA: The importance of primer specificity. PLoS One, 8(3), e59520. https://doi.org/10.1371/journal.pone.0059520
Winter, D. J. (2017). Rentrez: An R package for the NCBI eUtils API. The R Journal, 9, 520–526. https://doi.org/10.32614/RJ-2017-058
Wright, E. S., Yilmaz, L. S., & Noguera, D. R. (2012). DECIPHER, a search-based approach to chimera identification for 16S rRNA sequences. Applied and Environmental Microbiology, 78, 717–725. https://doi.org/10.1128/AEM.06516-11
Ye, J., Coulouris, G., Zaretskaya, I., Cutcutache, I., Rozen, S., & Madden, T. L. (2012). Primer-BLAST: A tool to design target-specific primers for polymerase chain reaction. BMC Bioinformatics, 13, 134. https://doi.org/10.1186/1471-2105-13-134
Young, M. K., Isaak, D. J., Schwartz, M. K., McKelvey, K. S., Nagel, D. E., Franklin, T. W., Greaves, S. E., Dysthe, J. C., Pilgrim, K. L., Chandler, G. L., Wollrab, S. P., Carim, K. J., Wilcox, T. M., Parkes-Payne, S. L., & Horan, D. L. (2018). Species occurrence data from the aquatic eDNAtlas database. Forest Service Research Data Archive.

Volume 6, Issue 2, March–April 2024, e548

Filename	Description
edn3548-sup-0001-AppendixS1.pdf (PDF document, 8.9 MB)	Appendix S1.
edn3548-sup-0002-AppendixS2.xlsx (Excel 2007 spreadsheet, 75.6 KB)	Appendix S2.
edn3548-sup-0003-AppendixS3.xlsx (Excel 2007 spreadsheet, 339.3 KB)	Appendix S3.
edn3548-sup-0004-AppendixS4.xlsx (Excel 2007 spreadsheet, 94.8 KB)	Appendix S4.
