Volume 39, Issue 11 pp. 1686-1689
SPECIAL ARTICLE

ClinGen advancing genomic data-sharing standards as a GA4GH driver project

Lena Dolman

Global Alliance for Genomics and Health Headquarters, Ontario Institute for Cancer Research, Toronto, Ontario, Canada

Angela Page

Broad Institute of MIT and Harvard, Cambridge, Massachusetts

Lawrence Babb

Sunquest Information Systems, Boston, Massachusetts

Robert R. Freimuth

Department of Health Sciences Research, Center for Individualized Medicine, Mayo Clinic, Rochester, Minnesota

Harindra Arachchi

Broad Institute of MIT and Harvard, Cambridge, Massachusetts

Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts

Chris Bizon

Renaissance Computing Institute, University of North Carolina, Chapel Hill, North Carolina

Matthew Brush

Oregon Clinical & Translational Research Institute, Oregon Health & Science University, Portland, Oregon

Marc Fiume

Global Alliance for Genomics and Health Headquarters, Ontario Institute for Cancer Research, Toronto, Ontario, Canada

DNAstack, Toronto, Ontario, Canada

Melissa Haendel

Oregon Clinical & Translational Research Institute, Oregon Health & Science University, Portland, Oregon

Linus Pauling Institute, Oregon State University, Corvallis, Oregon

David P. Hansen

Australian e-Health Research Centre, CSIRO, UQ Health Sciences Building, Herston, Qld, Australia

Aleksandar Milosavljevic

Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas

Ronak Y. Patel

Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas

Piotr Pawliczek

Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas

Andrew D. Yates

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, United Kingdom

Corresponding Author

Heidi L. Rehm

Broad Institute of MIT and Harvard, Cambridge, Massachusetts

Center for Mendelian Genomics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts

Center for Genomic Medicine, Massachusetts General Hospital, Boston, Massachusetts

Correspondence

Heidi L. Rehm, The Broad Institute of MIT and Harvard, Cambridge, MA.

Email: [email protected]

First published: 11 October 2018
Citations: 13

For the ClinGen/ClinVar Special Issue

Abstract

The Clinical Genome Resource (ClinGen)’s work to develop a knowledge base to support the understanding of genes and variants for use in precision medicine and research depends on robust, broadly applicable, and adaptable technical standards for sharing data and information. To advance this goal, ClinGen has joined with the Global Alliance for Genomics and Health (GA4GH) to support the development of open, freely available technical standards and regulatory frameworks for secure and responsible sharing of genomic and health-related data. In its capacity as one of the 15 inaugural GA4GH “Driver Projects,” ClinGen is providing input on the key standards needs of the global genomics community, and has committed to participate in GA4GH Work Streams to support the development of: (1) a standard model for computer-readable variant representation; (2) a data model for linking variant data to annotations; (3) a specification to enable sharing of genomic variant knowledge and associated clinical interpretations; and (4) a set of best practices for the use of phenotype and disease ontologies. ClinGen's participation as a GA4GH Driver Project will provide a robust environment to test drive emerging genomic knowledge sharing standards and prove their utility among the community, while accelerating the construction of the ClinGen evidence base.

1 INTRODUCTION

The Clinical Genome Resource (ClinGen) is building a central knowledge base for understanding the clinical relevance of genes and variants for use in precision medicine and research (Rehm et al., 2015). This includes the curation of genes for disease validity, dosage sensitivity, and actionability, as well as the curation of variants for pathogenicity. Curated variants are shared in the National Center for Biotechnology Information's ClinVar data archive and curated genes are available on the ClinGen website. These resources depend on robust technical standards for sharing data and information that are broadly applicable to a variety of use cases and are adaptable across a diversity of countries and systems, including both clinical and nonclinical settings. ClinGen is working to (1) standardize the clinical annotation and interpretation of genomic variants, (2) enable clinicians, researchers, and patients to share evidence including genomic and phenotypic data, and (3) provide unrestricted access to its knowledge base for direct use as well as integration into electronic health records and other resources. As part of this effort, ClinGen has joined with the Global Alliance for Genomics and Health (GA4GH; www.ga4gh.org), an international, nonprofit alliance that is catalyzing the creation of technical standards and regulatory frameworks to enable responsible, voluntary, and secure sharing of genomic and health-related data across institutional and national boundaries.

Formed in 2013 to accelerate the potential of research and medicine to advance human health (Page et al., 2016), the GA4GH membership brings together over 500 leading organizations as well as individual contributors working in healthcare, research, patient advocacy, life science, and information technology, from more than 70 countries. In October 2017, GA4GH launched a new phase (“GA4GH Connect”; https://www.ga4gh.org/docs/GA4GH-Connect-A-5-year-Strategic-Plan.pdf) that depends on the expertise of real-world clinical and research projects to establish priorities and needs within the community. These real-world “Driver Projects” provide contexts for international genomic data sharing by: (1) establishing priorities for tool development, (2) contributing to the creation of technical standards, policies, and other deliverables, and (3) implementing GA4GH standards in real-world use in order to provide feedback and demonstrate the value of genomic data sharing to the broader community.

Previously, ClinGen and GA4GH have worked together on developing guidelines for sharing pediatric genomic data (Friedman et al., 2018; Rahimzadeh et al., 2018) and variant-level information with ClinVar (Azzariti et al., 2018), and on developing consent resources for clinical genomic data sharing (Riggs et al., 2018). ClinGen is also a key participating group within the BRCA Challenge, one of four early demonstration projects that helped launch and demonstrate the value of GA4GH. The BRCA Challenge launched the BRCA Exchange (https://brcaexchange.org/), which brings together all publicly accessible variant resources on BRCA1 and BRCA2, including ClinVar as the primary source of interpreted BRCA1 and BRCA2 variants. Following this established history of mutual collaboration, ClinGen was invited to serve as one of the 15 inaugural GA4GH Driver Projects, alongside other leading research and clinical initiatives in North America, Europe, and Australia. In this role, ClinGen is contributing to the development of standards for discovering, accessing, storing, and analyzing genomic and health-related data that will be used by projects across the globe, including national precision medicine initiatives, such as the US-based All of Us Research Program (Collins and Varmus, 2015) and Genomics England (Marx, 2015), both of which are also participating as GA4GH Driver Projects. ClinGen will also play a leadership role in the representation of genomic knowledge for use in the accurate interpretation of genomic data.

In February 2018, GA4GH released a strategic roadmap (https://www.ga4gh.org/howwework/strategic-roadmap.html) based on the input and guidance received from Driver Projects regarding immediate, key needs for enabling data sharing. The roadmap describes 28 deliverables that will be produced by eight GA4GH Work Streams over the next 1–3 years; all of the deliverables build upon the Framework for Responsible Sharing of Genomic and Health-Related Data (Knoppers, 2014) and will be made freely and openly available on https://www.ga4gh.org/. ClinGen and the other GA4GH Driver Projects will work together to support the development of these deliverables and to ensure their relevance for use in real-world projects, as representatives of the broader genomics community. ClinGen has committed to participate in multiple subgroups across four of the Work Streams, with key contributions listed below. In particular, ClinGen will help to:
  1. Create a standard model for computer-readable variant representation. Genomic variants are described with many naming conventions, making it difficult to unambiguously define a variant and ensure the accurate use of associated knowledge. ClinGen is leveraging prior work done by its Data Modeling Work Group (including experience in developing the ClinGen Allele Registry [reg.clinicalgenome.org]), contributing examples, and providing domain expertise, to inform efforts within GA4GH to develop a data model for unambiguous representation of variants. This work began within GA4GH as the Variant Modeling Consortium (VMC; https://github.com/ga4gh/vmc) and is now continued through the variant representation subgroup of the Genomic Knowledge Standards Work Stream (GKS). GKS includes representatives from other organizations, including HL7 Fast Healthcare Interoperability Resources (FHIR; https://hl7.org/fhir/) and the Human Genome Variation Society (HGVS), ensuring that its standards meet the needs of the clinical and genomics communities and are compatible with HGVS standards that are widely used to contextualize genetic variation. Notably, both VMC and the ClinGen Allele Registry, as transmission formats, have the potential to adapt to each other fluidly, and the ClinGen Allele Registry is working with the GA4GH group to define a pilot project to implement the GKS Variant Representation specification. Version 0.1 of VMC has been released and already proposes a language and nomenclature for describing variation.

  2. Develop a data model for linking variant data to annotations. Standardized variant annotation and interpretation is central to ClinGen's mission and is an area in which the consortium has considerable expertise. ClinGen and the Monarch Initiative (Mungall et al., 2016) are working with the GA4GH GKS Work Stream to develop a common data model to guide the linkage of variant evidence to clinical interpretations in a standard format. This includes support for applying current professional interpretation standards (e.g., ACMG/AMP [Richards et al., 2015]) in a computable manner that can be validated, as well as documenting the associated disease and inheritance pattern.

  3. Develop a network for sharing knowledge about genomic variants and associated clinical interpretations. Sharing curated genomic knowledge with databases such as ClinVar is a high priority for the genomics community. Building on work in the GKS and Clinical & Phenotypic Data Capture (CPDC) Work Streams, the Discovery Work Stream will develop standards for sharing variant classifications and supporting evidence. The effort will standardize technical descriptions of a variant and its attributes (e.g., clinical significance) to streamline the electronic submission of clinically relevant information to genomic knowledge bases such as ClinVar. The data models will build on GA4GH standards developed by the GKS and CPDC Work Streams in the areas of variant annotation and phenotyping, and will be implemented within ClinGen's knowledge bases, as well as disseminated to the broader community for widespread use. Facilitating knowledge exchange between disparate sources will enable the development of integrative and comprehensive applications that help inform clinicians of the consequences and impacts of genomic variant events.

  4. Establish recommended phenotype and disease ontologies and best practices for their use in genomic medicine. Interpreting genes and variants and their possible role in a patient's disease requires associating genes and variants with diseases and phenotypic features. The GA4GH CPDC Work Stream is developing standards, best practices, and benchmarking for the use of ontologies and clinical terminologies to capture clinical phenotype information and gene–disease associations for use in genomic medicine and to ensure that data captured clinically can be used in genomic research. CPDC will also develop standards and best practices for exchanging clinical phenotype information between clinical information systems and with research, using the emerging HL7 FHIR and Phenopackets (https://phenopackets.org/) standards. ClinGen is implementing these standardized disease and phenotype ontologies in its gene–disease curation efforts and is incorporating and testing the resulting phenotyping standards in its data capture approaches, including through its GenomeConnect patient registry (Kirkpatrick et al., 2015).
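
To make the four commitments above more concrete, the following is a minimal illustrative sketch in Python. It is not the VMC or GKS specification, nor the ClinGen Allele Registry format. The class names, field names, the `ex:` identifier prefix, the digest scheme, the placeholder accession `NC_000000.0`, and the placeholder ontology terms `EX:0000001` and `EX:0000002` are all hypothetical assumptions introduced here for illustration. The sketch shows one possible way a variant could be defined by its content (reference sequence, interval, and replacement state) rather than by a name, so that identical content always yields the same identifier, and how such an identifier might link the variant to an interpretation that records clinical significance, ACMG/AMP evidence codes, disease, inheritance pattern, and phenotype terms in a form that can be serialized for exchange.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass(frozen=True)
class Allele:
    """Hypothetical variant record: a replacement of an interval on a reference sequence."""
    sequence_accession: str  # reference sequence accession (placeholder value below)
    start: int               # interval start (coordinate convention is an assumption)
    end: int                 # interval end
    state: str               # replacement sequence

    def identifier(self) -> str:
        # Serialize with sorted keys and fixed separators so that identical
        # content always produces identical bytes, then hash those bytes.
        payload = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:24]
        return "ex:allele/" + digest  # "ex:" is a made-up namespace


@dataclass(frozen=True)
class Interpretation:
    """Hypothetical record linking a variant to a clinical interpretation."""
    variant_id: str
    clinical_significance: str
    disease: str                           # ontology term identifier (placeholder)
    inheritance: str
    evidence_codes: Tuple[str, ...] = ()   # e.g. ACMG/AMP criteria codes
    phenotypes: Tuple[str, ...] = ()       # ontology term identifiers (placeholders)

    def to_message(self) -> str:
        # JSON text that could be exchanged between systems.
        return json.dumps(asdict(self), sort_keys=True)


# Two records built independently from the same content share one identifier.
allele = Allele("NC_000000.0", 100, 101, "T")
same_allele = Allele(sequence_accession="NC_000000.0", start=100, end=101, state="T")

record = Interpretation(
    variant_id=allele.identifier(),
    clinical_significance="Pathogenic",
    disease="EX:0000001",
    inheritance="Autosomal dominant",
    evidence_codes=("PVS1", "PM2"),
    phenotypes=("EX:0000002",),
)
message = record.to_message()
```

A content-derived identifier of this kind only removes ambiguity if every submitter first normalizes the variant in the same way (for example, choosing one reference sequence and one alignment of an insertion or deletion); that normalization step is deliberately left out of this sketch.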

In summary, ClinGen is a critical Driver Project for GA4GH, providing a robust environment to test drive genomic knowledge-sharing standards and prove their utility in the sharing of evidence and knowledge among the community, as well as applying that knowledge to clinical care and research. In exchange, ClinGen can more quickly and consistently build its evidence base by working with GA4GH to disseminate and instantiate the collaboratively built standards through global involvement and engagement.

ACKNOWLEDGMENTS

Core funders of the Global Alliance for Genomics and Health include the Broad Institute of MIT and Harvard; CanSHARE (Génome Québec, Genome Canada, the Government of Canada, Ministère de l'Économie, Innovation et Exportation du Québec, and the Canadian Institutes of Health Research [fund #141210]); Genome Canada; Ontario Institute for Cancer Research (funded by the Ontario Ministry of Economic Development, Job Creation, and Trade); the U.S. National Institutes of Health (Big Data to Knowledge [BD2K], National Cancer Institute, National Heart, Lung, and Blood Institute, and National Human Genome Research Institute); Wellcome (WT201535/Z/16/Z); and Wellcome Sanger Institute. Additional funding is obtained through annual sponsorships from GA4GH Organizational Members.

Additional contributions to this work included support by the following grants and organizations: NIH/National Human Genome Research Institute: U41HG006834 (ClinGen-Rehm), U41HG009649 (ClinGen-Bustamante), U41HG009650 (ClinGen-Berg), HG008900 (Broad Center for Mendelian Genomics); NIH Office of the Director: 5R24OD011883 (Monarch Initiative); European Molecular Biology Laboratory.

The content is solely the responsibility of the authors and does not necessarily represent the official views of the funding organizations.
