The Human Variome Project
For the 25th Anniversary Commemorative Issue
ABSTRACT
As genomics has moved into practice, it has become increasingly clear that variant interpretation is a major barrier to the use of DNA sequence data. The late Professor Dick Cotton devoted his life to innovation in molecular genetics and was a prime mover in the international response to the need to understand the “variome.” His leadership resulted in the launch first of the Human Genome Variation Society and then, in 2006, an international agreement to launch the Human Variome Project (HVP), which aims to integrate, through shared standards and infrastructure, the databases of variants being identified in families with a range of inherited disorders. The project attracted a network of affiliates across 81 countries and earned formal recognition by UNESCO, which now hosts its biennial meetings. It has also signed a Memorandum of Understanding with the World Health Organization. Future progress will depend on secure longer-term funding and on integration with the work of the genomics community, where rapid advances in sequencing technology have enabled variant capture on a previously unimaginable scale. Efforts are underway to align the HVP with the Global Alliance for Genomics and Health to provide a lasting legacy of Dick Cotton's vision.
During a long and distinguished career, Dick made major contributions. As a postdoctoral fellow in Cesar Milstein's laboratory, he helped lay the foundations for the discovery of monoclonal antibodies, for which Milstein was awarded the Nobel Prize in 1984. He played a key role in the discovery of the genetic basis of phenylketonuria and attracted global attention with his innovative use of chemical and enzymatic cleavage in mutation detection. He recognized the need to make mutation research a distinct academic pursuit: he convinced the publisher John Wiley & Sons to start a new journal dedicated to the field, Human Mutation, became its founding Co-Editor, and initiated the biennial Mutation Detection conferences. His reputation was pivotal in attracting the support of WHO, UNESCO, OECD, the European Commission, the Centers for Disease Control, and the March of Dimes to launch the HVP (http://www.humanvariomeproject.org). His early recognition that we must join forces if we are to grasp the enormous complexity of human genetic variation and its relationship to disease may prove to be the aspect of his legacy that has the greatest impact on humanity.
Dick had begun to consider this problem almost two decades earlier during his time as co-leader with David Danks of the famous Murdoch Children's Research Institute in Melbourne. In the early 90s, he led the creation of a group focused on mutation detection within the Human Genome Organization, the HUGO Mutation Database Initiative, which led to the creation in 2001 of the Human Genome Variation Society (HGVS; http://www.hgvs.org). The most notable of the HGVS achievements was the development of the widely acclaimed variation nomenclature standards. Throughout this time, clinicians and academics around the world were accumulating private databases of genetic variants in their favorite genes underlying high-profile monogenic disorders. Pre-eminent were the databases developed for the CFTR gene in cystic fibrosis, the hemoglobinopathies, and some of the DNA repair syndromes such as Bloom syndrome and Fanconi anemia. In the mid-90s, the discovery of the major cancer predisposition genes BRCA1 and BRCA2 and the Mismatch Repair (MMR) genes had a profound influence on the field. Not only were variants in these genes predictive of a major subset of the most common cancers, they also allowed intervention to reduce morbidity and mortality in extended families. As the history of the HVP unfolded, these discoveries had a major influence on its form and future.
The HVP attracted widespread support within the academic community, with more than a thousand affiliates at the last count (Cotton et al., 2008). The underlying concept was to move toward a more formalized curation system based on data shared between “country nodes.” It was recognized that this would be necessary because of the significant differences between jurisdictions in data protection legislation, public perception of genetics, research infrastructure, and diagnostic service provision. Two of the most influential territories illustrated the divergence. In the United Kingdom, the well-developed network of regional genetics centers providing laboratory diagnostic and counseling services had received government investment following a white paper in 2003; this included the creation of two reference laboratories, investment in equipment and staff, and six “genetic knowledge parks.” A national database, the Diagnostic Mutation Database, was created within the National Health Service but with no formal means of maintenance or of data sharing with other databases.
In the USA, the decision by the diagnostic company Myriad to restrict testing of the BRCA1/2 genes to its own laboratory had accelerated the transfer of molecular diagnostic services away from their academic base into the private sector. In the absence of a recognized mainstream specialty of Clinical Genetics, most genetic services evolved around academic teams and within individual specialties. Larry Brody's efforts led to the creation of the influential database of breast cancer variants, the Breast Cancer Information Core (Szabo et al., 2000; http://research.nhgri.nih.gov/projects/bic/), but again its current value is limited by Myriad's decision not to share data and by the lack of formal curation that would enable clinical scientists to rely on its data in diagnostic reporting.
In most cases, national health systems responded in a piecemeal way, leaving data interpretation to informal groupings of diagnostic laboratories. In France, the Universal Mutation Database, drawing data from more than 40 diagnostic laboratories, began to develop into a valuable resource, but again this was at the national level without any system for sharing other than via publication.
In 2003, the International Society for Gastrointestinal Hereditary Tumours, InSiGHT, was created from research groups focused on the polyposis syndromes and hereditary non-polyposis colon cancer, or Lynch syndrome. As its first president, John Burn invited colleagues managing registers of variants in the MMR genes to come together for a workshop with a view to integrating their efforts. Out of this initiative emerged the InSiGHT database, hosted by the Leiden Open Variation Database run by Johan den Dunnen (http://chromium.lovd.nl/LOVD2/colon_cancer/home.php).
At the second meeting of the society in Yokohama in 2007, Dick Cotton presented his vision of a global solution to the variant challenge and won the support of the society to bring the InSiGHT database under the HVP umbrella. At the Yokohama meeting, several subcommittees were formed, of which the most far-reaching has been the Variant Interpretation Committee, now 50 members strong and chaired by Maurizio Genuardi. It is tasked with providing the most authoritative interpretation of all submitted MMR variants, based on an assembly of clinical, in silico, and functional data also submitted to the database. The international curation team was strongly grounded in the academic support of Amanda Spurdle and Bryony Thompson in Brisbane. Now, nearly all reports on MMR variants from diagnostic laboratories include a reference to the InSiGHT variant database, as reflected by the 60,000 page hits the database receives monthly. The InSiGHT database continues to be a model for the development of curated databases with a global reach; its curator, John-Paul Plazzer, is supported by local funds in Melbourne under the guidance of Finlay Macrae (Thompson et al., 2014).
A critical milestone in the progress of the HVP was the agreement of the United Nations Educational, Scientific and Cultural Organization (UNESCO) in 2011 to recognize the HVP as an affiliated non-governmental organization, enabling the HVP to hold its biennial conferences at the UNESCO Paris headquarters and encouraging international representatives to support progress toward the development of country and regional nodes. By 2012, the organization had attracted considerable support, with a presence in 81 countries and a pledge from the Chinese Country Core HVP Node to provide financial and logistical support. Sadly, the Node failed to honor this pledge. Similarly, the Victorian State Government pledged funding, but a change of government led to this pledge also not being honored. Despite these setbacks, the organization continued its work with philanthropic support. It continued to grow its data collection networks and to convene its biennial meetings, with HVP5 in 2014 and HVP6 planned for June 2016.
In 2015, the HVP achieved the highest affiliation status with UNESCO, that of “Associate,” in recognition of the organization's development of strong relationships with low- and middle-income countries. The HVP has led the formation of, and is included in the membership of, two UNESCO expert panels: one to make policy recommendations for national approaches to the responsible collection, curation, and sharing of genetic variation information, and the other to document and identify gaps in the technical standards and best-practice guidelines for variant collection, curation, interpretation, and sharing.
Negotiations continued with representatives of the World Health Organization (WHO) leading, in 2015, to a further step toward formal recognition as a non-governmental organization with the signing of a Memorandum of Understanding. WHO endorsement will contribute toward implementation of genomics in clinical practice across the world.
In 2013, a new organization, the Global Alliance for Genomics and Health, entered the arena. Launched by major genomics centers of international repute, this new structure represented a coming together of genomicists focused on the big data approach to essentially the same challenge that had galvanized Dick Cotton and colleagues two decades earlier. The importance of this new development was the opportunity to address the enormity of the human variome by harnessing the ever-growing power of large-scale sequencing.
In 2014, the Global Alliance and the HVP joined forces to launch the BRCA Challenge. BRCA1 and BRCA2 are of central importance in molecular diagnostics worldwide, not least because Angelina Jolie chose to reveal her decision to undergo radical preventive surgery after inheriting a pathogenic BRCA1 variant from her late mother, yet diagnostic services remain poor in most parts of the world. Angelina's courage raised awareness to such an extent that referrals doubled in many genetic centers. Here was a perfect opportunity to link the power of genomics with the granular knowledge of genotypes and phenotypes held by research and diagnostic geneticists around the world. In 2016, the resulting Website, BRCA Exchange (http://www.brcaexchange.org), will be made public and will, with industry and charitable support, be curated by an international community based on the longstanding research group ENIGMA (Evidence-based Network for the Interpretation of Germline Mutant Alleles). This will illuminate the essence of the mission statement of the HVP Roadmap:
“To increase the amount of clinically validated and classified data on genomic variants available on the internet in free and open, curated databases, by means of cooperation at the national, regional, and international levels, so that all people can benefit from advances in genomic medicine”.
However, this is only one of the dimensions of the work of the HVP. Progress is being made to build international consensus around a public database of the pathogenic alleles underlying the genetic hemolytic anemias: the Global Globin Challenge 2020. This will help to manage a huge burden on the developing world and will demonstrate the critical importance of tapping into its most valuable resource: the sequences and phenotypes of the myriad genetically diverse ethnic groups that make up our human species.