Volume 164, Issue 8 pp. ix-x
the AJMG SEQUENCE

NIH calls for stronger statistical evidence to support pathogenicity

Geneticists raise concerns about process of identifying genes as possible culprits in disease

First published: 14 July 2014

New guidelines from a workshop convened by the National Institutes of Health (NIH) stress the need for more stringent evidence when implicating particular genes or genetic sequence variants as culprits in human disease.

Published April 24 in the online version of Nature, the guidelines respond to concerns that some published causality claims rest on insufficiently powered or poorly controlled studies and may lead to inaccurate determinations of pathogenicity. Findings that stop short of claiming causality are sometimes interpreted as causal anyway, say the study authors [MacArthur et al., 2014].

“To date, an enormous number of false assertions of pathogenicity are contained in the literature,” says guidelines coauthor Heidi L. Rehm, PhD, Director of the Laboratory for Molecular Medicine at Partners HealthCare Personalized Medicine in Cambridge, Massachusetts. “Clinicians must make sure evidence is appropriate before using it to make clinical care decisions.”

As use of genomic sequencing technology grows, “given all these variants in any sequenced genome, it can be easy for researchers to tell stories about how they may cause disease … many will be false,” adds lead author Daniel MacArthur, PhD, Assistant Professor of Genetics at Harvard Medical School and Massachusetts General Hospital in Boston.

These guidelines are intended to prevent such stories. They focus on study design, gene- and variant-level implication, databases, and implications for diagnosis. They suggest a two-step approach to assessing evidence that involves first considering overall support for a causal role of the affected gene in the disease phenotype and then assessing the probability that the patient's variant does indeed play a causal role. The guidelines emphasize statistical evidence over informatics and experimental data, and urge researchers to consider whether any reported evidence of pathogenicity may have arisen by chance.

Addressing a Need

The impetus for the guidelines comes in part from Dr. MacArthur's experience with the 1000 Genomes Project, an international research effort to establish a detailed catalog of human genetic variation. Included are genomes of relatively healthy people who carry variants previously reported to cause severe disease.

“Many of these variants are far too common in healthy individuals to be disease-causing variants,” says Dr. MacArthur. “And going back to the papers, we found that many of these reports actually had very weak evidence that the variants were pathogenic.”
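The reasoning behind "too common to be disease-causing" can be made concrete with a little arithmetic. The sketch below is illustrative only and is not taken from the guidelines: it assumes a dominant disorder, a rare variant (so carrier frequency is roughly twice the allele frequency), and the most permissive case in which this one variant explains every case of the disease. The function names and the example numbers are hypothetical.

```python
def max_credible_allele_frequency(prevalence: float, penetrance: float) -> float:
    """Upper bound on the allele frequency of a causal dominant variant.

    Assumptions (illustrative, not from the guidelines):
    - carrier frequency is approximately 2 * allele frequency (rare variant)
    - carriers develop disease with probability `penetrance`
    - the variant accounts for at most all cases, so
      2 * allele_frequency * penetrance <= prevalence
    """
    if not 0 <= prevalence <= 1:
        raise ValueError("prevalence must be between 0 and 1")
    if not 0 < penetrance <= 1:
        raise ValueError("penetrance must be in (0, 1]")
    return prevalence / (2 * penetrance)


def too_common_to_be_causal(observed_af: float, prevalence: float,
                            penetrance: float) -> bool:
    """True if the observed allele frequency exceeds the upper bound."""
    return observed_af > max_credible_allele_frequency(prevalence, penetrance)


# Hypothetical numbers: a disease affecting 1 in 10,000 people,
# a variant with 50% penetrance, seen at 1% allele frequency in controls.
bound = max_credible_allele_frequency(prevalence=1 / 10_000, penetrance=0.5)
print(f"maximum credible allele frequency: {bound:.6f}")
print("too common:", too_common_to_be_causal(0.01, 1 / 10_000, 0.5))
```

Under these assumed numbers, a variant carried on 1% of chromosomes in healthy people is far above the bound, which is the kind of mismatch Dr. MacArthur describes.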

Lack of healthy controls isn't the only problem. Sometimes, researchers do not specifically claim particular genes or variants as culprits in disease but list patient-identified variants in tables and, in doing so, imply pathogenicity. Because the data aren't systematically classified according to likelihood of pathogenicity, “people read the paper and assume a variant is pathogenic when the paper really doesn't say that,” adds Dr. Rehm.

Meanwhile, inaccurate or incomplete information about gene causality is included in databases used by other researchers and geneticists trying to diagnose sick patients.

Scientists are being urged to report complete positive and negative evidence before linking genes to human disease.

Recommendations

In response to these problems, the guidelines call on the authors of papers about genetic causality to provide complete positive and negative evidence, not just results consistent with pathogenicity. Readers of published reports should “reassess them as rigorously as your own data,” the guidelines say.

Researchers should also determine and report the formal statistical evidence for segregation or association of each variant and its frequency in large control populations matched as closely as possible to patients' ancestry. Strong evidence that a variant is deleterious—meaning it reduces the reproductive fitness of carriers or damages gene function—does not make it causal, the guidelines add.
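One common way to put a number on "formal statistical evidence for association" is a one-sided Fisher's exact test on carrier counts in patients and matched controls. The sketch below is a minimal illustration, not a method prescribed by the guidelines; the function name and the counts are hypothetical, and a real analysis would also need ancestry matching and correction for the number of variants tested.

```python
from math import comb


def fisher_one_sided(case_carriers: int, case_total: int,
                     control_carriers: int, control_total: int) -> float:
    """One-sided Fisher's exact test for enrichment of carriers in cases.

    Returns the probability, under the null hypothesis of no association,
    of seeing at least `case_carriers` carriers among the cases, given
    the fixed totals (hypergeometric upper tail).
    """
    total = case_total + control_total
    carriers = case_carriers + control_carriers
    denominator = comb(total, case_total)
    upper = min(carriers, case_total)
    numerator = 0
    for k in range(case_carriers, upper + 1):
        # k carriers among cases, the remaining cases are non-carriers
        numerator += comb(carriers, k) * comb(total - carriers, case_total - k)
    return numerator / denominator


# Hypothetical counts: 6 carriers among 200 patients,
# 3 carriers among 5,000 ancestry-matched controls.
p_value = fisher_one_sided(6, 200, 3, 5000)
print(f"one-sided p-value: {p_value:.3g}")
```

A small p-value here would argue against the observation having arisen by chance, but, as the guidelines stress, it would still need to be weighed alongside segregation data and evidence at the gene level.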

Databases of variants should indicate the level of confidence of pathogenicity, supporting evidence, and genotype and phenotype data for both controls and disease patients, the guidelines say. Researchers and clinicians should be satisfied of causality for new disease genes “only when variants in the same genes and similar clinical presentations have been confidently implicated in multiple unrelated individuals,” the guidelines add.

“The main message here is that you really need to read the actual papers [that claim causality],” says Ian Krantz, MD, Professor of Pediatrics at the Perelman School of Medicine at the University of Pennsylvania in Philadelphia and chair of the National Human Genome Research Institute–funded Clinical Sequencing Exploratory Research Consortium.

A Useful Message

Researcher Yaping Yang, PhD, Associate Professor of Molecular and Human Genetics at Baylor College of Medicine in Houston, argues that most labs do read papers scrupulously.

“Clinical geneticists rely heavily on clinical lab reports … [and] labs need to call variants in the right way,” Dr. Yang says. “When clinicians are shopping for labs, they should be aware of how well they follow these guidelines.”

Dr. Krantz says he generally agrees with the guidelines but adds, “There is no magic answer to proving functionality.” Although the guidelines call for determining genotype-phenotype relationships in multiple individuals, he notes that doing so sometimes isn't feasible. “A variant for a very rare or novel diagnosis that is only seen in one family could be weighted more if the type of change and evidence for potential pathogenicity is convincing. The impact could be huge for that family,” he explains.

The recommendations aren't meant to be definitive, notes Dr. MacArthur. “Ours are not hard guidelines. They are meant as the beginning of a conversation.”

More Guidelines

A forthcoming update to the 2007 American College of Medical Genetics and Genomics guidelines on interpreting and reporting sequence variants will give much more specific guidance for determining causality of mutations. Dr. Rehm, who is part of the committee writing the update, says the guidelines will list criteria for assigning pathogenic mutations to different tiers indicating the level of evidence supporting variants' role in causing disease.

This approach is similar to one taken by the NIH-funded Clinical Genome Resource Consortium (ClinGen). The ClinGen approach builds upon the paper by MacArthur et al. and classifies gene-disease relationships into six tiers: definitive, strong, moderate, limited, no evidence, and disputed, the last applying when there is evidence both for and against causality.

Assigning such evidence levels to genes and variants will let researchers, laboratorians, and clinicians know if the use of genetic test results is appropriate in specific situations. Including a gene on a clinical panel test, returning a variant as an incidental finding, or reporting a novel gene identified by exome analysis all require different levels of evidence, Dr. Rehm points out.

“Whole-exome and whole-genome sequencing are really powerful tools for finding causal variants, but the message here is you have to do careful vetting before calling something pathogenic,” Dr. Krantz says. “Have mindfulness about what you're doing because it has a big impact on the family, on their reproductive choices, and [on the] care of the affected individual. It's imperative to get things right.”
