Volume 38, Issue 9 pp. 1072-1084

SPECIAL ARTICLE

Objective assessment of the evolutionary action equation for the fitness effect of missense mutations across CAGI-blinded contests

Panagiotis Katsonis

orcid.org/0000-0002-7172-1644

Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas

Corresponding Author

Olivier Lichtarge

[email protected]

Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas

Department of Biochemistry & Molecular Biology, Baylor College of Medicine, Houston, Texas

Department of Pharmacology, Baylor College of Medicine, Houston, Texas

Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, Texas

Correspondence

Olivier Lichtarge, Department of Molecular and Human Genetics, Baylor College of Medicine, BCM225, One Baylor Plaza, Houston, TX 77030.

Email: [email protected]

First published: 23 May 2017

https://doi.org/10.1002/humu.23266

Citations: 22

Contract grant sponsors: National Institutes of Health (GM079656 and GM066099; U41 HG007446 and R13 HG006650); National Science Foundation (DBI-1062455 and CCF-0905536).

For the CAGI Special Issue

Abstract

A major challenge in genome interpretation is to estimate the fitness effect of coding variants of unknown significance (VUS). Labor, limited understanding of protein functions, and lack of assays generally limit direct experimental assessment of VUS, and make robust and accurate computational approaches a necessity. Often, however, algorithms that predict mutational effect disagree among themselves and with experimental data, slowing their adoption for clinical diagnostics. To objectively assess such methods, the Critical Assessment of Genome Interpretation (CAGI) community organizes contests to predict unpublished experimental data, available only to CAGI assessors. We review here the CAGI performance of evolutionary action (EA) predictions of mutational impact. EA models the fitness effect of coding mutations analytically, as a product of the gradient of the fitness landscape times the perturbation size. In practice, these terms are computed from phylogenetic considerations as the functional sensitivity of the mutated site and as the magnitude of amino acid substitution, respectively, and yield the percentage loss of wild-type activity. In five CAGI challenges, EA consistently performed on par or better than sophisticated machine learning approaches. This objective assessment suggests that a simple differential model of evolution can interpret the fitness effect of coding variations, opening diverse clinical applications.

1 INTRODUCTION

Numerous computational methods seek to predict the impact of genetic variations on fitness (Cardoso, Andersen, Herrgård, & Sonnenschein, 2015; Jordan, Ramensky, & Sunyaev, 2010; Katsonis et al., 2014). Most of them focus on protein-coding variants, which are single-nucleotide substitutions that change an amino acid in the encoded protein. Although protein-coding genes constitute less than 2% of the human genome, it is estimated that they harbor 85% of disease-related mutations (Choi et al., 2009). Several methods rely purely on homology information, estimating whether a given substitution fits with the amino acid differences observed in other species at that same residue position (Choi, Sims, Murphy, Miller, & Chan, 2012; Ng & Henikoff, 2001; Reva, Antipin, & Sander, 2007; Reva, Antipin, & Sander, 2011; Stone & Sidow, 2005). However, the vast majority of the methods also apply machine learning techniques, trained over large datasets and numerous features that may include sequence conservation, functional site information, solvent accessibility, secondary structure, crystallographic B factors, local sequence environment, and intrinsic disorder, among others (Adzhubei et al., 2010; Bromberg & Rost, 2007; Capriotti et al., 2013; Carter et al., 2013; Fariselli, Martelli, Savojardo, & Casadio, 2015; Karchin et al., 2005; Kircher et al., 2014; Li et al., 2009; Liu, Jian, & Boerwinkle, 2011; Niroula et al., 2015; Schwarz et al., 2014; Wei et al., 2013; Yue & Moult, 2006). Although some studies support clinical value (Chan et al., 2007), the performance of these methods is generally mixed, with limited agreement with each other (Castellana & Mazza, 2013) and with clinical or experimental data (Flanagan, Patch, & Ellard, 2010; Miosge et al., 2015; Tchernitchko, Goossens, & Wajcman, 2004; Walters-Sen et al., 2015). 
A common problem, for example, is that performance is sensitive to the availability of sufficient protein homology and structure information (Hicks, Wheeler, Plon, & Kimmel, 2011; Marini, Thomas, & Rine, 2010). A deeper problem is that the integrative modeling of the multiscale impact of a mutation from the protein to the pathway, to the network and on to a cell, a tissue, and an organism appears far too complex for current tools. In search of an alternative approach that focuses on overall fitness effect, we derived an evolutionary action (EA) equation for the fitness effect of coding genetic changes (Katsonis & Lichtarge, 2014). EA is the product between the functional sensitivity (i.e., importance) of the mutated protein sequence position and the size of the mismatch introduced by the amino acid switch. As such, EA requires no specific training. Its performance was evaluated against large mutagenesis study datasets (Katsonis & Lichtarge, 2014), but the CAGI challenges provided a unique opportunity for independent, objective assessment.

The community of Critical Assessment of Genome Interpretation (CAGI) aims to objectively assess computational methods for predicting the phenotypic impact of genomic variations. Until now, CAGI has organized four contests in 2010, 2011, 2013, and 2015 that involved a total of 37 challenges. Only nine of these challenges asked predictors to estimate the fitness effect of single genetic variants and were suited for EA. The rest of the challenges focused on whole-exome sequencing data interpretation of complex traits or on specific tasks that a fitness impact predictor cannot address directly, such as case-control distributions and activity restoration, among others. We applied EA to seven of these fitness effect challenges, as the EA method was unavailable during the first CAGI experiment and a deadline for the NPM-ALK challenge in CAGI 4 was missed. Here, we report on five challenges, after excluding two: first, the BRCA challenge of CAGI 3 (2013), because the variant classification was not robust, leaving 52 of 62 missense variants (more than 80%) annotated as variants of unknown significance; and second, the SCN5A challenge of CAGI 2 (2011), because it involved only three variants, too few to draw conclusions. In order to assess the methods objectively, CAGI assigned independent assessors to each challenge, often from the team that provided the experimental data. Assessors had freedom to choose any assessment tests and strategy. Most often, assessors used multiple tests that evaluate either the rank of the predictions or their proximity to experimental values. Predictions that perform well in tests of the first type may not necessarily perform well in tests of the other type, and vice versa. Also, there may be a tradeoff between some tests, such as between precision and recall (Buckland & Gey, 1994), since the choice of a cutoff may favor performance in one test at the expense of performance in the other. 
Therefore, integrating multiple tests into one overall score has been a common practice in CAGI challenges. Some assessors avoided highlighting one of the methods as the winner and instead presented a comprehensive view of the strengths and weaknesses of the submitted methods, often of those with the best performances.

We used our EA method to address CAGI challenges that asked for predictions of the functional and clinical impact of missense mutations. Specifically, we participated in three challenges of CAGI 4 (SUMO ligase, pyruvate kinase, and N-acetyl-glucosaminidase [NAGLU]), in one challenge of CAGI 3 (p16), and in one challenge of CAGI 2 (cystathionine beta-synthase [CBS]). In each case, the EA scores ranked the amino acid substitutions by their predicted impact on fitness, so that substitutions with larger fitness changes had higher scores (see Methods). Since EA scores have already been shown to correlate with the fraction of deleterious mutations in four different experimental systems (Katsonis & Lichtarge, 2014), these EA scores were treated here as the probability that a substitution is deleterious for the protein function. The final predicted values were modified specifically to match the experimental scales of each challenge with simple linear transformations. This last choice is simple, but potentially introduces errors if the assay sensitivity was nonlinear.

Briefly, the EA method was based on the hypothesis that protein evolution proceeds in infinitesimal fitness steps (Fisher, 1930; Orr, 2005) and so can be described by a continuous and differentiable evolutionary function that links genotype and phenotype. If so, a mutation can be viewed as a perturbation of the genotype and its effect on the phenotype should be given by differentiating the evolutionary function. This leads to the action of a single missense mutation as a product of the gradient of the evolution function and the magnitude of the mutation. The gradient can be understood as the sensitivity of a protein sequence position to amino acid substitution, that is, the importance of the genotype position as measured by the evolutionary trace (ET) algorithm (Lichtarge & Wilkins, 2010; Lichtarge, Bourne, & Cohen, 1996; Mihalek, Res, & Lichtarge, 2004). The magnitude of the amino acid change can be approximated with context-dependent log odds. Together, these terms yield the EA scores. Of note, the ET algorithm, that is, the gradient of the evolutionary function, has been used in broad applications, such as to identify functional sites and allosteric pathway residues (Yao et al., 2003), guide mutations that block or reprogram function (Rodriguez, Yao, Lichtarge, & Wensel, 2010), and define structural motifs that predict function on large scale (Erdin, Ward, Venner, & Lichtarge, 2010; Ward et al., 2009), such as substrate specificity (Amin, Erdin, Ward, Lua, & Lichtarge, 2013). Also, the use of amino acid substitution log odds is a well-established measure of amino acid similarity (Henikoff & Henikoff, 1992) and its context dependence is well known (Overington, Donnelly, Johnson, Šali, & Blundell, 1992), although a dependence on predicted functional importance was first used in calculating the EA scores (Katsonis & Lichtarge, 2014). 
In that same study, EA was predictive on large data sets of experimental assays of molecular function, clinical associations with human disease, and population allelic frequency of human polymorphisms, so that the EA equation matched positive controls and was validated across multiple biological scales.

Here, we reviewed the performance of EA on the CAGI challenges. For each challenge, first, we examined qualitatively the relationships between the experimental values and the EA scores by binning the data points according to the experimental or the predicted values. Then, we presented the objective assessments of the CAGI assessors and provided details on what assessment tests were used and whether an overall ranking that weighs multiple tests had been provided by the assessor. Last, for all challenges, we showed the performance of the submissions according to Pearson's correlation coefficient and receiver operating characteristic plots that were calculated by the authors in order to compare performance between different challenges. We also calculated these two tests for two well-established methods, PolyPhen2 (Adzhubei et al., 2010) and SIFT (Ng & Henikoff, 2001), as points of reference (details on using these predictors can be found in Methods). Depending on the dataset availability, we also examined whether the submitted predictions performed better on subsets of mutations that had low standard deviation of experimental replicates and therefore higher confidence for the experimental values.

1.1 EA Approach

In order to assess the impact of mutations, we considered a sequence space (Smith, 1970) that mapped onto a fitness landscape (Wright, 1932). There, each mutation should cause a small displacement away from an idealized equilibrium position for the species, defined as an average over the fitness landscape position of all individuals for that species. Let (r₁, r₂, …, r_i, …, r_n)_P be the genotype, γ, of the protein of interest, P, and φ be the fitness phenotype that integrates all the structural, dynamic, and functional attributes that affect the survival and reproduction of the organism. Our hypothesis is that γ and φ are coupled to each other by a continuous and differentiable evolutionary fitness function f, which implicitly accounts for all selection constraints and their variations over time. If so, a small genotype perturbation dγ will change the fitness phenotype by dφ, which will be given by:

$d\varphi = \nabla f \bullet d\gamma$ (1)

where ∇f is the gradient of f and • denotes the scalar product. For a single amino acid change at sequence position i, from X to Y, the genotype perturbation becomes the magnitude of the substitution (Δr_{i,X→Y}) and the gradient becomes the partial derivative of f for its i-th component (∂f/∂r_i). Neglecting higher-order terms arising from epistatic interactions with co-occurring mutations (Breen, Kemena, Vlasov, Notredame, & Kondrashov, 2012; Marks et al., 2011), the phenotypic change, or action, of the amino acid substitution becomes:

$\Delta\varphi \approx \frac{\partial f}{\partial r_i} \cdot \Delta r_{i,X \to Y}$ (2)

This is the EA equation, which states that a missense mutation displaces fitness from its equilibrium position by an amount that is proportional to the evolutionary fitness gradient at that site and the magnitude of the amino acid change. Critically, although the function f is unknown, the terms of expression (Equation 2) may nevertheless be approximated from empirical data on protein evolution.

We approximated the evolutionary fitness gradient ∂f/∂r_i with the relative importance ranks of the ET method (Lichtarge & Wilkins, 2010; Lichtarge et al., 1996; Mihalek et al., 2004). The gradient represents the displacement of the fitness phenotype for an elementary genotype change. We hypothesized that evolution proceeds in infinitesimal steps (Orr, 2005), so any spontaneous amino acid change in protein evolution is an elementary genotype change that adapts fitness in the genetic and environmental context in which the protein operates (Coyne & Orr, 1998). We also hypothesized that f is continuous and differentiable, so the gradient equals the difference in fitness phenotype caused by an elementary genotype change. Together, these two hypotheses suggest that the gradient can be measured by quantifying the correlation of amino acid variation and phylogenetic branching, as the ET algorithm does (Lichtarge et al., 1996). In the extreme cases, invariant sequence positions yield the maximum evolutionary fitness gradient because any genotypic change can displace fitness beyond any homologous protein, whereas positions that vary even between the closest homologous sequences yield the minimum evolutionary fitness gradient.

To measure the magnitude of a substitution (Δr_{i,X→Y}), we used the odds of observing each substitution in homologous proteins (Henikoff & Henikoff, 1992; Overington et al., 1992). For example, the amino acid alanine is substituted to serine more often than to aspartate, in line with greater biophysical and chemical similarities to the former. However, we found that the substitution odds also depend on the evolutionary gradient of the substituted position. For example, the alanine to valine substitution odds form a bell-shaped distribution as the evolutionary gradient at the mutated position varies from maximum to minimum; those of alanine to threonine begin flat and then tail off, whereas those of alanine to aspartate decay steadily (Katsonis & Lichtarge, 2014). Similarly, differences in the substitution odds were found depending on structural features (Overington et al., 1992). Therefore, we approximated Δr_{i,X→Y} by substitution odds that depend on the evolutionary importance and on protein structure features of the residue.

2 METHODS

2.1 Calculation of the EA

EA scores were calculated using the public Web server available for nonprofit use at the URL: mammoth.bcm.tmc.edu/uea, where the human protein name and the amino acid substitution may be used as input. Briefly, the EA Δφ of each mutation was the product of the evolutionary gradient ∂f/∂r_i and the perturbation magnitude of the substitution, Δr_{i,X→Y}. These two terms, ∂f/∂r_i and Δr_{i,X→Y}, were measured by percentile ranks of ET scores and of amino acid substitution odds, respectively, as described previously (Katsonis & Lichtarge, 2014). All terms, including the EA scores, were used in the form of percentile ranks, such that high or low scores indicated high or low impact of the genetic variation, respectively. For example, an EA of 68 implied that the impact was higher than 68% of all possible amino acid substitutions in a protein.
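
As a rough illustration of the percentile-rank convention described above, the following Python sketch multiplies two percentile-ranked terms and re-ranks the products across all substitutions. The function names, the toy inputs, and the choice to re-rank the raw products are assumptions made for illustration only; the published normalization (Katsonis & Lichtarge, 2014) is not reproduced here.

```python
from bisect import bisect_left


def percentile_ranks(values):
    """Return, for each value, the percentage of values strictly below it."""
    ordered = sorted(values)
    n = len(ordered)
    return [100.0 * bisect_left(ordered, v) / n for v in values]


def toy_ea_scores(gradient_pct, magnitude_pct):
    """Multiply two percentile-ranked terms and re-rank the products.

    gradient_pct:  hypothetical percentile rank of site importance (0-100)
    magnitude_pct: hypothetical percentile rank of substitution size (0-100)
    Both inputs and the re-ranking step are illustrative assumptions.
    """
    products = [g * m for g, m in zip(gradient_pct, magnitude_pct)]
    return percentile_ranks(products)


# Toy example with four made-up substitutions.
gradient = [90.0, 90.0, 20.0, 50.0]
magnitude = [80.0, 10.0, 80.0, 50.0]
scores = toy_ea_scores(gradient, magnitude)
```

In this toy case, the substitution at an important site with a large amino acid change receives the highest score, whereas a conservative change at the same site receives the lowest.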

2.2 Calculation of other predictors of mutation impact

SIFT predictions were obtained using “SIFT BLink” (http://sift.jcvi.org/), where we provided the GI number of the query protein. Specifically, we used the GI numbers of 4557415 (CBS), 4502749 (p16), 4507785 (SUMO ligase), 32967597 (pyruvate kinase), and 66346698 (NAGLU). The result was a score between 0 (deleterious) and 1 (neutral) for each possible amino acid substitution within the sequence. SIFT scores were treated as the fraction of the remaining protein function over the wild-type function of the protein (0 means 0% and 1 means 100% function).

PolyPhen2 predictions were obtained using the default parameters (HumDiv classifier) of the batch query tab of PolyPhen2 server (http://genetics.bwh.harvard.edu/pph2/), where we provided the NP identifier of the query protein, the protein residue number, and the wild type and substitute amino acids. We used the NP identifiers of NP_000062 (CBS), NP_000068 (p16), NP_003336 (SUMO ligase), NP_870986 (pyruvate kinase), and NP_000254 (NAGLU). We used the “pph2_prob” value as the prediction score, which ranges between 0 (neutral) and 1 (deleterious), to scale it between 0% and 100% loss of the wild-type function of the protein.
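
The two score conventions above run in opposite directions, so a small conversion helps when comparing them with EA. The sketch below maps both to a percentage loss of wild-type function, following the interpretation stated in the text; the function names and the choice of a common "percent loss" scale are assumptions for illustration.

```python
def sift_to_percent_loss(sift_score):
    """SIFT: 0 = deleterious, 1 = neutral, read as fraction of remaining function."""
    if not 0.0 <= sift_score <= 1.0:
        raise ValueError("SIFT score must lie in [0, 1]")
    return 100.0 * (1.0 - sift_score)


def polyphen2_to_percent_loss(pph2_prob):
    """PolyPhen2 pph2_prob: 0 = neutral, 1 = deleterious, read as fraction of lost function."""
    if not 0.0 <= pph2_prob <= 1.0:
        raise ValueError("pph2_prob must lie in [0, 1]")
    return 100.0 * pph2_prob


# Made-up scores for a single substitution.
sift_loss = sift_to_percent_loss(0.25)
pph2_loss = polyphen2_to_percent_loss(0.5)
```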

2.3 Statistical tests

2.3.1 Area under the curve of receiver operating characteristic

The area under the curve (AUC) of the receiver operating characteristic (ROC) plots was calculated using in-house algorithms. The experimental values were transformed to binary values (0 or 1). Typically, the cutoff value was set to 50% of the wild-type protein function, whereas for the p16 challenge we used a cutoff of 75 (experimental values ranged from 50 to 100), as suggested by the bimodal distributions of the experimental values. Multiple cutoffs were also studied when the challenge provided a sufficient number of experimental values.
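
The in-house algorithms are not described further, so the following Python sketch is only one plausible way to carry out the steps above: binarize the experimental values at a cutoff, then compute the AUC as the probability that a deleterious variant receives a higher impact score than a neutral one (ties counted as one half). The function names and toy data are hypothetical.

```python
def binarize(experimental, cutoff):
    """Label a variant deleterious (1) when its measured value is below the cutoff."""
    return [1 if value < cutoff else 0 for value in experimental]


def roc_auc(labels, scores):
    """AUC via pairwise comparison; a higher score means a more deleterious prediction."""
    positives = [s for lab, s in zip(labels, scores) if lab == 1]
    negatives = [s for lab, s in zip(labels, scores) if lab == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Made-up data: activity as a fraction of wild type, and impact scores (0-100).
activity = [0.10, 0.40, 0.60, 0.90, 0.95]
impact = [85.0, 40.0, 55.0, 20.0, 10.0]
auc = roc_auc(binarize(activity, 0.5), impact)
```

Repeating the call with different cutoff values gives the AUC as a function of the threshold.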

2.3.2 Pearson's correlation coefficient

Pearson's correlation coefficient was calculated using the built-in function of Microsoft Office Excel.
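
For readers who prefer a scripted equivalent, a minimal Python sketch of the same statistic is given here. It is not the authors' procedure, and the function name is hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)
```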

3 RESULTS

The EA method to estimate the functional and clinical impact of missense mutations was evaluated in five CAGI challenges. In each one, we tested whether the experimental and the predicted values were correlated linearly, or through a more complex dependence, by plotting the average experimental values as a function of the EA scores and the average EA scores as a function of the experimental values. In order to calculate these relationships, we binned the data, often by every 20 or 10 variants when the dataset had more or fewer than 200 variants, respectively. Datasets with fewer than 20 variants were not binned, whereas coarse binning was used when the experimental values were unevenly distributed. Then, we presented the independent and unbiased assessment of the performance of each submitted prediction, according to the summary of the CAGI assessor. Finally, we presented two widely used statistical tests (the ROC curves and Pearson's correlation coefficient test), as calculated by the authors, in order to provide common ground for comparing performances across different CAGI challenges. We also applied these two tests to predictions from the two most cited mutation impact prediction methods, PolyPhen2 (Adzhubei et al., 2010) and SIFT (Ng & Henikoff, 2001).
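
The binning step described above can be sketched as follows. This is an illustrative Python version with hypothetical function names; the handling of a dataset with exactly 200 variants and of a short final bin are assumptions, since the text does not specify them.

```python
def choose_bin_size(n_variants):
    """Rule of thumb from the text; None means the dataset is left unbinned."""
    if n_variants < 20:
        return None
    # Assumption: a dataset of exactly 200 variants uses the larger bin.
    return 20 if n_variants >= 200 else 10


def bin_averages(x, y, bin_size):
    """Sort pairs by x, group them in consecutive bins, and average x and y per bin.

    The last bin may hold fewer than bin_size pairs (an assumption).
    """
    pairs = sorted(zip(x, y))
    averages = []
    for start in range(0, len(pairs), bin_size):
        chunk = pairs[start:start + bin_size]
        mean_x = sum(p[0] for p in chunk) / len(chunk)
        mean_y = sum(p[1] for p in chunk) / len(chunk)
        averages.append((mean_x, mean_y))
    return averages


# Made-up example: four variants in bins of two.
binned = bin_averages([3, 1, 2, 4], [30, 10, 20, 40], 2)
```

Calling `bin_averages` with the EA scores as `x` gives the average experimental value per EA bin, and swapping the arguments gives the average EA score per experimental bin.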

3.1 SUMO ligase (CAGI 4 - 2016)

A large library of missense mutations in human SUMO ligase was assessed for competitive growth in a high-throughput yeast-based complementation assay, by the laboratory of Professor F. Roth at University of Toronto (Weile et al., in preparation). The challenge was to predict the effect of mutations on ligase activity, experimentally determined by the change in fractional representation of each mutant clone in the competitive yeast growth assay relative to wild-type clones. Specifically, predictors were asked to submit scores between 0 (no growth) and 1 (wild-type growth) for detrimental mutations, and more than 1 for mutants with better than wild-type growth. Data were divided into three subsets of mutants. Subset 1 contained 219 single amino acid variants, each represented by at least three independent barcoded clones and therefore they were assessed with high accuracy (each barcoded clone represented an individual mutant yeast strain). Subset 2 contained 463 additional single amino acid variants, each represented by fewer than three independent barcoded clones. Subset 3 contained 4,427 alleles corresponding to clones containing two or more amino acid variants.

The EA submission (one prediction attempt) treated EA scores as fitness differences. A priori, these differences were assumed to be mostly detrimental, consistent with the nearly neutral theory of molecular evolution (Ohta, 1992). To account for gain-of-function (GOF) variants, however, we then hypothesized that substitute amino acids seen more often than the wild-type amino acid in the alignment of homologous sequences could be beneficial, so we assigned a negative (“not detrimental”) sign to the EA scores of those variants. Since EA scores vary between 0 (wild type) and 100 (loss of function), the activity of SUMO ligase mutants we submitted to CAGI was: submitEA = 1 - EA/100. Next, to combine the effect of multiple mutations (M1, M2, …, MN) on the same allele, we multiplied the effect of each mutation, as: submitEA = (1 - EA_M1/100)·(1 - EA_M2/100)·…·(1 - EA_MN/100). When we plotted the average EA scores for bins of 20 variants with similar experimental growth scores (Fig. 1A), we noted that (1) GOF variants had similar EA scores to variants with nearly wild-type activity, (2) variants with experimental growth score between 0 and 1 showed a good correlation with EA scores, and (3) variants with negative experimental growth scores had lower EA impact than variants with zero growth scores, suggesting that these variants may have some activity against the function measured by the assay. On the other hand, when we plotted the average experimental growth scores for decile bins of EA scores (Fig. 1B), we noted linear correlations for each subset, the best of which was for subset 1 (R² = 0.88), which had the highest experimental growth accuracy. This correlation was consistent with the correlations in E. coli lac repressor (Markiewicz, Kleina, Cruz, Ehret, & Miller, 1994), HIV-1 protease (Loeb et al., 1989), and human p53 (Kato et al., 2003) mutations that were used to validate the performance of EA (Katsonis & Lichtarge, 2014).
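
The conversion from EA scores to submitted growth predictions described in this paragraph can be sketched in Python as follows. The function names are hypothetical, and the treatment of negatively signed EA scores (which yield values above 1) simply follows from applying the stated formula.

```python
from math import prod


def single_variant_activity(ea_score):
    """submitEA = 1 - EA/100 for one substitution.

    A negatively signed EA score (predicted "not detrimental") gives a value above 1.
    """
    if not -100.0 <= ea_score <= 100.0:
        raise ValueError("EA score must lie in [-100, 100]")
    return 1.0 - ea_score / 100.0


def allele_activity(ea_scores):
    """Multiply the per-mutation activities of all substitutions on one allele."""
    return prod(single_variant_activity(ea) for ea in ea_scores)


# Made-up allele carrying two substitutions with EA scores of 50 and 20.
double_mutant = allele_activity([50.0, 20.0])
```

Under this multiplicative model, any single substitution with an EA score near 100 drives the predicted activity of the whole allele toward zero.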

Figure 1

SUMO ligase: Competitive growth of 5,109 alleles in a high-throughput yeast-based complementation assay. A: The average EA score for the 682 alleles that each carries a single amino acid variant (subsets 1 and 2), in groups of 20 alleles with similar growth scores. The error bars note the standard error of the mean. B: The average competitive growth score of the SUMO ligase alleles, in deciles of EA score. The data were divided in three subsets, according to the CAGI 4 challenge description. Subset 1 was the high-accuracy subset of 219 single amino acid variants for which at least three independent barcoded clones were represented. Subset 2 was the remaining 463 single amino acid variants. Subset 3 was 4,427 alleles corresponding to clones containing two or more amino acid variants. The error bars note the standard error of the mean. C: The performance of the 16 submitted predictions, according to the overall score calculated by the CAGI assessor. The assessor calculated 54 primary scores for each submission, which included Kendall rank, Spearman's rank, Pearson's, and Matthews correlation coefficients, F-score, value differences, RMSD, and ROC, among others, for the three subsets. The assessor integrated those scores to rank the original predictions, the ranks of the predictions, and a transformation guided by experimental values for each of the three subsets. The CAGI assessor calculated the overall score based on these nine values. D: The Pearson's correlation coefficients for each subset, calculated by the authors. In addition to the submitted predictions that are shown as colored bars, we calculated Pearson's correlation coefficients for the methods PolyPhen2 and SIFT, which are shown as dashed lines. We did not use PolyPhen2 and SIFT in subset 3, since integrating the effect of multiple mutations per allele is ambiguous and submissions followed different approaches. 
E: The AUC as a function of the maximum experimental standard error, for all 5,109 alleles (subsets 1, 2, and 3). Variants were classified as deleterious or neutral if they had competitive growth scores of less or more than 50, respectively. F: The AUC as a function of the threshold of the competitive growth score used to separate deleterious and neutral variants, for each subset. The dashed lines correspond to predictions from PolyPhen2 and SIFT. The lines of the plots in (E) and (F) were colored according to the colors of the bars in (C) and (D).

The CAGI assessor for this challenge carefully examined the results by using 18 different assessment metrics to compare the performance of the submissions in each subset. These metrics measured correlations of the experimental growth scores with (1) the original prediction values, (2) the ranks of the predicted values, and (3) a transformation guided by experimental values for the submitted values. Then, as an overall assessment, the CAGI assessor calculated an integrative score for each of these groups of tests and for each data subset, which yielded an overall sum and an overall rank for each method. By this process to define an overall performance score, EA ranked at the top (Fig. 1C). To be clear, the difference between EA and the second best method was small and not necessarily significant. However, all the other methods relied on machine learning and training sets, whereas EA used only the EA equation. Moreover, EA was the only submission with a better overall performance score than a simple conservation-based model developed by the CAGI assessor as a standard of success. To better understand performance, we calculated the Pearson's correlation coefficient and the ROC curves. EA's Pearson's correlation coefficients were only 0.39, 0.38, and 0.26 for the subsets 1, 2, and 3, respectively (Fig. 1D). But these were the best in each data set, including compared with SIFT and PolyPhen2 (which did not participate in the challenge). We note that the area under the ROC curve (AUC) for EA in the three subsets of this challenge was 0.73, 0.72, and 0.70, respectively, for experimental value cutoff of 0.5. These AUC values were also better than the other prediction methods, but they were below the AUC of EA in other datasets (Katsonis & Lichtarge, 2014). To understand this discrepancy, we tested whether the low performance in the ROC metric could be due to experimental uncertainty (Gallion et al., 2017). 
Indeed, when we restricted the analysis to only account for alleles, in any subset, that had low standard error (SE < 0.05) in the experiments, the AUC rose dramatically, reaching up to AUC of 0.9 (Fig. 1E). We also calculated the AUC of all predictions for nine different thresholds of the growth scores, between 0 and 1. For single mutants, the AUC of most methods increased for low thresholds, suggesting that the computational prediction could separate the partial-function variants from nonfunctional variants better than from the variants with wild-type activity (Fig. 1F). For the multivariant alleles of the subset 3, the cutoff of 0.5 appears to be optimum in separating functional from nonfunctional variants for most submitted predictions.

3.2 Pyruvate kinase (CAGI 4 - 2016)

A large set of amino acid changing mutations of the pyruvate kinase had been assayed in E. coli extracts for their effect on the enzymatic activity and the allosteric regulation of the liver isozyme (L-PYK), by the laboratory of Professor Aron W. Fenton at University of Kansas Medical Center. One subchallenge was to predict the effect of mutations on L-PYK enzyme activity, which was measured as a binary assay result (0, inactive; 1, active). A second subchallenge was to predict the ratios of equilibrium constants for the inhibition of the enzyme by alanine and of the activation of the enzyme by fructose 1,6 bisphosphate. While the first subchallenge is directly relevant to predictions made by EA, addressing the second challenge may require computational analysis beyond the scope of EA. Therefore, here, we focus on predicting the enzymatic activity of L-PYK. Data were split into two experiment subsets: (1) 113 substitutions in nine residue positions, and (2) 430 alanine-scanning mutations.

We used EA to address the enzymatic activity of L-PYK and we submitted one prediction file. EA scores vary between 0 (wild type) and 100 (loss of function), so we treated EA as the probability for a variant to be inactive and we submitted scores calculated as: submitEA = 100 - EA. The average activity predicted by EA for active and inactive variants was 30 versus 54 for subset 1 (Mann–Whitney U P value = 7 × 10⁻⁶) and 34 versus 59 for subset 2 (Mann–Whitney U P value = 10⁻⁸), respectively (Fig. 2A). When we binned every 20 mutants with similar EA scores, we noted that the vast majority of variants with EA-predicted activity of more than half of the wild-type activity were active, whereas for the remaining bins the fraction of active mutants changed almost linearly with the EA prediction (Fig. 2B). This dependence is similar to that of the T4 lysozyme dataset, which we had attributed to the sensitivity of the experimental assay (Katsonis & Lichtarge, 2014).

The CAGI assessor of this challenge used the balanced accuracy (BACC) metric to compare the performance of the submitted predictions for each experimental set. The BACC is the average of sensitivity (true-positive rate) and specificity (true-negative rate), both of which require setting a cutoff for the submitted predictions. The CAGI assessor either used a cutoff of 0.5 or calculated the optimum cutoff for each method. For the EA submitted prediction, the value of 0.5 was found to be the optimum cutoff. For each of the two experimental sets, the CAGI assessor found that EA had the top performance according to BACC, even when the cutoff was optimized for the other submitted predictions (Fig. 2C). We reached the same conclusion ourselves when we calculated the AUC of ROC: EA had AUC values of 0.8 and 0.76 in the two subsets, respectively, which were higher than the AUC values of the other submitted predictions as well as those of SIFT and PolyPhen2, which did not participate in the challenge (Fig. 2D).
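As an illustration of the metric, the following sketch computes BACC at a fixed cutoff and searches for the cutoff that maximizes it. It assumes predictions on a 0 to 1 scale in which higher values mean "more likely active"; the function names are ours and the assessor's actual implementation may differ.

```python
def balanced_accuracy(scores, labels, cutoff=0.5):
    """Average of sensitivity and specificity at a given cutoff.

    scores: predicted values (higher = more likely active).
    labels: experimental result, 1 = active, 0 = inactive.
    A variant is predicted active when its score is >= cutoff.
    """
    tp = tn = fp = fn = 0
    for score, label in zip(scores, labels):
        predicted_active = score >= cutoff
        if label and predicted_active:
            tp += 1
        elif label and not predicted_active:
            fn += 1
        elif not label and predicted_active:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must contain at least one variant")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def best_cutoff(scores, labels):
    """Return the observed score that maximizes BACC when used as cutoff
    (the smallest such score if several cutoffs tie)."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda c: balanced_accuracy(scores, labels, c))
```

Comparing balanced_accuracy(scores, labels, 0.5) with balanced_accuracy(scores, labels, best_cutoff(scores, labels)) mirrors the two ways in which the cutoff was chosen.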

3.3 NAGLU (CAGI 4 - 2016)

The enzymatic activity of NAGLU for 165 missense mutations, which were exclusively found in the ExAC dataset (Lek et al., 2016), was assessed as the percentage of the wild-type NAGLU activity by BioMarin Pharmaceutical, Inc. The challenge was to predict NAGLU activity by submitting scores between 0 (no activity) and 1 (wild-type level of activity) when the mutation effect was predicted to be detrimental, or higher than 1 when it was predicted to be beneficial. Similar to the pyruvate kinase challenge, we used EA and submitted one prediction file with scores calculated as: submitEA = 1-EA/100. When we binned every 10 variants with similar enzymatic activity, small enzymatic activities (less than half of the wild type) correlated with the average EA prediction values, but large enzymatic activities (more than half of the wild type) had similar EA scores (Fig. 3A). On the other hand, when we binned every 10 variants with similar EA scores, the average enzymatic activity correlated linearly (R² = 0.90) with EA scores (Fig. 3B).
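The binning analysis can be sketched as follows, assuming that "binning every 10 variants" means sorting the variants by one quantity and averaging consecutive groups of 10. The function names are hypothetical, and R² is taken here as the squared Pearson correlation of the bin averages.

```python
from math import sqrt


def binned_means(keys, values, bin_size=10):
    """Sort variants by `keys`, split them into consecutive bins of
    `bin_size`, and return (mean key, mean value) for each bin.
    The last bin may hold fewer variants."""
    pairs = sorted(zip(keys, values))
    bins = []
    for start in range(0, len(pairs), bin_size):
        chunk = pairs[start:start + bin_size]
        mean_key = sum(k for k, _ in chunk) / len(chunk)
        mean_value = sum(v for _, v in chunk) / len(chunk)
        bins.append((mean_key, mean_value))
    return bins


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def r_squared_of_bins(keys, values, bin_size=10):
    """R^2 of a straight-line fit through the binned averages."""
    bins = binned_means(keys, values, bin_size)
    r = pearson_r([k for k, _ in bins], [v for _, v in bins])
    return r * r
```

Calling r_squared_of_bins(ea_scores, activities) corresponds to binning by EA score (as in Fig. 3B), while swapping the two arguments corresponds to binning by measured activity (as in Fig. 3A).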

The CAGI assessor of the NAGLU challenge used three tests to compare the performance of the submissions: the root-mean-square deviation (RMSD), the Pearson product-moment correlation coefficient, and Spearman's rank correlation coefficient. The assessor did not use the well-established ROC test in their overall rank calculation, because they found that the top five submissions, which included EA, had essentially identical performance, with AUC values slightly greater than 0.8. To avoid redundancy, the assessor included in the overall rank only the best-performing submission from each predictor group when a group submitted multiple versions. According to this overall rank, EA had the third best performance (Fig. 3C). The Pearson correlation coefficient of EA was 0.54, which was the second highest and better than those of SIFT and PolyPhen2 (Fig. 3D). We also calculated the AUC of ROC for nine different threshold values of the enzymatic activity between 0 and 1 (Fig. 3E). Most predictions had their maximum AUC at small thresholds, where EA did particularly well, with an AUC of 0.86 for the threshold of enzymatic activity at 0.3, which was the highest AUC value achieved by any prediction method at any cutoff. We also tested the ROC performance when the analysis was limited to variants with small experimental standard deviations (77 variants had SD below 0.05). Indeed, there was improvement in AUC for almost all predictions, but only for large enzymatic activity thresholds, since variants with low enzymatic activity very often had low standard deviations (Fig. 3F). EA reached a maximum AUC of 0.93 for the threshold of enzymatic activity at 0.9, suggesting strong performance when the experimental measurements were very consistent. The facts that no method consistently had the best AUC of ROC at each threshold and that the relative ranks changed when the analysis was restricted to variants with consistent experimental measurements support the conclusion of the CAGI assessor that the AUC performance was indistinguishable for the top-performing methods.
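The threshold scan can be illustrated with the sketch below, which turns a continuous activity measurement into a binary label at each threshold and computes the AUC from its rank interpretation (the probability that a randomly chosen positive variant receives a higher prediction than a randomly chosen negative one). The function names are hypothetical, and treating a measured activity at or above the threshold as positive is an assumption.

```python
def roc_auc(scores, positives):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted values (higher = more likely positive).
    positives: booleans, True for the positive class.
    Ties between a positive and a negative count as one half.
    """
    pos = [s for s, p in zip(scores, positives) if p]
    neg = [s for s, p in zip(scores, positives) if not p]
    if not pos or not neg:
        raise ValueError("both classes must contain at least one variant")
    wins = 0.0
    for a in pos:
        for b in neg:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_by_threshold(predicted, measured, thresholds):
    """AUC at each activity threshold; a variant is labeled positive
    when its measured activity is >= the threshold. Thresholds that
    leave one class empty are skipped."""
    result = {}
    for t in thresholds:
        labels = [m >= t for m in measured]
        if all(labels) or not any(labels):
            continue
        result[t] = roc_auc(predicted, labels)
    return result
```

Restricting `predicted` and `measured` to variants whose experimental standard deviation is below a chosen limit before calling auc_by_threshold reproduces the kind of filtered analysis described for Fig. 3F.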

3.4 P16 (CAGI 3 - 2013)

The ability of 10 p16 variants (CDKN2A gene) to block cell proliferation was tested by Maria Chiara Scaini at the Veneto Institute of Oncology of Padova (Scaini et al., 2014). The p16 variants, like the controls of wild-type p16 (negative) and EGFP vector (positive), were expressed in a p16-null human osteosarcoma cell line and their proliferation rate was recorded for 9 days. The challenge was to predict the proliferation rate of each p16 mutant cell line relative to the positive control, given that the proliferation rate of the wild-type p16 cells was approximately 50% of the proliferation rate of the positive control cells. We used EA to estimate the impact of p16 mutations, and then we predicted that the proliferation rate would be: submitEA = 50+EA/2. Although the correlation of the experimental and predicted values was very strong, with a Pearson's r of 0.84 (Fig. 4A), the formula we used to calculate the proliferation rate from the EA scores was subpar. Setting submitEA = EA would have yielded much better agreement with the experimentally measured values.
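The contrast between a strong correlation and a subpar formula follows from a general property of the two kinds of metric. It is sketched here in notation that is ours rather than the assessor's: x_i denotes the EA score of variant i, y_i its measured proliferation rate, and a and b the offset and slope of the submitted transform.

```latex
\begin{align}
\hat{y}_i &= a + b\,x_i, \qquad b > 0 \\
r(\hat{y}, y) &= r(x, y) \\
\mathrm{RMSD}(\hat{y}, y) &= \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(a + b\,x_i - y_i\right)^{2}}
\end{align}
```

Under this reading, the submitted transform (a = 50, b = 1/2) and the alternative (a = 0, b = 1) share the same Pearson's r, and they can differ only in metrics that depend on absolute values, such as RMSD.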

The CAGI assessor of this challenge used four tests to compare the performance of the submitted predictions: Kendall's tau coefficient (τ), RMSD, ROC, and overlap within 10%. The CAGI assessor calculated an overall score from the average rank of these four tests, where EA ranked second out of 22 submissions (Fig. 4B). Of note, the best submission came from a machine learning method trained with evolutionary and structural features, but the same research group submitted three additional predictions for the same challenge that were trained on different combinations of features, and these submissions had intermediate or very poor performance. The EA submission had a higher Pearson correlation coefficient (r = 0.84) than the other submissions and than SIFT and PolyPhen2 (Fig. 4C). Also, the EA submission had a perfect ROC, with AUC = 1 in separating the proliferation rates of the variants, which had a bimodal distribution (Fig. 4D). The poor performance of SIFT and PolyPhen2 in this challenge was due to their predicting maximum impact for almost all of these variants. The best performance of EA on the Pearson coefficient and ROC metrics was consistent between our calculations and the calculations of the CAGI assessor.

3.5 CBS (CAGI 2 - 2011)

The functionality of 84 single amino acid CBS variants, found in homocystinuria patients, was tested in an in vivo yeast complementation assay by the laboratory of Professor J. Rine at UC Berkeley. The human CBS clone was expressed and functionally complemented in yeast cells that had the orthologous yeast gene CYS4 removed from the chromosome. In that assay, growth was dependent upon the level of mutant human CBS function, and the rates were expressed as a percentage relative to wild-type (human protein) growth. Two concentrations of pyridoxine, high (400 ng/ml) and low (2 ng/ml), were used. The challenge was to submit predictions of the effect of the variants on the function of CBS at both cofactor concentrations. To address this challenge, we used EA to estimate the loss of CBS activity. At high cofactor concentration, we simply set: submitEA = 100-EA. At low cofactor concentration, we scaled the EA scores to yield lower CBS activities, guided by the test data, such that an EA of 70 will yield 10% CBS activity instead of 30% (linear scaling without changing the extremes, so EA of 0 and 100 will still yield 100% and 0% CBS activity, respectively). Since most CBS variants were found to be experimentally inactive, we binned the variants into those with 0%, 0%–50%, 50%–100%, and more than 100% of the wild-type activity. As expected, the average EA score was lower for the bins of higher relative growth rate (Fig. 5A). On the other hand, binning every 10 CBS variants by their EA scores yielded strong linear correlations between growth rate and EA (Fig. 5B; R² was 0.87 and 0.93 for high and low cofactor concentration, respectively).
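One way to read the low-cofactor scaling is as a piecewise-linear map that passes through the three stated points: EA of 0 gives 100%, EA of 70 gives 10%, and EA of 100 gives 0%. The sketch below implements that reading. The piecewise-linear form and the function names are assumptions made for illustration, since the text does not give the exact formula.

```python
def cbs_activity_high(ea):
    """Predicted CBS activity (%) at high cofactor concentration: 100 - EA."""
    if not 0.0 <= ea <= 100.0:
        raise ValueError("EA scores are expected in the range 0..100")
    return 100.0 - ea


def cbs_activity_low(ea, pivot_ea=70.0, pivot_activity=10.0):
    """Predicted CBS activity (%) at low cofactor concentration.

    Assumed form: straight lines joining (0, 100), (pivot_ea,
    pivot_activity) and (100, 0), so the extremes are unchanged while
    intermediate EA scores map to lower activities than 100 - EA.
    """
    if not 0.0 <= ea <= 100.0:
        raise ValueError("EA scores are expected in the range 0..100")
    if ea <= pivot_ea:
        # Segment from (0, 100) down to the pivot point.
        return 100.0 + (pivot_activity - 100.0) * ea / pivot_ea
    # Segment from the pivot point down to (100, 0).
    return pivot_activity * (100.0 - ea) / (100.0 - pivot_ea)
```

With these defaults, an EA score of 70 maps to 30% at high cofactor concentration and to 10% at low cofactor concentration, matching the example given in the text.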

The CAGI assessor of this challenge used nine different tests for each subset of high and low pyridoxine concentration to compare the performance of the submitted predictions, including precision, recall, accuracy, RMSD, Spearman's rank correlation coefficient, F-score, and ROC, among others. Out of 20 submissions, the EA submission had the best performance in nine of the 18 tests, including those of accuracy, RMSD, and F-score, in both datasets. EA was also the best method according to the average rank of all 18 metrics used by the CAGI assessor (Fig. 5C). According to our calculations of the Pearson's correlation coefficients, the EA predictions were the best and the second best method at low and high cofactor concentrations, respectively (Fig. 5D). According to our calculation of the ROC test, EA was the second best method at both cofactor concentrations, with only a marginal difference from the top method (Fig. 5E). When the analysis was restricted to variants with low thresholds of standard deviation, to our surprise, the AUC for almost all predictions dropped, suggesting that lower standard deviations do not imply more accurate experimental measurements in this particular data set (Fig. 5F).

4 DISCUSSION

Following objective assessments across diverse challenges, these data demonstrate that EA is a robust, state-of-the-art method to estimate the mutational harm of protein-coding variations, with a consistent tendency to perform best, or nearly so. In the five CAGI challenges on the impact of genetic variations in which EA participated, EA was ranked as the top submission three times, as measured by the overall score or by the average rank of metrics chosen by the independent CAGI assessors. The other two times, EA ranked as second and third best out of 16 submitted predictions per challenge, on average. Of note, the CAGI challenges were very competitive, with many submissions performing better than PolyPhen2 and SIFT, which are well-known methods, routinely used in the literature to estimate the impact of genetic variations, but which did not participate in the recent CAGI contests. The typical ROC performance of EA was an AUC higher than 0.8. An EA AUC below 0.8 seemed to be associated with experimental inaccuracies. Conversely, highly accurate experimental data were associated with AUC values above 0.9. This is consistent with the view that experimental gold standards can themselves be fraught with uncertainties, as discussed elsewhere (Gallion et al., 2017).

Whether a method achieves a top ranking or not may often be overinterpreted. Of equal or greater value is whether a method adds orthogonal information and techniques that enrich the domain. In that respect, it is critical to stress that EA is far different from other submissions. It follows a compact and simple mathematical logic, lifted directly from elementary calculus. In so doing, it factors in homology and phylogenetic information explicitly. It sets its parameters (the magnitude of a substitution) over the evolutionary history of all proteins, and reflects the specific protein and residue of interest through the evolutionary gradient, which is computed through a set algorithm and requires no training. Still, EA tended to perform on par with or better than most machine learning approaches. These approaches, in contrast to EA, were trained on mutation data, structural stability information, physicochemical properties, and functional site annotation (e.g., known functional motifs, interaction sites, and allosteric sites) in addition to homology data. Moreover, many machine learning approaches were trained to further integrate the combined outputs from many stand-alone mutation impact predictors.

The good performance of the EA equation in CAGI challenges therefore supports the fundamental hypotheses underlying the EA theory. That is, genotype–phenotype evolution in the fitness landscape may be described by a fundamental differential equation, reminiscent of those seen in physics. This is surprising for many reasons. Clearly, the genetic code changes discretely, not smoothly. Also, far from being “infinitesimal,” some mutations bring a heavy toll on patients. More broadly, EA hinges on an evolutionary function f that is never explicitly defined. Lastly, EA is an essentially untrained expression that apparently is unaware of important details, such as a protein's structure, functions, or interactions. Despite this, CAGI objectively shows, through its blind contests assessed by independent judges, that EA is an effective, accurate, robust, and general method that performs on par with or better than sophisticated and powerful statistical and artificial intelligence techniques trained on large data sets.

These apparent paradoxes can be resolved, however, by examining the formal variables EA uses. First, EA models the impact of genetic variations, the central feature of evolution, by applying basic calculus to Sewall Wright's fitness landscape: a mutation causes a fitness displacement equal to the perturbation size times the local fitness sensitivity, that is, the gradient of the mutated position. To estimate this gradient, EA looks at evolutionary history: when this position varied among species, did their fitness change much or little? The answer is taken directly from evolutionary trees, or more accurately from the fundamental equivalence between sequence distances and fitness distance between their species (Lichtarge et al., 1996). Thus, only the evolutionary tree and its record of distances between sequences are needed to estimate f’; f itself is never required. Critically, f’ implicitly accounts for structural, dynamical, functional, and other interaction constraints that guide fitness response to point mutations. Although no statistical training is present, the gradient f’ is specific to each protein and its context. Finally, the EA equation reflects the perturbation on the species, since the evolutionary tree comparisons are between species. As such, a mutation that is deadly to an individual is in fact absent from the evolutionary records of species. By the same token, discrete mutations among individuals become melded into a slow continuous diffusion process along an evolutionary trajectory over geological time scales.
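The verbal statement above, displacement equals perturbation size times gradient, can be written compactly as follows. The symbols are chosen here for illustration and may not match the notation of Katsonis & Lichtarge (2014): φ is the fitness phenotype, γ_i is the genotype at residue position i, and X→Y is the amino acid substitution.

```latex
\begin{equation}
\Delta\varphi \;\approx\; f'(\gamma_i)\,\Delta\gamma_i
\;=\; \frac{\partial f}{\partial \gamma_i}\,\Delta\gamma_{i,\,X \to Y}
\end{equation}
```

In this sketch the gradient term is the quantity estimated from the evolutionary tree, and the perturbation term is the magnitude of the substitution, so neither factor requires an explicit form for f.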

EA currently uses only first-order terms, which were approximated by terms that imply the context. The higher-order terms of the EA equation would account for the epistatic interactions of the residues within a protein or across different proteins, and they may well improve predictions, so residue coupling information would be a valuable improvement in future developments of computing EA (Marks et al., 2011). However, for now, the first differential term of the evolutionary equation by itself has broad practical applications in identifying key functional determinants with which to predict, redesign, or mimic function and specificity (Amin et al., 2013; Rodriguez et al., 2010; Yao et al., 2003). Now, when used as part of the EA equation, it helps interpret the impact of genetic variations to prioritize mutations (Mullany et al., 2015; Rababa'h et al., 2013), assess the quality of exomic data (Koire, Katsonis, & Lichtarge, 2016), stratify head and neck cancer patient outcome (Neskey et al., 2015), and predict their response to treatment (Osman et al., 2015a; Osman et al., 2015b).
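For orientation, the place where such higher-order terms would enter can be seen in a generic Taylor expansion of the fitness function. This is a textbook expansion offered as a sketch, not necessarily the exact form the authors have in mind; φ denotes fitness, γ_i the genotype at residue i, and the symbols are ours.

```latex
\begin{equation}
\Delta\varphi \;\approx\; \sum_{i}\frac{\partial f}{\partial \gamma_i}\,\Delta\gamma_i
\;+\; \frac{1}{2}\sum_{i,j}\frac{\partial^{2} f}{\partial \gamma_i\,\partial \gamma_j}\,\Delta\gamma_i\,\Delta\gamma_j
\;+\; \dots
\end{equation}
```

The first sum corresponds to the first-order terms currently used, and the mixed second derivatives in the second sum are where coupling between pairs of residues would appear.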

In summary, CAGI is an important community exercise that objectively compares and illustrates the relative contribution of diverse methods to interpret mutations. In that light, it appears that the performance of EA is as good on prospective datasets as it was on retrospective datasets (Katsonis & Lichtarge, 2014). Arguably, the strengths of the EA approach are its simplicity combined with its generality. That is, EA is not trained, but rather relies on first principles of protein evolution. As such, EA differs profoundly from other CAGI submissions and leading methods to evaluate mutations. Moreover, it is widely applicable to any protein, since it is impervious to differences between de novo mutations and polymorphisms, to the eukaryotic, prokaryotic, or viral origin of the proteins, and to the enzymatic or multifunctional nature of the proteins. As with all homology-based methods, the number and diversity of the available homologous sequences necessary to build a sufficiently deep evolutionary tree remain a limitation, as is the absence, for now, of the second-order terms in the computation of EA. However, the mathematical framework of EA is universal and robustly recognizes the telltale patterns of evolutionary constraints. This robustness, which was shown in the CAGI contests, should make EA, and the associated server, a valuable tool for the functional and clinical interpretation of genetic variations.

ACKNOWLEDGMENTS

An EA server is available at http://mammoth.bcm.tmc.edu/EvolutionaryAction.

Disclosure statement

The authors declare no competing financial interests.

REFERENCES

Adzhubei, I., Schmidt, S., Peshkin, L., Ramensky, V., Gerasimova, A., Bork, P., … Sunyaev, S. (2010). A method and server for predicting damaging missense mutations. Nature Methods, 7(4), 248–249.
10.1038/nmeth0410-248
Amin, S., Erdin, S., Ward, R., Lua, R., & Lichtarge, O. (2013). Prediction and experimental validation of enzyme substrate specificity in protein structures. Proceedings of the National Academy of Sciences of the United States of America, 110(45), E4195–E4202.
10.1073/pnas.1305162110
Breen, M. S., Kemena, C., Vlasov, P. K., Notredame, C., & Kondrashov, F. A. (2012). Epistasis as the primary factor in molecular evolution. Nature, 490(7421), 535–538.
10.1038/nature11510
Bromberg, Y., & Rost, B. (2007). SNAP: Predict effect of non-synonymous polymorphisms on function. Nucleic Acids Research, 35(11), 3823–3835.
10.1093/nar/gkm238
Buckland, M., & Gey, F. (1994). The relationship between recall and precision. Journal of the American Society for Information Science, 45(1), 12–19.
10.1002/(SICI)1097-4571(199401)45:1<12::AID-ASI2>3.0.CO;2-L
Capriotti, E., Calabrese, R., Fariselli, P., Martelli, P. L., Altman, R. B., & Casadio, R. (2013). WS-SNPs&GO: A web server for predicting the deleterious effect of human protein variants using functional annotation. BMC Genomics, 14(Suppl 3), S6.
10.1186/1471-2164-14-S3-S6
Cardoso, J. G., Andersen, M. R., Herrgård, M. J., & Sonnenschein, N. (2015). Analysis of genetic variation and potential applications in genome-scale metabolic modeling. Frontiers in Bioengineering and Biotechnology, 3, 13.
10.3389/fbioe.2015.00013
Carter, H., Douville, C., Stenson, P. D., Cooper, D. N., & Karchin, R. (2013). Identifying Mendelian disease genes with the variant effect scoring tool. BMC Genomics, 14(Suppl 3), S3.
10.1186/1471-2164-14-S3-S3
Castellana, S., & Mazza, T. (2013). Congruency in the prediction of pathogenic missense mutations: State-of-the-art web-based tools. Briefings in Bioinformatics, 14(4), 448–459.
10.1093/bib/bbt013
Chan, P. A., Duraisamy, S., Miller, P. J., Newell, J. A., McBride, C., Bond, J. P., … Grimm, A. J. (2007). Interpreting missense variants: Comparing computational methods in human disease genes CDKN2A, MLH1, MSH2, MECP2, and tyrosinase (TYR). Human Mutation, 28(7), 683–693.
10.1002/humu.20492
Choi, M., Scholl, U. I., Ji, W., Liu, T., Tikhonova, I. R., Zumbo, P., … Sanjad, S. (2009). Genetic diagnosis by whole exome capture and massively parallel DNA sequencing. Proceedings of the National Academy of Sciences of the United States of America, 106(45), 19096–19101.
10.1073/pnas.0910672106
Choi, Y., Sims, G. E., Murphy, S., Miller, J. R., & Chan, A. P. (2012). Predicting the functional effect of amino acid substitutions and indels. PLoS One, 7(10), e46688.
10.1371/journal.pone.0046688
Coyne, J. A., & Orr, H. A. (1998). The evolutionary genetics of speciation. Philosophical Transactions of the Royal Society of London Series B: Biological Sciences, 353(1366), 287–305.
10.1098/rstb.1998.0210
Erdin, S., Ward, R. M., Venner, E., & Lichtarge, O. (2010). Evolutionary trace annotation of protein function in the structural proteome. Journal of Molecular Biology, 396(5), 1451–1473.
10.1016/j.jmb.2009.12.037
Fariselli, P., Martelli, P. L., Savojardo, C., & Casadio, R. (2015). INPS: Predicting the impact of non-synonymous variations on protein stability from sequence. Bioinformatics, 31(17), 2816–2821.
10.1093/bioinformatics/btv291
Fisher, R. A. (1930). The genetical theory of natural selection: A complete variorum edition. Oxford, UK: Oxford University Press.
Flanagan, S. E., Patch, A-M., & Ellard, S. (2010). Using SIFT and PolyPhen to predict loss-of-function and gain-of-function mutations. Genetic Testing and Molecular Biomarkers, 14(4), 533–537.
10.1089/gtmb.2010.0036
Gallion, J., Koire, A., Katsonis, P., Schoenegge, A. M., Bouvier, M., & Lichtarge, O. (2017). Predicting phenotype from genotype: Improving accuracy through more robust experimental and computational modeling. Human Mutation, 38(5), 569–580.
10.1002/humu.23193
Henikoff, S., & Henikoff, J. (1992). Amino acid substitution matrices from protein blocks. Proceedings of the National Academy of Sciences of the United States of America, 89(22), 10915–10919.
10.1073/pnas.89.22.10915
Hicks, S., Wheeler, D. A., Plon, S. E., & Kimmel, M. (2011). Prediction of missense mutation functionality depends on both the algorithm and sequence alignment employed. Human Mutation, 32(6), 661–668.
10.1002/humu.21490
Jordan, D., Ramensky, V., & Sunyaev, S. (2010). Human allelic variation: Perspective from protein function, structure, and evolution. Current Opinion in Structural Biology, 20(3), 342–350.
10.1016/j.sbi.2010.03.006
Karchin, R., Diekhans, M., Kelly, L., Thomas, D. J., Pieper, U., Eswar, N., … Sali, A. (2005). LS-SNP: Large-scale annotation of coding non-synonymous SNPs based on multiple information sources. Bioinformatics, 21(12), 2814–2820.
10.1093/bioinformatics/bti442
Kato, S., Han, S-Y., Liu, W., Otsuka, K., Shibata, H., Kanamaru, R., & Ishioka, C. (2003). Understanding the function–structure and function–mutation relationships of p53 tumor suppressor protein by high-resolution missense mutation analysis. Proceedings of the National Academy of Sciences of the United States of America, 100(14), 8424–8429.
10.1073/pnas.1431692100
Katsonis, P., Koire, A., Wilson, S. J., Hsu, T. K., Lua, R. C., Wilkins, A. D., & Lichtarge, O. (2014). Single nucleotide variations: Biological impact and theoretical interpretation. Protein Science, 23(12), 1650–1666.
10.1002/pro.2552
Katsonis, P., & Lichtarge, O. (2014). A formal perturbation equation between genotype and phenotype determines the Evolutionary Action of protein-coding variations on fitness. Genome Research, 24(12), 2050–2058.
10.1101/gr.176214.114
Kircher, M., Witten, D. M., Jain, P., O'Roak, B. J., Cooper, G. M., & Shendure, J. (2014). A general framework for estimating the relative pathogenicity of human genetic variants. Nature Genetics, 46(3), 310–315.
10.1038/ng.2892
Koire, A., Katsonis, P., & Lichtarge, O. (2016). Repurposing germline exomes of the cancer genome atlas demands a cautious approach and sample-specific variant filtering. Pacific Symposium on Biocomputing, 21, 207–218.
Lek, M., Karczewski, K. J., Minikel, E. V., Samocha, K. E., Banks, E., Fennell, T., … Exome Aggregation Consortium. (2016). Analysis of protein-coding genetic variation in 60,706 humans. Nature, 536(7616), 285–291.
10.1038/nature19057
Li, B., Krishnan, V. G., Mort, M. E., Xin, F., Kamati, K. K., Cooper, D. N., … Radivojac, P. (2009). Automated inference of molecular mechanisms of disease from amino acid substitutions. Bioinformatics, 25(21), 2744–2750.
10.1093/bioinformatics/btp528
Lichtarge, O., Bourne, H., & Cohen, F. (1996). An evolutionary trace method defines binding surfaces common to protein families. Journal of Molecular Biology, 257(2), 342–358.
10.1006/jmbi.1996.0167
Lichtarge, O., & Wilkins, A. (2010). Evolution: A guide to perturb protein function and networks. Current Opinion in Structural Biology, 20(3), 351–359.
10.1016/j.sbi.2010.04.002
Liu, X., Jian, X., & Boerwinkle, E. (2011). dbNSFP: A lightweight database of human nonsynonymous SNPs and their functional predictions. Human Mutation, 32(8), 894–899.
10.1002/humu.21517
Loeb, D., Swanstrom, R., Everitt, L., Manchester, M., Stamper, S., & Hutchison, C. (1989). Complete mutagenesis of the HIV-1 protease. Nature, 340(6232), 397–400.
10.1038/340397a0
Marini, N. J., Thomas, P. D., & Rine, J. (2010). The use of orthologous sequences to predict the impact of amino acid substitutions on protein function. PLoS Genetics, 6(5), e1000968.
10.1371/journal.pgen.1000968
Markiewicz, P., Kleina, L., Cruz, C., Ehret, S., & Miller, J. (1994). Genetic studies of the lac repressor. XIV. Analysis of 4000 altered Escherichia coli lac repressors reveals essential and non-essential residues, as well as “spacers” which do not require a specific sequence. Journal of Molecular Biology, 240(5), 421–433.
10.1006/jmbi.1994.1458
Marks, D. S., Colwell, L. J., Sheridan, R., Hopf, T. A., Pagnani, A., Zecchina, R., & Sander, C. (2011). Protein 3D structure computed from evolutionary sequence variation. PLoS One, 6(12), e28766.
10.1371/journal.pone.0028766
Mihalek, I., Res, I., & Lichtarge, O. (2004). A family of evolution-entropy hybrid methods for ranking protein residues by importance. Journal of Molecular Biology, 336(5), 1265–1282.
10.1016/j.jmb.2003.12.078
Miosge, L. A., Field, M. A., Sontani, Y., Cho, V., Johnson, S., Palkova, A., … Lyon, S. (2015). Comparison of predicted and actual consequences of missense mutations. Proceedings of the National Academy of Sciences of the United States of America, 112(37), E5189–E5198.
10.1073/pnas.1511585112
Mullany, L. K., Wong, K-K., Marciano, D. C., Katsonis, P., King-Crane, E. R., Ren, Y. A., … Richards, J. S. (2015). Specific TP53 mutants overrepresented in ovarian cancer impact CNV, TP53 activity, responses to nutlin-3a, and cell survival. Neoplasia, 17(10), 789–803.
10.1016/j.neo.2015.10.003
Neskey, D. M., Osman, A. A., Ow, T. J., Katsonis, P., McDonald, T., Hicks, S. C., … Patel, A. (2015). Evolutionary action score of TP53 identifies high-risk mutations associated with decreased survival and increased distant metastases in head and neck cancer. Cancer Research, 75(7), 1527–1536.
10.1158/0008-5472.CAN-14-2735
Ng, P., & Henikoff, S. (2001). Predicting deleterious amino acid substitutions. Genome Research, 11(5), 863–874.
10.1101/gr.176601
Niroula, A., Urolagin, S., & Vihinen, M. (2015). PON-P2: Prediction method for fast and reliable identification of harmful variants. PLoS One, 10(2), e0117380.
10.1371/journal.pone.0117380
Ohta, T. (1992). The nearly neutral theory of molecular evolution. Annual Review of Ecology and Systematics, 23, 263–286.
10.1146/annurev.es.23.110192.001403
Orr, H. A. (2005). The genetic theory of adaptation: A brief history. Nature Reviews Genetics, 6(2), 119–127.
10.1038/nrg1523
Osman, A. A., Monroe, M. M., Alves, M. V. O., Patel, A. A., Katsonis, P., Fitzgerald, A. L., … Caulin, C. (2015a). Wee-1 kinase inhibition overcomes cisplatin resistance associated with high-risk TP53 mutations in head and neck cancer through mitotic arrest followed by senescence. Molecular Cancer Therapeutics, 14(2), 608–619.
10.1158/1535-7163.MCT-14-0735-T
Osman, A. A., Neskey, D. M., Katsonis, P., Patel, A. A., Ward, A. M., Hsu, T.-K., … Alves, M. O. (2015b). Evolutionary action score of TP53 coding variants is predictive of platinum response in head and neck cancer patients. Cancer Research, 75(7), 1205–1215.
10.1158/0008-5472.CAN-14-2729
Overington, J., Donnelly, D., Johnson, M. S., Sali, A., & Blundell, T. L. (1992). Environment-specific amino acid substitution tables: Tertiary templates and prediction of protein folds. Protein Science, 1(2), 216–226.
10.1002/pro.5560010203
Rababa'h, A., Craft, J. W., Wijaya, C. S., Atrooz, F., Fan, Q., Singh, S., … McConnell, B. K. (2013). Protein kinase A and phosphodiesterase-4D3 binding to coding polymorphisms of cardiac muscle anchoring protein (mAKAP). Journal of Molecular Biology, 425(18), 3277–3288.
10.1016/j.jmb.2013.06.014
Reva, B., Antipin, Y., & Sander, C. (2007). Determinants of protein function revealed by combinatorial entropy optimization. Genome Biology, 8(11), R232.
10.1186/gb-2007-8-11-r232
Reva, B., Antipin, Y., & Sander, C. (2011). Predicting the functional impact of protein mutations: Application to cancer genomics. Nucleic Acids Research, 39(17), e118.
10.1093/nar/gkr407
Rodriguez, G., Yao, R., Lichtarge, O., & Wensel, T. (2010). Evolution-guided discovery and recoding of allosteric pathway specificity determinants in psychoactive bioamine receptors. Proceedings of the National Academy of Sciences of the United States of America, 107(17), 7787–7792.
10.1073/pnas.0914877107
Scaini, M. C., Minervini, G., Elefanti, L., Ghiorzo, P., Pastorino, L., Tognazzo, S., … Bianchi-Scarrà, G. (2014). CDKN2A unclassified variants in familial malignant melanoma: Combining functional and computational approaches for their assessment. Human Mutation, 35(7), 828–840.
10.1002/humu.22550
Schwarz, J. M., Cooper, D. N., Schuelke, M., & Seelow, D. (2014). MutationTaster2: Mutation prediction for the deep-sequencing age. Nature Methods, 11(4), 361–362.
10.1038/nmeth.2890
Smith, J. M. (1970). Natural selection and the concept of a protein space. Nature, 225(5232), 563–564.
10.1038/225563a0
Stone, E. A., & Sidow, A. (2005). Physicochemical constraint violation by missense substitutions mediates impairment of protein function and disease severity. Genome Research, 15(7), 978–986.
10.1101/gr.3804205
Tchernitchko, D., Goossens, M., & Wajcman, H. (2004). In silico prediction of the deleterious effect of a mutation: Proceed with caution in clinical genetics. Clinical Chemistry, 50(11), 1974–1978.
10.1373/clinchem.2004.036053
Walters-Sen, L. C., Hashimoto, S., Thrush, D. L., Reshmi, S., Gastier-Foster, J. M., Astbury, C., & Pyatt, R. E. (2015). Variability in pathogenicity prediction programs: Impact on clinical diagnostics. Molecular Genetics & Genomic Medicine, 3(2), 99–110.
10.1002/mgg3.116
Ward, R. M., Venner, E., Daines, B., Murray, S., Erdin, S., Kristensen, D. M., & Lichtarge, O. (2009). Evolutionary Trace Annotation Server: Automated enzyme function prediction in protein structures using 3D templates. Bioinformatics, 25(11), 1426–1427.
10.1093/bioinformatics/btp160
Wei, Q., Xu, Q., & Dunbrack, R. L. (2013). Prediction of phenotypes of missense mutations in human proteins from biological assemblies. Proteins: Structure, Function, and Bioinformatics, 81(2), 199–213.
10.1002/prot.24176
Weile, J., Cote, A. G., Sun, S., Knapp, J., Verby, M., Yang, F., … Roth, F. P. An atlas of functional amino acid changes in human SUMO and SUMO ligase. (In preparation).
Wright, S. (1932). The roles of mutation, inbreeding, crossbreeding, and selection in evolution. Proceedings of the VI International Congress of Genetics, Vol. 1 (pp. 356–366). Ithaca, New York: Genetics Society of America.
Yao, H., Kristensen, D., Mihalek, I., Sowa, M., Shaw, C., Kimmel, M., … Lichtarge, O. (2003). An accurate, sensitive, and scalable method to identify functional sites in protein structures. Journal of Molecular Biology, 326(1), 255–261.
10.1016/S0022-2836(02)01336-0
Yue, P., & Moult, J. (2006). Identification and analysis of deleterious human SNPs. Journal of Molecular Biology, 356(5), 1263–1274.
10.1016/j.jmb.2005.12.025
