Volume 33, Issue 4 pp. 642-650

Informatics

Classification of mismatch repair gene missense variants with PON-MMR^†

Heidi Ali

Institute of Biomedical Technology, FI-33014 University of Tampere, Finland, and BioMeditech, Tampere, Finland

Ayodeji Olatubosun

Institute of Biomedical Technology, FI-33014 University of Tampere, Finland, and BioMeditech, Tampere, Finland

Mauno Vihinen (Corresponding Author)

Institute of Biomedical Technology, FI-33014 University of Tampere, Finland, and BioMeditech, Tampere, Finland

Department of Experimental Medical Science, Lund University, Sweden

First published: 30 January 2012

https://doi.org/10.1002/humu.22038

^†Communicated by A. Jamie Cuticchia

Abstract

Numerous mismatch repair (MMR) gene variants have been identified in Lynch syndrome and other cancer patients, but knowledge about their pathogenicity is frequently missing. The diagnosis and treatment of patients would benefit from knowing which variants are disease related. Bioinformatic approaches are well suited to the problem and can handle large numbers of cases. Functional effects were revealed based on literature for 168 MMR missense variants. Performance of numerous prediction methods was tested with this dataset. Among the tested tools, only the results of tolerance prediction methods correlated to functional information, however, with poor performance. Therefore, a novel consensus-based predictor was developed. The novel prediction method, pathogenic-or-not mismatch repair (PON-MMR), achieved accuracy of 0.87 and Matthews correlation coefficient of 0.77 on the experimentally verified variants. When applied to 616 MMR cases with unknown effects, 81 missense variants were predicted to be pathogenic and 167 neutral. With PON-MMR, the number of MMR missense variants with unknown effect was reduced by classifying a large number of cases as likely pathogenic or benign. The results can be used, for example, to prioritize cases for experimental studies and assist in the classification of cases. Hum Mutat 33:642–650, 2012. © 2012 Wiley Periodicals, Inc.

Introduction

Lynch syndrome or hereditary nonpolyposis colorectal cancer (HNPCC) accounts for approximately 2–5% of colorectal cancers [Hampel et al., 2008; Lynch et al., 2009]. In addition to colorectal cancer, the patients are at risk of some extracolonic cancers (endometrium, stomach, ovary, kidney, urinary tract, biliary tract, small intestine, brain, and skin tumors). The syndrome is caused by germline mutations in mismatch repair (MMR) genes. These genes are MLH1 (MIM# 120436), MLH3 (MIM# 604395), MSH2 (MIM# 609309), MSH6 (MIM# 600678), PMS1 (MIM# 600258), PMS2 (MIM# 600259), or TGFBR2 (MIM# 190182). The role of PMS1, TGFBR2, and MLH3 in Lynch syndrome is still elusive. MMR is an evolutionarily conserved DNA repair system that recognizes and repairs base–base mispairs and insertion–deletion loops arising during DNA replication and recombination. MMR malfunction affects DNA stability, which can result in microsatellite instability.

Thousands of MMR variants have been identified and stored in databases including InSiGHT (http://www.insight-group.org) and MMR Gene Unclassified Variants (http://mmruv.info/), but the relevance to cancer has been verified in only a small number of cases. Even for experimentally studied cases, the situation may be confusing; for example, the R217C variant in MLH1 has been classified as pathogenic [Fan et al., 2007], neutral [Takahashi et al., 2007; Trojan et al., 2002], and as having unknown effect [Ellison et al., 2001]. In addition to experimental methods, the pathogenicity of a variant can be predicted with bioinformatic methods [Thusberg and Vihinen, 2009]. Bioinformatic predictors provide valuable information faster, more easily, and more cheaply than laboratory methods.

Experts in the field have organized into the International Agency for Research on Cancer (IARC) unclassified genetic variant working group to establish standards for the classification of variants, including the terminology, evaluation, and validation of data [Tavtigian et al., 2008]. IARC has suggested a five-tier classification system [Plon et al., 2008] based on the probability of being pathogenic derived from clinical, genetic, in vitro, in vivo, and in silico information. Only a small number of MMR variants have been classified so far. The most extensive effort for MMR genes and proteins has been undertaken by the InSiGHT Interpretation Committee; however, results have not yet been published.

We developed a dedicated prediction tool for MMR missense variants and applied it to analyze 616 unclassified variants (UVs). We reduced the number of UVs substantially by classifying 81 MMR missense variants as disease related and 167 as neutral. The results can be utilized to prioritize variants for further experimental validation and diagnosis of Lynch syndrome and other cancers together with clinical and other information.

Materials and Methods

MMR Missense Variants

Altogether 784 MMR missense variants for Lynch syndrome patients were downloaded (January 27, 2011) from the International Society for Gastrointestinal Hereditary Tumours (InSiGHT) database at http://www.InSiGHT-group.org. The unique MMR variants were distributed to five MMR proteins as follows: MLH1 (287), MLH3 (18), MSH2 (226), MSH6 (156), and PMS2 (97).

Functional effects were used as the signs of the pathogenicity of the variants. Information about functional assays was searched from literature. The experimentally verified functional effects of MMR missense variants were collected from articles. The most widely applied methods in these studies included in vitro MMR activity [Christensen et al., 2009; Drost et al., 2010; Jäger et al., 2001; Kansikas et al., 2011; Kariola et al., 2004; Korhonen et al., 2008; Nyström-Lahti et al., 2002; Ollila et al., 2006a, b; Raevaara et al., 2004, 2005; Takahashi et al., 2007; Trojan et al., 2002]. Additional methods were in vivo DNA MMR assays in yeast [Ellison et al., 2001], yeast two-hybrid system [Fan et al., 2007; Ou et al., 2009], and RNA expression [Pagenstecher et al., 2006].

Some variants had been studied several times and if the reports disagreed on the effect, the conclusions of the latest, most extensive, consistent, and systematic studies of Kansikas et al. [Kansikas et al., 2011] and Takahashi et al. [Takahashi et al., 2007] were utilized. With the cases investigated by Kansikas et al., special attention was given to MMR activity, microsatellite instability, expression, and localization. Cases for which at least two methods agreed were classified as disease causing or tolerated. Variants analyzed by Takahashi et al. were grouped based on in vitro MMR activity by using 60% (as recommended by the authors) as a threshold. In their study, gene expression values varied too much to be informative, and the correlation with dominant mutation effect was poor. Enzyme activity was therefore the only reliable type of information, as in the other studies, where experimental results were used as the basis for the variant classification.

The studies of Kansikas et al. and Takahashi et al. agreed on the classification of all 11 overlapping variants. The results of all predictive tools were excluded as unreliable and because their use would have been circular in the development of a novel prediction tool. Altogether, data were available for 168 functionally tested MMR missense variants, out of which 80 were pathogenic. This dataset had 123 variants in MLH1, 11 in MLH3, 27 in MSH2, and 7 in MSH6 protein. The remaining 616 unclassified MMR missense variants were distributed to proteins as follows: 164 for MLH1, 7 for MLH3, 199 for MSH2, 149 for MSH6, and 97 for PMS2. There were no missense variants in PMS1 and TGFBR2.

Prediction of Pathogenicity

Pathogenic-or-not-pipeline (PON-P) [Olatubosun et al., 2012] at http://bioinf.uta.fi/PON-P was utilized for the submission, prediction, and analysis of protein sequences and MMR missense variants with various bioinformatic prediction methods. Variant tolerance prediction methods included Mutation Taster [Schwarz et al., 2010], MutPred [Li et al., 2009], nsSNPAnalyzer [Bao et al., 2005], PhD-SNP [Capriotti et al., 2006], PMut [Ferrer-Costa et al., 2005], PolyPhen2 [Adzhubei et al., 2010], SIFT [Ng and Henikoff, 2003], SNAP [Bromberg and Rost, 2007], and SNPs&GO [Calabrese et al., 2009]. Sequence-based stability effect predictions were performed with SCPRED [Dosztányi et al., 1997], MUPRO [Cheng et al., 2006], and I-Mutant 3.0 [Capriotti et al., 2005], and structure-based predictions with SCide (stabilization centers) [Dosztanyi et al., 2003] and SRide (stabilizing residues) [Magyar et al., 2005] for MSH2 and MSH6 variants.

Structural disorder was predicted with MetaPrDOS [Ishida and Kinoshita, 2008], PrDOS [Ishida and Kinoshita, 2007], DISOPRED2 [Ward et al., 2004], DisEMBL [Linding et al., 2003], DISPROT (VSL2P) [Peng et al., 2006], DISpro [Cheng et al., 2005], IUPred [Dosztanyi et al., 2005], and POODLE-S [Shimizu et al., 2007].

All the variants were entered to protein aggregation predictors Aggrescan [Conchillo-Sole et al., 2007] and Waltz [Oliveberg, 2010]. The interatomic contacts of variants in MSH2 and MSH6 protein structure were checked with CMA (Contact Map Analysis) [Sobolev et al., 2005], CSU (Contacts of Structural Units) [Sobolev et al., 1999], and RankViaContact [Shen and Vihinen, 2003].

The default parameters were utilized in all the prediction methods, and only the protein sequence and MMR missense variant were provided as input. Blastp [Altschul et al., 1997] was used to search for homologous sequences in NCBI nonredundant sequence database for all the MMR proteins. Multiple sequence alignments containing only full-length sequences were obtained with ClustalW [Chenna et al., 2003]. We selected sequences only with known functions and removed putative or hypothetical sequences. Conservation for each variant position in sequence alignment was determined with PAM250 and Blosum 62 amino acid substitution matrices.

Quality Parameters for Tolerance Prediction Methods

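The quality of the tolerance prediction methods was measured by six parameters: precision (or positive predictive value, PPV), negative predictive value (NPV), specificity, sensitivity, accuracy, and Matthews correlation coefficient (MCC) as follows:

$$\mathrm{PPV} = \frac{TP}{TP + FP}, \qquad \mathrm{NPV} = \frac{TN}{TN + FN}$$

$$\mathrm{Specificity} = \frac{TN}{TN + FP}, \qquad \mathrm{Sensitivity} = \frac{TP}{TP + FN}$$

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

$$\mathrm{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}$$

where TP (true positive) is the number of positive (disease related) cases that were correctly predicted, TN (true negative) is the number of negative (benign) cases correctly predicted, FP (false positive) is the number of negative cases incorrectly predicted, and FN (false negative) is the number of positive cases incorrectly predicted.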

In order to be able to compare various methods with the different numbers of predicted cases, the numbers of negative cases were normalized to be equal with those for positive cases.
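The six measures and the class normalization can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `quality_measures` and the `normalize` flag are hypothetical, and the normalization shown (rescaling the negative counts so that both classes have the same size) is only one plausible reading of the sentence above.

```python
import math


def quality_measures(tp, fp, tn, fn, normalize=False):
    """Return the six quality measures for one predictor's confusion counts."""
    if normalize:
        # Assumption: rescale the negative class (TN, FP) so that it has
        # as many cases as the positive class (TP + FN).
        scale = (tp + fn) / (tn + fp)
        tn, fp = tn * scale, fp * scale
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "precision": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "specificity": tn / (tn + fp),
        "sensitivity": tp / (tp + fn),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "mcc": (tp * tn - fp * fn) / mcc_denominator,
    }


# Counts for SNPs&GO as listed in Table 2.
snps_go = quality_measures(tp=59, fp=12, tn=74, fn=23, normalize=True)
```

For the SNPs&GO counts in Table 2 (TP 59, FP 12, TN 74, FN 23), the sketch gives a sensitivity of about 0.72 and a specificity of about 0.86. These two measures are computed within a single class, so this normalization does not change them.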

Novel Classifier

To harness the power of multiple prediction methods, a new consensus predictor was developed to identify variants that are highly likely to be pathogenic, neutral, or of unknown pathogenicity status. Outputs were combined from five tolerance predictors: PhD-SNP, PolyPhen2, SIFT, SNAP, and SNPs&GO. For each predictor, a weight is calculated based on its accuracy as follows:

where the weight and accuracy of predictor i are w_i and acc_i, respectively. This weight-derivation formulation has previously been applied by Opitz and Shavlik [Opitz and Shavlik, 1996]. The accuracy of each program was evaluated on the set of variants with known pathogenicity status.
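The weight equation is not reproduced in this text. Assuming the accuracy-proportional ensemble weighting usually attributed to Opitz and Shavlik, with n denoting the number of combined predictors (five here), a form consistent with the description would be:

```latex
w_i = \frac{acc_i}{\sum_{j=1}^{n} acc_j}
```

Under this assumption the weights sum to one, so a more accurate predictor contributes proportionally more to the consensus.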

To utilize all the information provided by the predictors, the reliability output from each method was scaled from zero to one. PhD-SNP, SNAP, and SNPs&GO provide in addition to the predicted class, the reliability of the prediction. For these methods, the pathogenicity score was calculated as

The pathogenicity score for PolyPhen2 was set to 0 for benign predictions and 1 for pathogenic predictions.

The pathogenicity scores were formulated such that the higher the reliability and probability of a prediction, the closer the pathogenicity score approaches 1 for pathogenic predictions, or 0 for neutral predictions. Lower reliability or probability induces the pathogenicity score to approach 0.5 in both cases.

Based on the pathogenicity score (ps_i) and the weights (w_i), a consensus prediction was computed:

The upper and lower cutoff values were established such that variants on the evaluation set having pathogenicity score greater than the upper cutoff value 0.7615 are classified as pathogenic, those having scores lower than the lower cutoff value 0.351 are classified as neutral, and those in-between left unclassified.
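The consensus scheme can be illustrated with a short Python sketch. Only the two cutoff values come from the text. The rest are assumptions made for illustration: the function names, the accuracy-proportional weights, the linear mapping from a 0 to 1 reliability to a pathogenicity score, and the weighted sum. The accuracies used in the example are the Table 2 values for the five combined predictors.

```python
UPPER_CUTOFF = 0.7615  # from the text: scores above this are pathogenic
LOWER_CUTOFF = 0.351   # from the text: scores below this are neutral


def predictor_weights(accuracies):
    """Assumed weighting: each accuracy divided by the sum of all accuracies."""
    total = sum(accuracies.values())
    return {name: acc / total for name, acc in accuracies.items()}


def pathogenicity_score(predicted_pathogenic, reliability):
    """Assumed mapping: reliability in [0, 1] moves the score away from 0.5,
    toward 1 for pathogenic and toward 0 for neutral predictions."""
    offset = reliability / 2
    return 0.5 + offset if predicted_pathogenic else 0.5 - offset


def consensus_score(scores, weights):
    """Assumed consensus: weighted sum of the per-predictor scores."""
    return sum(weights[name] * scores[name] for name in scores)


def classify(score):
    if score > UPPER_CUTOFF:
        return "pathogenic"
    if score < LOWER_CUTOFF:
        return "neutral"
    return "unclassified"


# Illustrative accuracies taken from Table 2.
accuracies = {"PhD-SNP": 0.79, "PolyPhen2": 0.70, "SIFT": 0.66,
              "SNAP": 0.64, "SNPs&GO": 0.79}
weights = predictor_weights(accuracies)

# Hypothetical variant: every predictor calls it pathogenic with full reliability.
scores = {name: pathogenicity_score(True, 1.0) for name in accuracies}
label = classify(consensus_score(scores, weights))
```

With unanimous, fully reliable pathogenic calls the consensus score is close to 1 and the variant is labeled pathogenic. Low-reliability calls pull every score toward 0.5, which places the variant in the unclassified band between the two cutoffs.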

Structural Effects of MSH2 and MSH6 Missense Variants

The effects of MSH2 and MSH6 missense variants were studied based on the structure of the heterodimer in PDB entry 2O8B [Warren et al., 2007]. Recognition of secondary structural elements in proteins was done with STRIDE [Heinig and Frishman, 2004] and visualization with program Pymol [Schrödinger, 2010].

Results

Our aim was to group previously unclassified MMR missense variants as pathogenic or neutral. To do this, we first investigated the suitability of a wide spectrum of prediction methods, in total 30 programs, to classify experimentally verified MMR variants. After finding deficiencies in prediction performance, we developed a novel classifier.

Testing Prediction Method Performance with Known MMR Missense Variants

We retrieved 168 experimentally verified MMR missense variants with known functional effect from the literature (Table 1), of which 80 were pathogenic and 88 neutral. The variants have a highly biased distribution in the MMR proteins. MLH1 contains the majority (123 cases, 73%) of the variants.

Table 1. MLH1, MLH3, MSH2, and MSH6 Variants with Experimentally Verified Functional Effects

The dataset of cases with functional information was utilized to test the suitability of a large number of bioinformatic prediction methods. The distinct prediction method categories included tolerance, stability, disorder, aggregation, interatomic contacts, and sequence conservation. Of these, only the tolerance prediction methods demonstrated correlation to experimental results and thus were employed in subsequent studies.

The performance of the tolerance prediction methods, as analyzed with six quality measures, is displayed in Table 2. The best individual method measured by accuracy (0.8) and MCC (0.61) is nsSNPAnalyzer followed by SNPs&GO, which has the highest precision (0.83) and specificity (0.86). Mutation Taster has relatively low accuracy (0.63) and MCC (0.37), but the best sensitivity (0.98) and NPV (0.93) values. None of the individual methods can provide highly accurate results alone.

Table 2. Performance of the Tolerance Prediction Programs with 168 MMR Missense Variants with Known Functional Effects

	Mutation Taster	MutPred	nsSNPAnalyzer	PhD-SNP	PolyPhen	SIFT	SNAP	SNPs&GO
TP	80	77	67	71	75	70	73	59
FP	59	50	20	25	43	45	55	12
TN	25	36	50	61	43	41	31	74
FN	2	4	9	11	7	12	3	23
Cases P/N^a	82/84	81/86	76/70	82/86	82/86	82/86	76/86	82/86
Total number^b	166	167	146	168	168	168	162	168
Accuracy^c	0.63	0.68	0.80	0.79	0.70	0.66	0.64	0.79
Precision^c	0.58	0.61	0.77	0.74	0.64	0.61	0.57	0.83
Specificity^c	0.30	0.42	0.71	0.71	0.50	0.48	0.36	0.86
Sensitivity^c	0.98	0.95	0.88	0.87	0.91	0.85	0.96	0.72
NPV^c	0.93	0.90	0.85	0.85	0.86	0.77	0.91	0.76
MCC^c	0.37	0.43	0.61	0.58	0.45	0.36	0.39	0.59

^aNumber of experimentally verified pathogenic (P) and neutral (N) cases predicted by the program.
^bTotal number of cases predicted by the program.
^cCalculated from normalized numbers.

MMR Missense Variant Classification by Consensus Predictor

As only tolerance prediction methods correlated with the experimental MMR missense variant effects, we utilized them to develop our own method. For that purpose, we combined the predictions of five tolerance predictors: PhD-SNP, PolyPhen2, SIFT, SNAP, and SNPs&GO. We introduced a pathogenicity score that is calculated from the classifications of the individual classifiers and the reliability of these predictions. The cutoff values of the consensus predictor were optimized to be 0.351 and 0.7615. Because not all of the utilized programs could predict the outcome of all 168 cases, 162 training variants were available, and the consensus predictor classified 95 of them as pathogenic or neutral. When tested with these 95 variants, the optimized consensus predictor has improved accuracy (0.87), precision (0.81), specificity (0.77), sensitivity (0.97), NPV (0.65), and MCC (0.77) in comparison with the individual methods.

The new predictor was used to classify the dataset of 616 variants with unknown effect. Predictions with high score were obtained for 248 variants (40.3%) of which 81 were predicted to be pathogenic and 167 neutral (Table 3). The MMR consensus classifier called PON-MMR (http://bioinf.uta.fi/PON-MMR) is freely available as part of the PON-P service.

Table 3. Predicted Pathogenic and Neutral MMR Missense Variants

Pathogenic				Neutral
MLH1	MSH2	MSH6	PMS2	MLH1	MLH3	MSH2	MSH6	PMS2
A21E	Y43C	L435P	E705K	I32V	V420I	T8M	A20V	A182T
R27P	D49V	G566R	S815L	E53A	V741F	A72L	N21S	S445T
N38K	L93P	C765W	C843Y	S95A	P844L	V102I	A25S	P446S
G67E	N127I	L792P		R127K	V971I	R106K	P42S	S455A
G98R	L173R	C1158R		L135V		M141V	G54A	I462L
G98S	L175P			L166F		A189S	S65L	I462M
G101D	L310P			V213A		G203R	A81T	I462T
G101S	L310R			V213L		A207S	L147H	V467G
S106R	G338R			E320D		I216V	K185E	L468F
V113D	R359S			A353V		I237V	K187T	L468V
Y126N	L387P			T364A		K248E	E221D	R469I
G147R	Y408C			H381Y		N331D	N223S	P470S
I216S	L421P			L400V		P336S	I251V	E473V
L260R	L440P			K416E		V342I	T269S	S477F
L272S	V470E			D418E		N361S	G289D	H479Q
V303E	R524L			P435L		L390F	S315F	T485K
V384D	R524P			G454R		Q419K	A326V	D502E
A539D	R534C			M458K		T441P	F340S	I508L
Q542P	D603G			S459L		D487E	R361H	D510E
L559R	D603Y			K461N		G508S	R378K	T511A
F568I	H639Y			N468D		N547S	L396V	Y519C
L622P	C641G			D485H		S554T	I425V	A520V
G634R	G669R			R487Q		E561K	S532A	S523T
L636P	G669D			P496R		T564A	K610N	D526E
P640L	G669S			E515K		M592V	P623A	P540T
P640T	G669V			R522Q		N596S	R644S	N554H
F656S	P670L			E578G		Q629R	I669T	L571I
R659L	N671Y			A623S		A636V	E675D	A572T
W666R	G674R			N635S		T682A	Q698E	T573S
C680R	G674S			N645S		I770V	I725M	K581E
R725H	G683R			V647M		T803A	R761K	E583K
L749P	L687P			E668K		T807S	A787V	L585I
L749Q	M688R			L724M		N835H	V800A	S587D
	Q690E					S860L	V800L	S587T
	G692R					A870G	D803G	I590L
	G692V					T905I	P831A	L594F
	C697R					T905R	V878A	L594V
	D748Y					I930M	I886V	T597S
	G751R						F985L	M600I
	G827R						I1054F	M600L
							P1073R	I629L
							P1073S	E635N
							P1082L
							Y1128C
							E1163V
							M1202T
							E1254D
							R1304K
							E1310D
							S1329L

Features of Pathogenic and Neutral Missense Variants

The distributions of the mutated (original) and mutant amino acids in the functionally verified set of 168 cases are biased both for pathogenic and neutral MMR missense variants. Among the pathogenic variants (Supp. Table S1), glycine and leucine occur more frequently among the original amino acid residues, whereas arginine and proline are overrepresented among the mutant residues. Among neutral variants, alanine and isoleucine appear in excess among the original amino acids, while threonine and valine are overrepresented among the mutant residues.

Pathogenic MMR variants have more substitutions from leucine to proline (12 cases) and glycine to arginine (11 cases) while neutral MMR variants have more substitutions from isoleucine to valine (7 cases) and asparagine to serine (7 cases). The numbers are too small for statistical analysis; however, they are in line with general variation distribution [Thusberg et al., 2011].

Structural Effects of MSH2 and MSH6 Missense Variants

We were able to inspect the structural effects of MMR missense variants only on MSH2 and MSH6, because protein three-dimensional (3D) structures are known just for these two proteins. We investigated the effects of the predicted pathogenic and neutral variants based on the protein dimer structure and paid attention to the location of the original residue on the protein surface or core, localization in secondary structures, possible steric clashes of the substituted amino acid side chains, and effects on electrostatics.

Altogether, we studied 109 variants, of which 63 neutral ones were considered not to substantially affect the structure, for example, because the substitution is conservative or appears on the protein surface. One of the MSH2 variants, N547S, was predicted to be neutral although the residue participates in DNA binding and an alteration in it would be pathogenic. We concluded that at least 42 of 45 pathogenic variants (93%) may have a serious effect, due to the introduction of structural strain, decreased stability, missing interchain interactions, or changes to the DNA binding cleft (Fig. 1).

Figure 1. (A) MSH2-MSH6 protein dimer in PDB entry 2O8B with the positions of variants colored. MSH2 is in cyan and MSH6 in green. Variants predicted to be pathogenic are in red and neutral variants in yellow. Structure includes in addition a stretch of double stranded DNA in red. Examples of variation effects: (B) Variation of Y408 (green) to cysteine is likely harmful because the ionic interaction with E455 (yellow) in another α-helix is removed. (C) Substitution R524P (green) is considered as pathogenic because of the structure alteration and prevention of DNA recognition. (D) G692R (green) substitution appears in a tight turn. There is not sufficient space to fit the extended arginine side chain.

The locations of the predicted neutral and pathogenic variants, and some examples of effects are illustrated in the MSH2–MSH6 complex structure (Fig. 1). The structure is for a truncated version of MSH6, and thus only variants after sequence position 362 are visualized. As both chains contained some gaps, nine additional variants could not be studied at structural level.

Discussion

We classified MMR missense variants into pathogenic and neutral cases by utilizing a novel consensus predictor. First, we tested the performance of altogether 30 predictors in several categories including tolerance, stability, disorder, aggregation, interatomic contacts, and sequence conservation with 168 experimentally verified MMR variants. Only tolerance methods correlated with variant severity (i.e., pathogenicity). The methods had significant performance differences, for example, MCC varied from 0.36 to 0.61. The best individual method proved to be nsSNPAnalyzer; however, its performance was not considered sufficient. The novel method builds a consensus from the output of five tolerance methods and their reliability estimates. This method utilizes results from PhD-SNP, PolyPhen2, SIFT, SNAP, and SNPs&GO and classifies the variants as pathogenic, neutral, or UV. We did not include nsSNPAnalyzer in the new predictor as it cannot predict many of the variants due to missing 3D structure data for some of the MMR proteins in the ASTRAL database it uses. Previous studies indicated that the performance of tolerance [Thusberg et al., 2011] and protein stability [Khan and Vihinen, 2010] predictions varies significantly. With the new method, we were able to classify 81 variants as pathogenic and 167 as neutral, with 368 remaining as UVs. To the best of our knowledge, this is the largest bioinformatic effort to classify MMR missense variants.

The residue distribution among pathogenic and neutral MMR variants is biased. Residue alterations in the pathogenic variants include many substitutions to proline, which are generally pathogenic, because proline is a known protein secondary structure breaker. The probable reason for the high number of arginines among the mutated pathogenic residues is that four out of six codons for this amino acid contain the highly mutable CpG dinucleotide, a known mutational hotspot [Ollila et al., 1996]. Arginine substitutions remove the functionally important basic side chain. Another enriched amino acid among the pathogenic variants was glycine, which as the smallest amino acid appears frequently in tight turns where it cannot be replaced by any other residue. The observed amino acid substitution trends are consistent with those in protein secondary structures [Khan and Vihinen, 2007] and among known disease and benign variations [Thusberg et al., 2011].

PON-MMR classifies variants with the pathogenicity score higher than the upper cutoff value 0.7615 to be pathogenic and lower than the cutoff value 0.351 to be neutral, and those in between remain unclassified. This consensus prediction is calculated from the reliability and the probability of the prediction. Thus, we could not use the strict classification system that IARC recommends [Plon et al., 2008] for these variants.

As an independent study of the quality of the predictions, we investigated the effect on the protein structure of two proteins, MSH2 and MSH6, for which 3D structures have been determined. This study of the MSH2–MSH6 complex supported the predictions for 105 out of 109 variants. Of the remaining four variants, we could not reach a conclusion for three, and one appears in the DNA-binding site based on the structure, information that is not available to the predictors.

Numerous MMR missense variants have been identified from Lynch syndrome patients and investigated with experimental methods. In addition to the functional studies of missense, insertion, and deletion variants [Pagenstecher et al., 2006; Kansikas et al., 2011], the consequences of splicing in MMR genes have been studied [Betz et al., 2010]. PON-MMR was developed only for missense variants and, therefore, does not take into account other kinds of variants such as nonsense substitutions or mRNA splicing effects.

Some MMR missense variants have been classified previously with bioinformatic methods. Doss and Sethumadhavan [Doss and Sethumadhavan, 2009] predicted 125 MMR missense variants with SIFT, PolyPhen, and PupaSuite. Out of these, SIFT classified 22 and PolyPhen 40 variants as pathogenic. In addition, PupaSuite predicted the protein activity effects. They investigated MSH2 and MSH6 variants further based on protein structure. Chan et al. [Chan et al., 2007] classified 28 MLH1 and 14 MSH2 variants with SIFT, PolyPhen, and A-GVGD. They did not note major differences in the performance of the methods. In silico methods can be applied for the prioritization or evaluation of variants, for example, in whole-genome scans.

The effects of MLH1 variants that disturb the MLH1–PMS2 dimerization have been analyzed by examining protein expression, dimerization, MMR activity, and bioinformatic predictions [Kosinski et al., 2010]. Of 19 MLH1 variants, they classified 15 as pathogenic and 4 as UVs. Due to controversial results in literature, three variants, which they predicted to be pathogenic, were neutral in our evaluation data set. We based the classifications on the extensive functional data (for details, see section “Materials and Methods”). Six variants, which they predicted to be pathogenic, agree with our evaluation set. They predicted L749P and R755W to be pathogenic, while we classified them as UVs. Three variants, UV in their classification, were part of our neutral evaluation set. We both classified the variant D601G as UV. One of their variants was not a missense variant.

Chao et al. [Chao et al., 2008] have developed a classification system for MMR variants called MAPP-MMR. We used our evaluation set to estimate the performance of MAPP-MMR, which has been trained with only 24 pathogenic and 26 neutral variants. As the test set, we used 138 cases that had not been used for training and with which the PON-MMR cutoffs were optimized. MAPP-MMR had accuracy of 0.83, precision of 0.92, specificity of 0.88, sensitivity of 0.80, NPV of 0.71, and MCC of 0.65, placing its performance between PON-MMR and the tolerance predictors.

We compared the performance of MAPP-MMR and PON-P with cases for which both methods provided a prediction, either pathogenic or neutral. MAPP-MMR cannot predict all the instances in the dataset. Finally, there were 96 pathogenic or neutral variants in the test set. The methods agreed on the pathogenicity of 84 variants (45 were neutral and 39 pathogenic), of which 76 were correct predictions. All the cases predicted as pathogenic were correct, but 8 cases were predicted as neutral although the functional classification indicated them to be disease associated. PON-P was somewhat better than MAPP-MMR with cases on which the methods disagreed; further, it can predict all the test cases, unlike MAPP-MMR. The user interface of both PON-P and PON-MMR allows the submission of more than one case at a time and does not require manual picking of normal and variant amino acids, as is required by MAPP-MMR, which is provided on a commercial site. Further, in comparison to MAPP-MMR, PON-P provides instructions and explanation for predictions, features that are missing from MAPP-MMR.

We sampled the performance of generic PON-P, which is not optimized for MMR variants, with the evaluation set. Unlike other methods, PON-P provides a reliability measure, which can be utilized for evaluating the output. When the reliability parameter was increased from 0.90 to 0.99, the MCC increased from 0.63 to 0.79, indicating the good performance of the method. Still, the dedicated PON-MMR is better, as expected for a tool optimized for these proteins.

In silico methods have already been used [Kansikas et al., 2011; Plon et al., 2008] in combination with other methods for classifying MMR variants. PON-MMR could be used in these and similar UV classification schemes as one of the criteria for pathogenicity. The growing number of variants poses a need for more reliable prediction methods.

The PON-MMR consensus predictor was applied to classify over 600 MMR variants. This prioritization allows experimental scientists to concentrate on the most likely cases to verify the results. Results from PON-MMR or any other predictor or experimental method should not be used as the only evidence for pathogenicity. According to recent recommendations, at least two independent indications are needed to make a diagnosis [Kohonen-Corish et al., 2010]. PON-MMR can be applied to Lynch syndrome and other cancers where MMR variants are involved.

Supporting Information

References

Altschul S, Madden T, Schaffer A, Zhang J, Zhang Z, Miller W, Lipman D. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402.
10.1093/nar/25.17.3389
Adzhubei IA, Steffen S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, Kondrashov AS, Sunyaev SR. 2010. A method and server for predicting damaging missense mutations. Nat Methods 7: 248–249.
10.1038/nmeth0410-248
Bao L, Zhou M, Cui Y. 2005. nsSNPAnalyzer: identifying disease-associated nonsynonymous single nucleotide polymorphisms. Nucleic Acids Res 33: W480–482.
10.1093/nar/gki372
Betz B, Theiss S, Aktas M, Konermann C, Goecke T, Möslein G, Schaal H, Royer-Pokora B. 2010. Comparative in silico analyses and experimental validation of novel splice site and missense mutations in the genes MLH1 and MSH2. J Cancer Res Clin Oncol 136: 123–134.
10.1007/s00432-009-0643-z
Bromberg Y, Rost B. 2007. SNAP: predict effect of non-synonymous polymorphisms on function. Nucleic Acids Res 35: 3823–3835.
10.1093/nar/gkm238
Calabrese R, Capriotti E, Fariselli P, Martelli PL, Casadio R. 2009. Functional annotations improve the predictive score of human disease-related mutations in proteins. Hum Mutat 30: 1237–1244.
10.1002/humu.21047
Capriotti E, Calabrese R, Casadio R. 2006. Predicting the insurgence of human genetic diseases associated to single point protein mutations with support vector machines and evolutionary information. Bioinformatics 22: 2729–2734.
10.1093/bioinformatics/btl423
Capriotti E, Fariselli P, Casadio R. 2005. I-Mutant2.0: predicting stability changes upon mutation from the protein sequence or structure. Nucleic Acids Res 33: W306–W310.
10.1093/nar/gki375
Chan PA, Duraisamy S, Miller PJ, Newell JA, McBride C, Bond JP, Raevaara T, Ollila S, Nyström M, Grimm AJ, Christodoulou J, Oetting WS, Greenblatt MS. 2007. Interpreting missense variants: comparing computational methods in human disease genes CDKN2A, MLH1, MSH2, MECP2, and tyrosinase (TYR). Hum Mutat 28: 683–693.
10.1002/humu.20492
Chao EC, Velasquez JL, Witherspoon MSL, Rozek LS, Peel D, Ng P, Gruber SB, Watson P, Rennert G, Anton-Culver H, Lynch H, Lipkin SM. 2008. Accurate classification of MLH1/MSH2 missense variants with multivariate analysis of protein polymorphisms-mismatch repair (MAPP-MMR). Hum Mutat 29: 852–860.
10.1002/humu.20735
Cheng J, Randall A, Baldi P. 2006. Prediction of protein stability changes for single-site mutations using support vector machines. Proteins 62: 1125–1132.
10.1002/prot.20810
Cheng J, Sweredoski M, Baldi P. 2005. Accurate prediction of protein disordered regions by mining protein structure data. Data Min Knowl Discov 11: 213–222.
10.1007/s10618-005-0001-y
Chenna R, Sugawara H, Koike T, Lopez R, Gibson TJ, Higgins DG, Thompson JD. 2003. Multiple sequence alignment with the Clustal series of programs. Nucleic Acids Res 31: 3497–3500.
10.1093/nar/gkg500
Christensen L, Kariola R, Korhonen M, Wikman F, Sunde L, Gerdes A, Okkels H, Brandt C, Bernstein I, Hansen T, Hagemann-Madsen R, Andersen C, Nyström M, Ørntoft T. 2009. Functional characterization of rare missense mutations in MLH1 and MSH2 identified in Danish colorectal cancer patients. Fam Cancer 8: 489–500.
10.1007/s10689-009-9274-4
Conchillo-Sole O, de Groot N, Aviles F, Vendrell J, Daura X, Ventura S. 2007. AGGRESCAN: a server for the prediction and evaluation of “hot spots” of aggregation in polypeptides. BMC Bioinformatics 8: 65.
10.1186/1471-2105-8-65
Doss CG, Sethumadhavan R. 2009. Investigation on the role of nsSNPs in HNPCC genes—a bioinformatics approach. J Biomed Sci 16: 42.
10.1186/1423-0127-16-42
Dosztanyi Z, Csizmok V, Tompa P, Simon I. 2005. IUPred: Web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content. Bioinformatics 21: 3433–3434.
10.1093/bioinformatics/bti541
Dosztányi Z, Fiser A, Simon I. 1997. Stabilization centers in proteins: identification, characterization and predictions. J Mol Biol 272: 597–612.
10.1006/jmbi.1997.1242
Dosztányi Z, Magyar C, Tusnady G, Simon I. 2003. SCide: identification of stabilization centers in proteins. Bioinformatics 19: 899–900.
10.1093/bioinformatics/btg110
Drost M, Zonneveld JéB, van Dijk L, Morreau H, Tops CM, Vasen HF, Wijnen JT, de Wind N. 2010. A cell-free assay for the functional analysis of variants of the mismatch repair protein MLH1. Hum Mutat 31: 247–253.
10.1002/humu.21180
Ellison AR, Lofing J, Bitter GA. 2001. Functional analysis of human MLH1 and MSH2 missense variants and hybrid human-yeast MLH1 proteins in Saccharomyces cerevisiae. Hum Mol Genet 10: 1889–1900.
10.1093/hmg/10.18.1889
Fan Y, Wang W, Zhu M, Zhou J, Peng J, Xu L, Hua Z, Gao X, Wang Y. 2007. Analysis of hMLH1 missense mutations in East Asian patients with suspected hereditary nonpolyposis colorectal cancer. Clin Cancer Res 13: 7515–7521.
10.1158/1078-0432.CCR-07-1028
Ferrer-Costa C, Gelpi JL, Zamakola L, Parraga I, de la Cruz X, Orozco M. 2005. PMUT: a Web-based tool for the annotation of pathological mutations on proteins. Bioinformatics 21: 3176–3178.
10.1093/bioinformatics/bti486
Hampel H, Frankel WL, Martin E, Arnold M, Khanduja K, Kuebler P, Clendenning M, Sotamaa K, Prior T, Westman JA, Panescu J, Fix D, Lockman J, LaJeunesse J, Comeras I, de la Chapelle A. 2008. Feasibility of screening for Lynch syndrome among patients with colorectal cancer. J Clin Oncol 26: 5783–5788.
10.1200/JCO.2008.17.5950
Heinig M, Frishman D. 2004. STRIDE: a Web server for secondary structure assignment from known atomic coordinates of proteins. Nucleic Acids Res 32: W500–502.
10.1093/nar/gkh429
Ishida T, Kinoshita K. 2007. PrDOS: prediction of disordered protein regions from amino acid sequence. Nucleic Acids Res 35: W460-W464.
10.1093/nar/gkm363
Ishida T, Kinoshita K. 2008. Prediction of disordered regions in proteins based on the meta approach. Bioinformatics 24: 1344–1348.
10.1093/bioinformatics/btn195
Jäger AC, Rasmussen M, Bisgaard HC, Singh KK, Nielsen FC, Rasmussen LJ. 2001. HNPCC mutations in the human DNA mismatch repair gene hMLH1 influence assembly of hMutLalpha and hMLH1-hEXO1 complexes. Oncogene 20: 3590–3595.
10.1038/sj.onc.1204467
Kansikas M, Kariola R, Nyström M. 2011. Verification of the three-step model in assessing the pathogenicity of mismatch repair gene variants. Hum Mutat 32: 107–115.
10.1002/humu.21409
Kariola R, Hampel H, Frankel WL, Raevaara TE, de la Chapelle A, Nyström-Lahti M. 2004. MSH6 missense mutations are often associated with no or low cancer susceptibility. Br J Cancer 91: 1287–1292.
10.1038/sj.bjc.6602129
Khan S, Vihinen M. 2007. Spectrum of disease-causing mutations in protein secondary structures. BMC Struct Biol 7: 56.
10.1186/1472-6807-7-56
Khan S, Vihinen M. 2010. Performance of protein stability predictors. Hum Mutat 31: 675–684.
10.1002/humu.21242
Kohonen-Corish MRJ, Al-Aama JY, Auerbach AD, Axton M, Barash CI, Bernstein I, Béroud C, Burn J, Cunningham F, Cutting GR, den Dunnen JT, Greenblatt MS, Kaput J, Katz M, Lindblom A, Macrae F, Maglott D, Möslein G, Povey S, Ramesar R, Richards S, Seminara D, Sobrido M, Tavtigian S, Taylor G, Vihinen M, Winship I, Cotton RGH, on behalf of contributors to the Human Variome Project Meeting. 2010. How to catch all those mutations—the report of the third human variome project meeting, UNESCO Paris, May 2010. Hum Mutat 31: 1374–1381.
10.1002/humu.21379
Korhonen MK, Vuorenmaa E, Nyström M. 2008. The first functional study of MLH3 mutations found in cancer patients. Genes Chromosomes Cancer 47: 803–809.
10.1002/gcc.20581
Kosinski J, Hinrichsen I, Bujnicki JM, Friedhoff P, Plotz G. 2010. Identification of Lynch syndrome mutations in the MLH1-PMS2 interface that disturb dimerization and mismatch repair. Hum Mutat 31: 975–982.
10.1002/humu.21301
Li B, Krishnan VG, Mort ME, Xin F, Kamati KK, Cooper DN, Mooney SD, Radivojac P. 2009. Automated inference of molecular mechanisms of disease from amino acid substitutions. Bioinformatics 25: 2744–2750.
10.1093/bioinformatics/btp528
Linding R, Jensen LJ, Diella F, Bork P, Gibson TJ, Russell RB. 2003. Protein disorder prediction: implications for structural proteomics. Structure 11: 1453–1459.
10.1016/j.str.2003.10.002
Lynch HT, Lynch JF, Attard TA. 2009. Diagnosis and management of hereditary colorectal cancer syndromes: Lynch syndrome as a model. Can Med Assoc J 181: 273–280.
10.1503/cmaj.071574
Magyar C, Gromiha MM, Pujadas G, Tusnady GE, Simon I. 2005. SRide: a server for identifying stabilizing residues in proteins. Nucleic Acids Res 33: W303–305.
10.1093/nar/gki409
Ng PC, Henikoff S. 2003. SIFT: predicting amino acid changes that affect protein function. Nucleic Acids Res 31: 3812–3814.
10.1093/nar/gkg509
Nyström-Lahti M, Perrera C, Räschle M, Panyushkina-Seiler E, Marra G, Curci A, Quaresima B, Costanzo F, D'Urso M, Venuta S, Jiricny J. 2002. Functional analysis of MLH1 mutations linked to hereditary nonpolyposis colon cancer. Genes Chromosomes Cancer 33: 160–167.
10.1002/gcc.1225
Olatubosun, et al. 2012. Submitted.
Oliveberg M. 2010. Waltz, an exciting new move in amyloid prediction. Nat Methods 7: 187–188.
10.1038/nmeth0310-187
Ollila J, Lappalainen I, Vihinen M. 1996. Sequence specificity in CpG mutation hotspots. FEBS Lett 396: 119–122.
10.1016/0014-5793(96)01075-7
Ollila S, Fitzpatrick R, Sarantaus L, Kariola R, Ambus I, Velsher L, Hsieh E, Andersen MK, Raevaara TE, Gerdes A, Mangold E, Peltomäki P, Lynch HT, Nyström M. 2006a. The importance of functional testing in the genetic assessment of Muir–Torre syndrome, a clinical subphenotype of HNPCC. Int J Oncol 28: 149–153.
Ollila S, Sarantaus L, Kariola R, Chan P, Hampel H, Holinski-Feder E, Macrae F, Kohonen-Corish M, Gerdes A, Peltomäki P, Mangold E, de la Chapelle A, Greenblatt M, Nyström M. 2006b. Pathogenicity of MSH2 missense mutations is typically associated with impaired repair capability of the mutated protein. Gastroenterology 131: 1408–1417.
10.1053/j.gastro.2006.08.044
Opitz D, Shavlik J. 1996. Generating accurate and diverse members of a neural network ensemble. NIPS 8: 535–541.
Ou J, Rasmussen M, Westers H, Andersen SD, Jäger PO, Kooi KA, Niessen RC, Eggen BJL, Nielsen FC, Kleibeuker JH, Sijmons RH, Rasmussen LJ, Hofstra RMW. 2009. Biochemical characterization of MLH3 missense mutations does not reveal an apparent role of MLH3 in Lynch syndrome. Genes Chromosomes Cancer 48: 340–350.
10.1002/gcc.20644
Pagenstecher C, Wehner M, Friedl W, Rahner N, Aretz S, Friedrichs N, Sengteller M, Henn W, Buettner R, Propping P, Mangold E. 2006. Aberrant splicing in MLH1 and MSH2 due to exonic and intronic variants. Hum Genet 119: 9–22.
10.1007/s00439-005-0107-8
Peng K, Radivojac P, Vucetic S, Dunker AK, Obradovic Z. 2006. Length-dependent prediction of protein intrinsic disorder. BMC Bioinformatics 7: 208.
10.1186/1471-2105-7-208
Plon SE, Eccles DM, Easton D, Foulkes WD, Genuardi M, Greenblatt MS, Hogervorst FBL, Hoogerbrugge N, Spurdle AB, Tavtigian SV. 2008. Sequence variant classification and reporting: recommendations for improving the interpretation of cancer susceptibility genetic test results. Hum Mutat 29: 1282–1291.
10.1002/humu.20880
Raevaara TE, Gerdes AM, Lönnqvist KE, Tybjærg-Hansen A, Abdel-Rahman WM, Kariola R, Peltomäki P, Nyström-Lahti M. 2004. HNPCC mutation MLH1 P648S makes the functional protein unstable, and homozygosity predisposes to mild neurofibromatosis type 1. Genes Chromosomes Cancer 40: 261–265.
10.1002/gcc.20040
Raevaara TE, Korhonen MK, Lohi H, Hampel H, Lynch E, Lönnqvist KE, Holinski-Feder E, Sutter C, McKinnon W, Duraisamy S, Gerdes A, Peltomäki P, Kohonen-Corish M, Mangold E, MacRae F, Greenblatt M, de la Chapelle A, Nyström M. 2005. Functional significance and clinical phenotype of nontruncating mismatch repair variants of MLH1. Gastroenterology 129: 537–549.
10.1053/j.gastro.2005.06.005
Schrödinger LLC. 2010. The PyMOL molecular graphics system, Version 1.3r1.
Schwarz JM, Rodelsperger C, Schuelke M, Seelow D. 2010. MutationTaster evaluates disease-causing potential of sequence alterations. Nat Methods 7: 575–576.
10.1038/nmeth0810-575
Shen B, Vihinen M. 2003. RankViaContact: ranking and visualization of amino acid contacts. Bioinformatics 19: 2161–2162.
10.1093/bioinformatics/btg293
Shimizu K, Hirose S, Noguchi T. 2007. POODLE-S: Web application for predicting protein disorder by using physicochemical features and reduced amino acid set of a position-specific scoring matrix. Bioinformatics 23: 2337–2338.
10.1093/bioinformatics/btm330
Sobolev V, Sorokine A, Prilusky J, Abola E, Edelman M. 1999. Automated analysis of interatomic contacts in proteins. Bioinformatics 15: 327–332.
10.1093/bioinformatics/15.4.327
Sobolev V, Eyal E, Gerzon S, Potapov V, Babor M, Prilusky J, Edelman M. 2005. SPACE: a suite of tools for protein structure prediction and analysis based on complementarity and environment. Nucleic Acids Res 33: W39–43.
10.1093/nar/gki398
Takahashi M, Shimodaira H, Andreutti-Zaugg C, Iggo R, Kolodner RD, Ishioka C. 2007. Functional analysis of human MLH1 variants using yeast and in vitro mismatch repair assays. Cancer Res 67: 4595–4604.
10.1158/0008-5472.CAN-06-3509
Tavtigian SV, Greenblatt MS, Goldgar DE, Boffetta P. 2008. Assessing pathogenicity: overview of results from the IARC unclassified genetic variants working group. Hum Mutat 29: 1261–1264.
10.1002/humu.20903
Thusberg J, Olatubosun A, Vihinen M. 2011. Performance of mutation pathogenicity prediction methods on missense variants. Hum Mutat 32: 358–368.
10.1002/humu.21445
Thusberg J, Vihinen M. 2009. Pathogenic or not? And if so, then how? Studying the effects of missense mutations using bioinformatics methods. Hum Mutat 30: 703–714.
10.1002/humu.20938
Trojan J, Zeuzem S, Randolph A, Hemmerle C, Brieger A, Raedle J, Plotz G, Jiricny J, Marra G. 2002. Functional analysis of hMLH1 variants and HNPCC-related mutations using a human expression system. Gastroenterology 122: 211–219.
10.1053/gast.2002.30296
Ward JJ, Sodhi JS, McGuffin LJ, Buxton BF, Jones DT. 2004. Prediction and functional analysis of native disorder in proteins from the three kingdoms of life. J Mol Biol 337: 635–645.
10.1016/j.jmb.2004.02.002
Warren JJ, Pohlhaus TJ, Changela A, Iyer RR, Modrich PL, Beese L. 2007. Structure of the human MutSα DNA lesion recognition complex. Mol Cell 26: 579–592.
10.1016/j.molcel.2007.04.018
