A parameterized, continuum electrostatic model for predicting protein pKa values
Steven K. Burger
Department of Chemistry and Chemical Biology, McMaster University, 1280 Main St. West, Hamilton, Ontario L8S 4L8, Canada
Corresponding Author
Paul W. Ayers
Department of Chemistry and Chemical Biology, McMaster University, 1280 Main St. West, Hamilton, Ontario L8S 4L8, Canada
Abstract
Recognizing the limits of trying to achieve chemical accuracy for pKa calculations with a purely electrostatic model, we add empirical corrections to the Poisson–Boltzmann solver Macroscopic Electrostatics with Atomic Detail (MEAD; Bashford, Biochemistry 1990;29:10219–10225) to improve the reliability and accuracy of the model. The total number of parameters is kept to a minimum to maximize the robustness of the model for compounds outside the fitting dataset. The parameters are based on: (a) the electrostatic interaction between functional groups close to the titratable site, (b) the electrostatic work required to desolvate the residue, and (c) the site-to-site interactions. These interactions are straightforward to calculate once the electrostatic field has been solved for each residue using the linearized Poisson–Boltzmann equation, and they are assumed to be linearly related to the intrinsic pKa. Two hundred and eighty-six residues from 30 proteins are used to determine the empirical parameters, which yield a root mean square error (RMSE) of 0.70 for the entire set. Eight proteins with 46 experimentally known values were excluded from the parameterization to test the model. This test set had an RMSE of 1.08. We show that the parameterized model improves on the results of other models, although, as with other models, the error is strongly correlated with the degree to which a residue is buried. The parameters themselves indicate that local effects are the most important in determining the pKa, whereas site-to-site interactions are found to be less significant.
Proteins 2011; © 2011 Wiley-Liss, Inc.
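The fitting and validation workflow described in the abstract (terms assumed to enter linearly, parameters fitted on one set of residues, RMSE reported on a held-out set) can be illustrated with a minimal sketch. This is not the authors' implementation: the three descriptors stand in for the local, desolvation, and site-to-site terms, the data are synthetic, and the function names and the "true" coefficients are hypothetical. Only the set sizes (286 and 46 residues) are taken from the abstract.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_linear_correction(features, targets):
    """Least-squares fit of target ~ c0 + sum_k c_k * feature_k (normal equations)."""
    rows = [[1.0] + list(f) for f in features]  # prepend intercept column
    p = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(p)]
    return solve_linear_system(ata, atb)


def predict(coeffs, feature):
    """Evaluate the fitted linear model for one residue."""
    return coeffs[0] + sum(c * f for c, f in zip(coeffs[1:], feature))


def rmse(predicted, observed):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(
        sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)
    )


def make_synthetic_residues(n, seed, noise_sd=0.5):
    """Hypothetical data: three descriptors per residue and a noisy pKa shift."""
    rng = random.Random(seed)
    true_coeffs = [0.2, 0.9, -0.6, 0.1]  # made-up values, for illustration only
    features = [tuple(rng.gauss(0.0, 1.5) for _ in range(3)) for _ in range(n)]
    targets = [predict(true_coeffs, f) + rng.gauss(0.0, noise_sd) for f in features]
    return features, targets


# Fit on a synthetic "training" set and evaluate on a separate held-out set.
train_x, train_y = make_synthetic_residues(286, seed=1)
test_x, test_y = make_synthetic_residues(46, seed=2)
coeffs = fit_linear_correction(train_x, train_y)
print("fitted coefficients:", [round(c, 3) for c in coeffs])
print("training RMSE:", round(rmse([predict(coeffs, f) for f in train_x], train_y), 2))
print("held-out RMSE:", round(rmse([predict(coeffs, f) for f in test_x], test_y), 2))
```

Keeping the number of fitted coefficients small, as the abstract emphasizes, limits how far the training and held-out errors can drift apart; the relative magnitudes of the fitted coefficients are what would indicate which term contributes most.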