Extracting representative structures from protein conformational ensembles
Alberto Perez
Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York
Arijit Roy
Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York
Koushik Kasavajhala
Department of Chemistry, Stony Brook University, Stony Brook, New York
Amy Wagaman
Department of Mathematics and Statistics, Amherst College, Massachusetts
Ken A. Dill
Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York
Department of Chemistry, Stony Brook University, Stony Brook, New York
Department of Physics, Stony Brook University, Stony Brook, New York
Corresponding Author
Justin L. MacCallum
Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York
Department of Chemistry, University of Calgary, Alberta, Canada
Correspondence to: Justin L. MacCallum, Department of Chemistry, University of Calgary, Alberta, Canada. E-mail: [email protected]
ABSTRACT
A large number of methods generate conformational ensembles of biomolecules. Often one structure is selected to be representative of the whole ensemble, usually by clustering and selecting the structure closest to the center of the most populated cluster. We find that this structure is not necessarily the best representation of the cluster and present here two computationally inexpensive averaging protocols that can systematically provide better representations of the system, which can be more directly compared with structures from X-ray crystallography. In practice, systematic errors in the generated conformational ensembles appear to limit the maximum improvement of averaging methods. Proteins 2014; 82:2671–2680. © 2014 Wiley Periodicals, Inc.
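The contrast drawn in the abstract, between picking the member closest to the cluster center and averaging over the cluster, can be illustrated with a small sketch. This is not the authors' protocol, which the abstract does not specify. It is a toy example under stated assumptions: the frames are already superposed, the ensemble is synthetic Gaussian noise around a known "native" structure, clustering is a simple neighbour count with a hypothetical RMSD cutoff, and the average is a plain Cartesian mean. All function names are illustrative.

```python
import numpy as np


def pairwise_rmsd(coords):
    """RMSD between every pair of frames; coords has shape (n_frames, n_atoms, 3).

    Assumes the frames are already superposed on a common reference.
    """
    diff = coords[:, None, :, :] - coords[None, :, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1).mean(axis=-1))


def most_populated_cluster(rmsd, cutoff):
    """Indices of frames within `cutoff` of the frame that has the most neighbours."""
    neighbours = rmsd <= cutoff
    seed = int(neighbours.sum(axis=1).argmax())
    return np.flatnonzero(neighbours[seed])


def medoid_index(rmsd, members):
    """Cluster member with the smallest summed RMSD to the other members."""
    sub = rmsd[np.ix_(members, members)]
    return int(members[sub.sum(axis=1).argmin()])


def average_structure(coords, members):
    """Plain Cartesian mean over the cluster members (no geometry clean-up)."""
    return coords[members].mean(axis=0)


def rmsd_to(a, b):
    """RMSD between two pre-superposed structures of shape (n_atoms, 3)."""
    return float(np.sqrt(((a - b) ** 2).sum(axis=-1).mean()))


def make_toy_ensemble(seed=0):
    """Synthetic ensemble: 40 noisy copies of a 'native' and 10 of a decoy."""
    rng = np.random.default_rng(seed)
    native = rng.normal(scale=5.0, size=(50, 3))
    decoy = native + rng.normal(scale=3.0, size=(50, 3))
    major = native + rng.normal(scale=0.5, size=(40, 50, 3))
    minor = decoy + rng.normal(scale=0.5, size=(10, 50, 3))
    return np.concatenate([major, minor]), native


coords, native = make_toy_ensemble()
rmsd = pairwise_rmsd(coords)
members = most_populated_cluster(rmsd, cutoff=2.0)  # cutoff is an assumed value
medoid = medoid_index(rmsd, members)
average = average_structure(coords, members)

print("cluster size:", len(members))
print("medoid  RMSD to native:", round(rmsd_to(coords[medoid], native), 3))
print("average RMSD to native:", round(rmsd_to(average, native), 3))
```

In this toy setting the noise is unbiased, so averaging cancels much of it and the mean sits closer to the underlying structure than any single member. A real ensemble may carry systematic error that averaging cannot remove, which is consistent with the limitation noted in the abstract, and a plain Cartesian mean can also distort local geometry, so it would normally need further clean-up before use.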
Supporting Information
Additional Supporting Information may be found in the online version of this article.
Filename | Description |
---|---|
prot24633-sup-0001-suppinfo.docx (1.4 MB) | Supplementary Information |
Please note: The publisher is not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing content) should be directed to the corresponding author for the article.