Application of nonlinear dimensionality reduction to characterize the conformational landscape of small peptides
Hernán Stamati
Department of Computer Science, Rice University, Houston, Texas 77005
Corresponding Author
Cecilia Clementi
Department of Chemistry, Rice University, Houston, Texas 77005
Structural and Computational Biology and Molecular Biophysics, Baylor College of Medicine, Houston, Texas 77030
Cecilia Clementi, Department of Chemistry, Rice University, 6100 Main Street, MS-60, Houston, TX 77005
Lydia E. Kavraki, Department of Computer Science, Rice University, 6100 Main Street, MS-132, Houston, TX 77005
Corresponding Author
Lydia E. Kavraki
Department of Computer Science, Rice University, Houston, Texas 77005
Structural and Computational Biology and Molecular Biophysics, Baylor College of Medicine, Houston, Texas 77030
Department of Bioengineering, Rice University, Houston, Texas 77005
Abstract
A recurring problem in computational molecular biophysics is the automatic reduction of the wealth of molecular configurations gathered in simulation to a few coordinates that help explain the main states and transitions of the system. We use the recently proposed ScIMAP algorithm to automatically extract motion parameters from simulation data. The procedure uses only molecular shape similarity and topology information inferred directly from the simulated conformations, and is not biased by a priori known information. The automatically recovered coordinates prove to be excellent reaction coordinates for the molecules studied; they can be used to identify stable states and transitions, and as a basis for building free-energy surfaces. They describe the free-energy landscape better than coordinates computed using principal components analysis, the most popular linear dimensionality reduction technique. The method is first validated on the dynamics of an all-atom model of alanine dipeptide, where it successfully recovers all previously known metastable states. When applied to the simulated folding of a coarse-grained model of a β-hairpin, it reveals, in addition to the folded and unfolded states, two symmetric misfolding crossings of the hairpin strands, together with the most likely transitions from one to the other. Proteins 2010. © 2009 Wiley-Liss, Inc.
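As a rough illustration of the linear baseline mentioned above, the following sketch projects sampled conformations onto their first two principal components and estimates a free-energy surface from the population of each grid cell, using the standard relation F = -kT ln P for unbiased equilibrium samples. It is not the ScIMAP procedure. The function names `pca_project` and `free_energy_surface`, the bin count, the value of kT, and the synthetic two-state data are all assumptions made for illustration and do not come from the paper.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project rows of X (n_samples x n_features) onto the top principal components."""
    Xc = X - X.mean(axis=0)
    # SVD of the centered data; rows of Vt are principal directions,
    # ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def free_energy_surface(coords, bins=30, kT=1.0):
    """Estimate F = -kT ln P on a 2D grid (assumes unbiased equilibrium sampling).

    Empty bins are assigned +inf; the finite minimum is shifted to zero.
    """
    H, xedges, yedges = np.histogram2d(coords[:, 0], coords[:, 1], bins=bins)
    P = H / H.sum()
    with np.errstate(divide="ignore"):
        F = -kT * np.log(P)
    F -= F[np.isfinite(F)].min()
    return F, xedges, yedges

# Hypothetical stand-in for simulation output: two Gaussian "states" in 6 dimensions.
rng = np.random.default_rng(0)
state_a = rng.normal(loc=0.0, scale=0.3, size=(2000, 6))
state_b = rng.normal(loc=2.0, scale=0.3, size=(1000, 6))
X = np.vstack([state_a, state_b])

Y = pca_project(X)                          # low-dimensional coordinates
F, xedges, yedges = free_energy_surface(Y)  # free energy in units of kT
```

With real trajectories, `X` would hold aligned conformations flattened to one row per frame; a nonlinear method would replace `pca_project` while the histogram step stays the same.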