Protein secondary structure assignment through Voronoï tessellation
Franck Dupuis
Laboratoire de Minéralogie Cristallographie Paris, CNRS UMR 7590, Universités Paris 6 et 7, Paris, France
Jean-François Sadoc
Laboratoire de Physique des Solides, CNRS UMR 8502, Université Paris 11, Orsay, France
Corresponding Author
Jean-Paul Mornon
Laboratoire de Minéralogie Cristallographie Paris, CNRS UMR 7590, Universités Paris 6 et 7, Paris, France
Laboratoire de Minéralogie Cristallographie, Universités Paris 6 et 7, case 115, 4 place Jussieu, 75252, Paris, France
Abstract
We present a new automatic algorithm, VoTAP (Voronoï Tessellation Assignment Procedure), which assigns the secondary structures of a polypeptide chain from the list of α-carbon coordinates. The program uses three-dimensional Voronoï tessellation, a geometrical tool that associates each amino acid with a Voronoï polyhedron whose faces unambiguously define contacts between residues. For contacts between residues close together along the primary structure (low-order contacts), the face area is used to distinguish strong contacts from normal ones. This new definition yields new contact matrices, which are analyzed and used to assign secondary structures. The assignment is performed in two stages. The first uses the low-order contacts and is based on data collected from a bank of 282 well-refined, nonredundant structures; in this bank, the prints defined by the low-order contacts were associated with the assignments made by different automatic methods. The second stage focuses on strand assignment and uses contacts between distant residues. Comparisons with several other automatic assignment methods are presented, and the influence of resolution on the assignment is investigated. Proteins 2004. © 2004 Wiley-Liss, Inc.
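The contact definition described in the abstract can be illustrated with a minimal sketch. It is not the VoTAP program: it assumes NumPy and SciPy are available, uses `scipy.spatial.Voronoi` as a stand-in tessellation routine, and simply skips unbounded faces, since the abstract does not say how surface residues are treated. The function names and the parameters `max_separation` and `strong_area` are hypothetical placeholders; the abstract gives no numerical cutoffs.

```python
import numpy as np
from scipy.spatial import Voronoi


def polygon_area(vertices):
    """Area of a planar convex polygon in 3D (vertices in any order)."""
    rel = vertices - vertices.mean(axis=0)
    # The two leading right-singular vectors span the plane of the face.
    _, _, vt = np.linalg.svd(rel)
    x, y = rel @ vt[0], rel @ vt[1]
    # Order the vertices by angle around the centroid, then apply the
    # shoelace formula in the 2D plane coordinates.
    order = np.argsort(np.arctan2(y, x))
    x, y = x[order], y[order]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))


def voronoi_contacts(ca_coords):
    """Map each residue pair (i, j), i < j, sharing a bounded Voronoi face
    to the area of that face."""
    vor = Voronoi(np.asarray(ca_coords, dtype=float))
    contacts = {}
    for (i, j), ridge in zip(vor.ridge_points, vor.ridge_vertices):
        if -1 in ridge:
            # Face extends to infinity (surface residue): skipped here,
            # which is an assumption of this sketch.
            continue
        key = (int(min(i, j)), int(max(i, j)))
        contacts[key] = polygon_area(vor.vertices[ridge])
    return contacts


def classify_low_order(contacts, max_separation, strong_area):
    """Label contacts between sequence neighbours as 'strong' or 'normal'.

    max_separation and strong_area are hypothetical cutoffs chosen by the
    caller; they are not values taken from the paper.
    """
    labels = {}
    for (i, j), area in contacts.items():
        if j - i <= max_separation:
            labels[(i, j)] = "strong" if area >= strong_area else "normal"
    return labels


if __name__ == "__main__":
    # Toy coordinates: one central point surrounded by six axial points.
    toy = [[0, 0, 0], [2, 0, 0], [-2, 0, 0],
           [0, 2, 0], [0, -2, 0], [0, 0, 2], [0, 0, -2]]
    contacts = voronoi_contacts(toy)
    for pair, area in sorted(contacts.items()):
        print(pair, round(area, 3))
    print(classify_low_order(contacts, max_separation=3, strong_area=3.0))
```

In practice `ca_coords` would be the N × 3 array of α-carbon positions in sequence order, so that the index difference `j - i` is the separation along the primary structure. The resulting dictionary is a sparse form of the contact matrix mentioned above.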