Graphs: Flexible Representations of Molecular Structures and Biological Networks
Milind Misra
Advanced Device Technologies, Sandia National Laboratories, Albuquerque, New Mexico, USA
Shawn Martin
Computer Science and Informatics, Sandia National Laboratories, Albuquerque, New Mexico, USA
Jean-Loup Faulon
Institute of Systems & Synthetic Biology, CNRS, University of Evry, France
Rajarshi Guha
NIH Chemical Genomics Center, Rockville, Maryland, USA
Andreas Bender
Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK
Summary
This chapter contains sections titled:
- Introduction
- Chemical Signature: Molecular Design and QSAR/QSPR
- Protein Signature: Prediction of Protein–Protein Interactions
- Protein–Chemical Signature: Predicting Enzyme–Metabolite and Drug–Target Interactions
- Conclusions
- References