Graphs: Flexible Representations of Molecular Structures and Biological Networks

Milind Misra

Advanced Device Technologies, Sandia National Laboratories, Albuquerque, New Mexico, USA

Shawn Martin

Computer Science and Informatics, Sandia National Laboratories, Albuquerque, New Mexico, USA

Jean-Loup Faulon

Institute of Systems & Synthetic Biology, CNRS, University of Evry, France

Book Editor(s):

Rajarshi Guha

NIH Chemical Genomics Center, Rockville, Maryland, USA

Andreas Bender

Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK

First published: 14 November 2011

https://doi.org/10.1002/9781118131411.ch6

Summary

This chapter contains sections titled:

Introduction
Chemical Signature: Molecular Design and QSAR/QSPR
Protein Signature: Prediction of Protein–Protein Interactions
Protein–Chemical Signature: Predicting Enzyme–Metabolite and Drug–Target Interactions
Conclusions
References

Computational Approaches in Cheminformatics and Bioinformatics