Review

Approaches to Measure Chemical Similarity – a Review

Nina Nikolova

Procter and Gamble, Eurocor, Central Product Safety, 100 Temselaan, B-1853 Strombeek-Bever, Belgium, Fax 32 2 5683098, Tel 32 2 456 2076, Tel 32 2 456 801

Joanna Jaworska

Procter and Gamble, Eurocor, Central Product Safety, 100 Temselaan, B-1853 Strombeek-Bever, Belgium, Fax 32 2 5683098, Tel 32 2 456 2076, Tel 32 2 456 801

First published: 23 January 2004

https://doi.org/10.1002/qsar.200330831

Citations: 324

Abstract

Although the concept of similarity is convenient for humans, a formal definition of similarity between chemical compounds is needed to enable automatic decision-making. The objective of similarity measures in toxicology and drug design is to allow assessment of chemical activities. The ideal similarity measure should be relevant to the activity of interest. This relevance could be established by exploiting knowledge about the fundamental chemical and biological processes responsible for the activity. Unfortunately, this knowledge is rarely available, and therefore different approximations have been developed based on similarity between structures or descriptor values. Various methods are reviewed, ranging from two-dimensional, three-dimensional and field approaches to recent methods based on “Atoms in Molecules” theory. All these methods attempt to describe chemical compounds by a set of numerical values and define some means of comparison between them. The review analyzes two potential pitfalls of this methodology: the loss of information in the representations of molecular structures, and the relevance of a particular representation and chosen similarity measure to the activity. A brief review of known methods for descriptor selection is also provided. The popular “neighborhood behavior” principle is criticized, since proximity with respect to descriptors does not necessarily mean proximity with respect to activity. Structural similarity should also be used with care, as it does not always imply similar activity, as shown by examples. We remind the reader that similarity measures and classification techniques based on distances rely on certain assumptions about the data distribution. If these assumptions are not satisfied for a given dataset, the results could be misleading. A discussion of similarity in descriptor space in the context of applicability domain assessment of QSAR models is also provided.
Finally, it is shown that descriptor-based similarity analysis is prone to errors if the relationship between the activity and the descriptors has not been previously established. A justification for the use of a particular similarity measure should be provided for every specific activity, either by expert knowledge or derived by data modeling techniques.
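
As an illustration of what "a set of numerical values and some means of comparison" can look like in practice, the following minimal Python sketch computes two commonly used measures: the Tanimoto coefficient on binary fingerprints and the Euclidean distance on descriptor vectors. It is not taken from the review. The function names, the fingerprints (given as sets of "on" bit positions) and the descriptor values are all hypothetical and chosen only for the example.

```python
from math import sqrt
from typing import AbstractSet, Sequence


def tanimoto(a: AbstractSet[int], b: AbstractSet[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bits."""
    if not a and not b:
        # Convention assumed here: two empty fingerprints count as identical.
        return 1.0
    return len(a & b) / len(a | b)


def euclidean(x: Sequence[float], y: Sequence[float]) -> float:
    """Euclidean distance between two descriptor vectors of equal length."""
    if len(x) != len(y):
        raise ValueError("descriptor vectors must have the same length")
    return sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


# Hypothetical fingerprints: positions of the bits that are set.
fp_a = {1, 4, 7, 9}
fp_b = {1, 4, 8, 9}
fp_c = {2, 3}

# Hypothetical descriptor vectors (arbitrary, unscaled values).
desc_a = [0.0, 0.0]
desc_b = [3.0, 4.0]

print(tanimoto(fp_a, fp_b))       # 3 shared bits out of 5 distinct bits
print(tanimoto(fp_a, fp_c))       # no shared bits
print(euclidean(desc_a, desc_b))  # distance in descriptor space
```

Neither number says anything about biological activity by itself. In line with the argument of the abstract, a high Tanimoto value or a small descriptor distance is informative only if the chosen representation has been shown to be relevant to the activity of interest.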

Volume 22, Issue 9-10

January 2004

Pages 1006-1026
