Chemical Descriptors Are More Important Than Learning Algorithms for Modelling
Corresponding Author
S. Stanley Young
National Institute of Statistical Sciences, 19 T. W. Alexander Drive, P. O. Box 14006, Research Triangle Park, NC 27709-4006, USA
Fei Yuan
Population Health Research Institute, McMaster Clinic, Hamilton General Hospital, 237 Barton Street East, Hamilton, Ontario, Canada L8L 2X2
Mu Zhu
Department of Statistics and Actuarial Science, University of Waterloo, 200 University Avenue West, Waterloo, Ontario, Canada N2L 3G1