Volume 46, Issue 1 e27514
RESEARCH ARTICLE

Pre-training strategy for antiviral drug screening with low-data graph neural network: A case study in HIV-1 K103N reverse transcriptase

Kajjana Boonpalit

School of Information Science and Technology, Vidyasirimedhi Institute of Science and Technology (VISTEC), Rayong, Thailand

Hathaichanok Chuntakaruk

Program in Bioinformatics and Computational Biology, Graduate School, Chulalongkorn University, Bangkok, Thailand

Center of Excellence in Structural and Computational Biology, Faculty of Science, Chulalongkorn University, Bangkok, Thailand

Center for Artificial Intelligence in Medicine, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand

Jiramet Kinchagawat

School of Information Science and Technology, Vidyasirimedhi Institute of Science and Technology (VISTEC), Rayong, Thailand

CARIVA (Thailand) Company Ltd, Bangkok, Thailand

Peter Wolschann

Department of Theoretical Chemistry, University of Vienna, Vienna, Austria

Supot Hannongbua

Program in Bioinformatics and Computational Biology, Graduate School, Chulalongkorn University, Bangkok, Thailand

Center of Excellence in Computational Chemistry (CECC), Department of Chemistry, Faculty of Science, Chulalongkorn University, Bangkok, Thailand

Thanyada Rungrotmongkol

Corresponding Author

Program in Bioinformatics and Computational Biology, Graduate School, Chulalongkorn University, Bangkok, Thailand

Center of Excellence in Structural and Computational Biology, Faculty of Science, Chulalongkorn University, Bangkok, Thailand

Correspondence

Thanyada Rungrotmongkol, Program in Bioinformatics and Computational Biology, Graduate School, Chulalongkorn University, 10330, Bangkok, Thailand.

Email: [email protected]

Sarana Nutanong, School of Information Science and Technology, Vidyasirimedhi Institute of Science and Technology (VISTEC), 21210, Rayong, Thailand.

Email: [email protected]

Sarana Nutanong

Corresponding Author

School of Information Science and Technology, Vidyasirimedhi Institute of Science and Technology (VISTEC), Rayong, Thailand

First published: 22 October 2024
Citations: 3

Kajjana Boonpalit and Hathaichanok Chuntakaruk contributed equally to this study.

Abstract

Graph neural networks (GNNs) offer an alternative approach to boost screening effectiveness in drug discovery. However, their efficacy is often hindered by limited datasets. To address this limitation, we introduced a robust GNN training framework and applied it to various chemical databases to identify potent non-nucleoside reverse transcriptase inhibitors (NNRTIs) against the challenging K103N-mutated HIV-1 reverse transcriptase (RT). Leveraging self-supervised learning (SSL) pre-training to tackle data scarcity, we screened 1,824,367 compounds using a multi-step approach that incorporated machine learning (ML)-based screening, absorption, distribution, metabolism, and excretion (ADME) prediction, drug-likeness analysis, and molecular docking. Ultimately, 45 compounds remained as potential candidates, 17 of which had previously been identified as NNRTIs, demonstrating the model's efficacy. The remaining 28 compounds are anticipated to be candidates for repurposing. Molecular dynamics (MD) simulations on the repurposed candidates revealed two promising preclinical drugs: one designed against Plasmodium falciparum and the other serving as an antibacterial agent. Both show superior binding affinity compared to anti-HIV drugs. This conceptual framework could be adapted for other disease-specific therapeutics, facilitating the identification of potent compounds effective against both wild-type (WT) and mutants while revealing novel scaffolds for drug design and discovery.
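
The multi-step screening described in the abstract can be pictured as a funnel in which each stage discards compounds that fail a criterion. The following is only a minimal sketch of that idea, not the authors' pipeline: the `Compound` record, the stage names, the 0.5 score cutoff, and the -9.0 kcal/mol docking cutoff are all hypothetical, and Lipinski's rule of five is used here as an assumed stand-in for the unspecified drug-likeness criteria. The ADME stage is omitted for brevity, and every descriptor is assumed to be precomputed by other tools.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Compound:
    """Hypothetical record; all fields are assumed to be precomputed elsewhere."""
    name: str
    ml_score: float       # assumed GNN-predicted activity probability in [0, 1]
    mol_weight: float     # molecular weight in g/mol
    logp: float           # octanol-water partition coefficient
    h_donors: int         # hydrogen-bond donors
    h_acceptors: int      # hydrogen-bond acceptors
    docking_score: float  # kcal/mol, more negative means stronger predicted binding


def passes_rule_of_five(c: Compound) -> bool:
    """Lipinski's rule of five, tolerating at most one violation."""
    violations = sum([
        c.mol_weight > 500,
        c.logp > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
    ])
    return violations <= 1


Stage = Tuple[str, Callable[[Compound], bool]]


def run_funnel(compounds: Iterable[Compound], stages: List[Stage]):
    """Apply each stage in order and record how many compounds survive it."""
    survivors = list(compounds)
    counts = [("input", len(survivors))]
    for label, keep in stages:
        survivors = [c for c in survivors if keep(c)]
        counts.append((label, len(survivors)))
    return survivors, counts


# Illustrative thresholds only; none of these values come from the paper.
STAGES: List[Stage] = [
    ("ml_screen", lambda c: c.ml_score >= 0.5),
    ("drug_likeness", passes_rule_of_five),
    ("docking", lambda c: c.docking_score <= -9.0),
]

# Toy library with made-up descriptor values.
library = [
    Compound("cpd_A", 0.9, 350.0, 3.0, 2, 5, -10.5),
    Compound("cpd_B", 0.2, 300.0, 2.0, 1, 4, -11.0),
    Compound("cpd_C", 0.8, 650.0, 6.2, 2, 5, -11.0),
    Compound("cpd_D", 0.7, 420.0, 4.0, 1, 6, -6.0),
]

hits, counts = run_funnel(library, STAGES)
for label, n in counts:
    print(f"{label}: {n} compounds remaining")
```

Ordering the stages from cheapest to most expensive is a common design choice in such funnels, since costly steps such as docking then run only on the compounds that survived the earlier filters.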

CONFLICT OF INTEREST STATEMENT

The authors declare no conflicts of interest.

DATA AVAILABILITY STATEMENT

The GIN models, pre-training dataset, and downstream dataset used in this work are available in our GitHub repository at https://github.com/kajjana/HIV-SSL. The code accompanying this work is taken from the GitHub repositories of Ref. 11 (https://github.com/snap-stanford/pretrain-gnns) and Ref. 29 (https://github.com/yuyangw/MolCLR).
