Binding Activity Classification of Anti-SARS-CoV-2 Molecules using Deep Learning Across Multiple Assays

dc.contributor.authorYamasan, Bilge Eren
dc.contributor.authorKorkmaz, Selcuk
dc.date.accessioned2024-06-12T11:19:11Z
dc.date.available2024-06-12T11:19:11Z
dc.date.issued2024
dc.departmentTrakya Üniversitesien_US
dc.description.abstractBackground: The coronavirus disease 2019 (COVID-19) pandemic, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has urgently necessitated effective therapeutic solutions, with a focus on rapidly identifying and classifying potential small-molecule drugs. Given traditional methods' labor-intensive and time-consuming nature, deep learning has emerged as an essential tool for efficiently processing and extracting insights from complex biological data. Aims: To utilize deep learning techniques, particularly deep neural networks (DNN) enhanced with the synthetic minority oversampling technique (SMOTE), to enhance the classification of binding activities in anti-SARS-CoV-2 molecules across various bioassays. Methods: We used 11 bioassay datasets covering various SARS-CoV-2 interactions and inhibitory mechanisms. These assays ranged from spike-ACE2 protein-protein interaction to ACE2 enzymatic activity and 3CL enzymatic activity. To address the prevalent class imbalance in these datasets, the SMOTE technique was employed to generate new samples for the minority class. In our model-building approach, we divided the dataset into 80% training and 20% test sets, reserving 10% of the training set for validation. Our approach involved employing a DNN that integrates ReLU and sigmoid activation functions, incorporates batch normalization, and uses Adam optimization. The hyperparameters and architecture of the DNN were optimized through various tests on layers, minibatch sizes, epoch sizes, and learning rates. A 40% dropout rate was incorporated to mitigate overfitting. For model evaluation, we computed performance metrics, such as balanced accuracy (BACC), precision, recall, F1 score, Matthews' correlation coefficient (MCC), and area under the curve (AUC). Results: The performance of the DNN across 11 bioassay test sets revealed varying outcomes, significantly influenced by the ratios of active-to-inactive compounds.
Assays, such as AlphaLISA and CoV-PPE, demonstrated robust performance across various metrics, including BACC, precision, recall, and AUC, when configured with more balanced ratios (1:3 and 1:1, respectively). This suggests the effective identification of active compounds in both cases. In contrast, assays with higher imbalance ratios, such as 3CL (1:38) and cytopathic effect (1:15), demonstrated higher recall but lower precision, highlighting challenges in accurately identifying active compounds among numerous inactive compounds. However, even in these challenging settings, the model achieved favorable BACC and recall scores. Overall, the DNN model generally performed well, as indicated by the BACC, MCC, and AUC values, especially when considering the degree of dataset imbalance in each assay. Conclusion: This study demonstrates the significant impact of deep learning, particularly DNN models enhanced with SMOTE, in improving the identification of active compounds in bioassay datasets for COVID-19 drug discovery, outperforming traditional machine learning models. Furthermore, this study highlights the efficacy of advanced computational techniques in addressing high-throughput screening data imbalances.en_US
dc.identifier.doi10.4274/balkanmedj.galenos.2024.2024-1-73
dc.identifier.endpage192en_US
dc.identifier.issn2146-3123
dc.identifier.issn2146-3131
dc.identifier.issue3en_US
dc.identifier.pmid38462979en_US
dc.identifier.scopus2-s2.0-85192113384en_US
dc.identifier.scopusqualityQ3en_US
dc.identifier.startpage186en_US
dc.identifier.urihttps://doi.org/10.4274/balkanmedj.galenos.2024.2024-1-73
dc.identifier.urihttps://hdl.handle.net/20.500.14551/25096
dc.identifier.volume41en_US
dc.identifier.wosWOS:001215913000012en_US
dc.identifier.wosqualityN/Aen_US
dc.indekslendigikaynakWeb of Scienceen_US
dc.indekslendigikaynakScopusen_US
dc.indekslendigikaynakPubMeden_US
dc.language.isoenen_US
dc.publisherGalenos Publ Houseen_US
dc.relation.ispartofBalkan Medical Journalen_US
dc.relation.publicationcategoryArticle - International Refereed Journal - Institutional Faculty Memberen_US
dc.rightsinfo:eu-repo/semantics/openAccessen_US
dc.subjectInhibitorsen_US
dc.titleBinding Activity Classification of Anti-SARS-CoV-2 Molecules using Deep Learning Across Multiple Assaysen_US
dc.typeArticleen_US
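
Note on the evaluation metrics named in the abstract: the sketch below shows one way BACC, precision, recall, F1, and MCC could be computed from confusion-matrix counts. It is not the authors' code. The function name `classification_metrics` and the counts in the usage example are hypothetical, chosen only to mimic a 1:38 active-to-inactive ratio, and AUC is omitted because it needs ranked prediction scores rather than counts.

```python
from math import sqrt


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute threshold-based metrics from confusion-matrix counts.

    Hypothetical helper for illustration; zero denominators yield 0.0.
    """
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # true negative rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1_denom = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denom if f1_denom else 0.0
    mcc_denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0
    return {
        "bacc": (recall + specificity) / 2,  # balanced accuracy
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Made-up counts: 100 actives and 3800 inactives (a 1:38 ratio),
    # with 80% of actives and 90% of inactives classified correctly.
    metrics = classification_metrics(tp=80, fp=380, tn=3420, fn=20)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

With these made-up counts, recall is 0.80 and balanced accuracy is 0.85, yet precision is only about 0.17, because false positives drawn from the much larger inactive class outnumber the true positives. This is the arithmetic behind the high-recall, low-precision pattern the abstract reports for the most imbalanced assays.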
