Multiple Instance Learning Based on Mol2vec Molecular Substructure Embeddings for Discovery of NDM-1 Inhibitors - LIRMM - Laboratoire d’Informatique, de Robotique et de Microélectronique de Montpellier
Conference Papers, Year: 2022

Abstract

In this paper, we first present a new dataset of NDM-1 biological activities, compiled from a cleaned version of the NMDI database. A literature review enriched that database with 741 new compounds. Activities against NDM-1 are classified into three classes (inactive, weakly active and strongly active compounds) through a unifying labeling procedure that covers a range of different activity properties. Second, we restate the classification problem in the Multiple Instance Learning (MIL) setting by representing each compound as a collection of Mol2vec vectors, each corresponding to a specific substructure (either an atom or an atom together with its first neighbors). We observe an improvement of up to 45.7% and 38.47% in balanced accuracy and F1-score, respectively, for the strongly active class in the MIL approach compared to the classical Machine Learning paradigm. Finally, we present a classification and ranking framework based on classifiers learned through a k-fold cross-validation (CV) procedure, with different hyper-parameters per fold, tuned by Bayesian optimization. We observe that the top-3 and top-5 ranked accuracies of the compounds classified as strongly active reach 100% in the MIL setting.
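As a rough illustration of the ideas in the abstract, the following minimal sketch contrasts a bag-of-instances view of a compound (one vector per substructure, as in the MIL setting) with a single pooled vector (as in a classical setting), and shows one plausible way to compute balanced accuracy and a top-k ranked accuracy. It is not the authors' implementation: the random vectors merely stand in for Mol2vec embeddings, sum pooling is an assumption, the integer class encoding (0 inactive, 1 weakly active, 2 strongly active) is hypothetical, and the exact definition of "top-k ranked accuracy" used in the paper may differ.

```python
import numpy as np

def make_bag(n_substructures, dim, rng):
    """Hypothetical stand-in for one compound: one row per substructure embedding."""
    return rng.normal(size=(n_substructures, dim))

def pool_bag(bag):
    """Collapse a bag into a single compound vector (sum pooling, an assumption)."""
    return bag.sum(axis=0)

def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls over the classes present in y_true."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(recalls))

def top_k_accuracy(scores, y_true, positive_label, k):
    """Fraction of the k highest-scored compounds whose true label is positive_label."""
    top = np.argsort(scores)[::-1][:k]
    return float(np.mean(np.asarray(y_true)[top] == positive_label))

rng = np.random.default_rng(0)
bag = make_bag(n_substructures=5, dim=8, rng=rng)  # MIL view: 5 instance vectors
vector = pool_bag(bag)                             # classical view: 1 vector of length 8

# Toy labels and scores, invented purely for illustration.
y_true = [2, 2, 0, 1, 2, 0]
scores = [0.9, 0.8, 0.1, 0.4, 0.7, 0.2]  # hypothetical "strongly active" scores
top3 = top_k_accuracy(scores, y_true, positive_label=2, k=3)
```

In this toy data the three highest-scored compounds are all labeled strongly active, so `top3` evaluates to 1.0; the numbers are unrelated to the results reported in the paper.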
Origin: Files produced by the author(s)

Dates and versions

lirmm-03859902, version 1 (18-11-2022)

Cite

Thomas Papastergiou, Jérôme Azé, Sandra Bringay, Maxime Louet, Pascal Poncelet, et al. Multiple Instance Learning Based on Mol2vec Molecular Substructure Embeddings for Discovery of NDM-1 Inhibitors. PACBB 2022 - 16th International Conference on Practical Applications of Computational Biology and Bioinformatics, Jul 2022, L'Aquila, Italy. pp.55-66, ⟨10.1007/978-3-031-17024-9_6⟩. ⟨lirmm-03859902⟩