Review

Using molecular embeddings in QSAR modeling: does it make a difference?

BRIEFINGS IN BIOINFORMATICS (2022)

Journal

BRIEFINGS IN BIOINFORMATICS

Volume 23, Issue 1, Pages -

Publisher

OXFORD UNIV PRESS

DOI: 10.1093/bib/bbab365

Keywords

molecular representations; QSAR modeling; cheminformatics; embeddings; deep learning

Categories

Biochemical Research Methods; Mathematical & Computational Biology

Funding

National Scientific and Technical Research Council (CONICET) (Argentina) [PIP 112-2017-0100829]
National Agency of Scientific and Technological Promotion (ANPCyT) (Argentina) [PICT-2019-03350]
Universidad Nacional del Sur (Argentina) [PGI 24/N042]
Natural Sciences and Engineering Research Council (NSERC) Discovery grant (Canada)
Google Latin America Research Award 2020-2021
DeepSense
ACENET
Calcul Quebec
Compute Canada

Summary

With the consolidation of deep learning in drug discovery, several novel algorithms for learning molecular representations have been proposed. However, comparing different molecular embeddings and traditional representations is not straightforward, hindering the process of choosing suitable representations for QSAR modeling. The study conducted experiments comparing different embedding techniques and found that the predictive performance using molecular embeddings did not significantly surpass that of traditional representations.

Abstract

With the consolidation of deep learning in drug discovery, several novel algorithms for learning molecular representations have been proposed. Despite the interest of the community in developing new methods for learning molecular embeddings and their theoretical benefits, comparing molecular embeddings with each other and with traditional representations is not straightforward, which in turn hinders the process of choosing a suitable representation for Quantitative Structure-Activity Relationship (QSAR) modeling. A reason behind this issue is the difficulty of conducting a fair and thorough comparison of the different existing embedding approaches, which requires numerous experiments on various datasets and training scenarios. To close this gap, we reviewed the literature on methods for molecular embeddings and reproduced three unsupervised and two supervised molecular embedding techniques recently proposed in the literature. We compared these five methods concerning their performance in QSAR scenarios using different classification and regression datasets. We also compared these representations to traditional molecular representations, namely molecular descriptors and fingerprints. As opposed to the expected outcome, our experimental setup consisting of over 25 000 trained models and statistical tests revealed that the predictive performance using molecular embeddings did not significantly surpass that of traditional representations. Although supervised embeddings yielded competitive results compared with those using traditional molecular representations, unsupervised embeddings tended to perform worse than traditional representations. Our results highlight the need for conducting a careful comparison and analysis of the different embedding techniques prior to using them in drug design tasks and motivate a discussion about the potential of molecular embeddings in computer-aided drug design.
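
The abstract reports that statistical tests were used to compare representations but does not say which ones. As a rough illustration of the idea only, the sketch below runs an exact paired sign-flip permutation test on per-dataset scores for two representations. The choice of test, the function name `paired_sign_flip_test`, and the score values are all assumptions made for this example; none of them come from the paper.

```python
from itertools import product
from statistics import mean


def paired_sign_flip_test(scores_a, scores_b):
    """Two-sided exact sign-flip permutation test on paired scores.

    Returns (mean difference a - b, p-value). This is an assumed stand-in
    for whatever statistical test the study actually used.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired scores must have the same length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(mean(diffs))

    # Under the null hypothesis the two representations are interchangeable,
    # so each paired difference is equally likely to carry either sign.
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        stat = abs(mean([s * d for s, d in zip(signs, diffs)]))
        # Small tolerance guards against floating-point rounding.
        if stat >= observed - 1e-12:
            extreme += 1
    return mean(diffs), extreme / total


# Hypothetical placeholder scores (e.g. one AUC per dataset); NOT from the paper.
embedding_scores = [0.81, 0.77, 0.69, 0.84, 0.72, 0.79]
fingerprint_scores = [0.80, 0.79, 0.70, 0.83, 0.74, 0.78]

diff, p_value = paired_sign_flip_test(embedding_scores, fingerprint_scores)
print(f"mean difference = {diff:+.4f}, p-value = {p_value:.3f}")
```

A large p-value here would mean the paired scores give no evidence that one representation outperforms the other, which is the kind of conclusion the abstract describes. The exact enumeration grows as 2^n, so it is only practical for a small number of paired results.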
