Article

Molecular Descriptors, Structure Generation, and Inverse QSAR/QSPR Based on SELFIES

Journal

ACS OMEGA
Volume 8, Issue 24, Pages 21781-21786

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/acsomega.3c01332

This paper proposes a method based on SELFIES for molecular descriptors, structure generation, and inverse QSAR/QSPR. By converting SELFIES into SELFIES descriptors x, an inverse analysis of the QSAR/QSPR model y = f(x) is conducted, obtaining x values that achieve the target y value and successfully generating SELFIES strings or molecules.
For inverse QSAR/QSPR in conventional molecular design, several chemical structures must be generated and their molecular descriptors must be calculated. However, there is no one-to-one correspondence between the generated chemical structures and molecular descriptors. In this paper, molecular descriptors, structure generation, and inverse QSAR/QSPR based on self-referencing embedded strings (SELFIES), a 100% robust molecular string representation, are proposed. A one-hot vector is converted from SELFIES to SELFIES descriptors x, and an inverse analysis of the QSAR/QSPR model y = f(x) with the objective variable y and molecular descriptor x is conducted. Thus, x values that achieve a target y value are obtained. Based on these values, SELFIES strings or molecules are generated, meaning that inverse QSAR/QSPR is performed successfully. The SELFIES descriptors and SELFIES-based structure generation are verified using datasets of actual compounds. The successful construction of SELFIES-descriptor-based QSAR/QSPR models with predictive abilities comparable to those of models based on other fingerprints is confirmed. A large number of molecules with one-to-one relationships with the values of the SELFIES descriptors are generated. Furthermore, as a case study of inverse QSAR/QSPR, molecules with target y values are generated successfully. The Python code for the proposed method is available at https://github.com/hkaneko1985/dcekit.
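The one-to-one relationship between a SELFIES string and its descriptor vector x can be illustrated with a minimal sketch. This is not the dcekit implementation: the function names `split_selfies`, `build_alphabet`, `encode`, and `decode` are hypothetical, the use of `[nop]` as a padding symbol is an assumption, and the example uses only the Python standard library. Each symbol position becomes a one-hot row, the rows are concatenated into x, and taking the largest entry in each row maps x (including non-binary values such as those an inverse analysis might return) back to a string.

```python
import re


def split_selfies(selfies_string):
    """Split a SELFIES string such as '[C][C][O]' into its bracketed symbols."""
    return re.findall(r"\[[^\]]*\]", selfies_string)


def build_alphabet(selfies_strings, pad="[nop]"):
    """Collect all symbols in the dataset; the padding symbol (assumed) comes first."""
    symbols = {sym for s in selfies_strings for sym in split_selfies(s)}
    symbols.discard(pad)
    return [pad] + sorted(symbols)


def encode(selfies_string, alphabet, max_len, pad="[nop]"):
    """Return the flattened one-hot descriptor vector x for one SELFIES string."""
    symbols = split_selfies(selfies_string)
    if len(symbols) > max_len:
        raise ValueError("string is longer than max_len symbols")
    symbols = symbols + [pad] * (max_len - len(symbols))
    index = {sym: i for i, sym in enumerate(alphabet)}
    x = []
    for sym in symbols:
        row = [0] * len(alphabet)
        row[index[sym]] = 1
        x.extend(row)
    return x


def decode(x, alphabet, pad="[nop]"):
    """Map a descriptor vector back to a SELFIES string.

    The largest entry in each row selects the symbol, so real-valued x
    is also accepted; padding symbols are dropped.
    """
    n = len(alphabet)
    out = []
    for start in range(0, len(x), n):
        row = x[start:start + n]
        best = max(range(len(row)), key=row.__getitem__)
        if alphabet[best] != pad:
            out.append(alphabet[best])
    return "".join(out)


dataset = ["[C][C][O]", "[C][=O]"]
alphabet = build_alphabet(dataset)
x = encode("[C][=O]", alphabet, max_len=3)
restored = decode(x, alphabet)
```

In this sketch, `restored` equals the input string, which is the round trip the abstract relies on. Converting the recovered string to a molecule would be a separate step, for example with the decoder of the `selfies` package.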
