Journal
JOURNAL OF CHEMICAL INFORMATION AND MODELING
Volume 61, Issue 6, Pages 2572-2581
Publisher
AMER CHEMICAL SOC
DOI: 10.1021/acs.jcim.0c01328
Funding
- XtalPi Inc
- Postdoc Program at AstraZeneca
This study introduces a novel metric based on chemical space coverage for evaluating and comparing the performance of deep molecular generative models. Experimental results show significant performance variations among different generative models when using limited training data, allowing for clear differentiation of models with stronger generalization capabilities.
In recent years, deep molecular generative models have emerged as promising methods for de novo molecular design. Thanks to the rapid advance of deep learning techniques, architectures such as recurrent neural networks, variational autoencoders, and adversarial networks have been successfully employed for constructing generative models. Several metrics have recently been proposed to evaluate these deep generative models. However, many of these metrics cannot evaluate the chemical space coverage of sampled molecules. This work presents a novel and complementary metric for evaluating deep molecular generative models. The metric is based on the chemical space coverage of a reference dataset, GDB-13. The performance of seven different molecular generative models was compared by calculating what fraction of the structures, ring systems, and functional groups could be reproduced from the largely unseen reference set when using only a small fraction of GDB-13 for training. The results show that the performance of the generative models studied varies significantly under the benchmark metrics introduced herein, such that the generalization capabilities of the generative models can be clearly differentiated. In addition, the coverages of GDB-13 ring systems and functional groups were compared between the models. Our study provides a useful new metric that can be used for evaluating and comparing generative models.
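The coverage idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this assumes coverage is the fraction of reference items, not seen during training, that appear at least once in the sampled output. The function name `coverage`, its arguments, and the toy SMILES strings are hypothetical and are not taken from the paper; the strings are assumed to be already canonicalized so that identical molecules compare equal.

```python
def coverage(sampled, reference, training=()):
    """Fraction of reference items outside the training set that were sampled.

    Assumption: all inputs are iterables of canonical identifiers
    (e.g. canonical SMILES for structures, or labels for ring systems
    or functional groups). This is an illustrative definition only.
    """
    unseen = set(reference) - set(training)
    if not unseen:
        raise ValueError("reference has no items outside the training set")
    # Duplicates in the sample do not count twice: only distinct hits matter.
    return len(unseen & set(sampled)) / len(unseen)


# Toy data standing in for a reference set, a small training subset,
# and the molecules sampled from a trained generative model.
reference = {"CCO", "CCN", "CCC", "C1CC1", "CC=O"}
training = {"CCO"}
sampled = ["CCO", "CCN", "C1CC1", "CCCl", "CCN"]

unseen_cov = coverage(sampled, reference, training)  # coverage of unseen items
overall_cov = coverage(sampled, reference)           # coverage of the whole set
print(unseen_cov, overall_cov)
```

The same function could be applied to sets of ring systems or functional groups extracted from the sampled and reference molecules, provided each is reduced to a canonical label first.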