4.7 Article

Comparative Study of Deep Generative Models on Chemical Space Coverage

Journal

JOURNAL OF CHEMICAL INFORMATION AND MODELING
Volume 61, Issue 6, Pages 2572-2581

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/acs.jcim.0c01328

Funding

  1. XtalPi Inc
  2. Postdoc Program at AstraZeneca

This study introduces a novel metric based on chemical space coverage for evaluating and comparing the performance of deep molecular generative models. Experimental results show significant performance variations among different generative models when using limited training data, allowing for clear differentiation of models with stronger generalization capabilities.
In recent years, deep molecular generative models have emerged as promising methods for de novo molecular design. Thanks to the rapid advance of deep learning techniques, architectures such as recurrent neural networks, variational autoencoders, and adversarial networks have been successfully employed to construct generative models. Quite a few metrics have recently been proposed to evaluate these deep generative models. However, many of these metrics cannot evaluate the chemical space coverage of the sampled molecules. This work presents a novel and complementary metric for evaluating deep molecular generative models. The metric is based on the chemical space coverage of a reference dataset, GDB-13. The performance of seven different molecular generative models was compared by calculating what fraction of the structures, ring systems, and functional groups could be reproduced from the largely unseen reference set when using only a small fraction of GDB-13 for training. The results show that the performance of the generative models studied varies significantly under the benchmark metrics introduced herein, such that the generalization capabilities of the models can be clearly differentiated. In addition, the coverages of GDB-13 ring systems and functional groups were compared between the models. Our study provides a useful new metric that can be used for evaluating and comparing generative models.
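
The coverage idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that the sampled and reference molecules have already been converted to comparable canonical strings (for example canonical SMILES), and the function name `coverage` and the toy data are hypothetical. The sampling budget and canonicalization procedure used in the paper are not stated in the abstract and are not modeled here.

```python
def coverage(sampled, reference):
    """Fraction of unique reference items that also occur among the sampled items.

    Both arguments are iterables of hashable items, assumed here to be
    canonical strings so that identical structures compare equal.
    """
    reference_set = set(reference)
    if not reference_set:
        raise ValueError("reference set is empty")
    # Duplicates in the sample and items outside the reference do not count.
    return len(set(sampled) & reference_set) / len(reference_set)


# Toy stand-ins for canonical SMILES; these are not real GDB-13 data.
reference = ["CCO", "CCN", "CCC", "C1CC1"]
sampled = ["CCO", "CCC", "CCC", "CCCl"]  # one duplicate, one outside the reference

print(coverage(sampled, reference))  # 0.5
```

The same function could in principle be applied to sets of ring systems or functional groups, provided they have been extracted from both the sampled and the reference molecules in a consistent way.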
