Article

Actively Searching: Inverse Design of Novel Molecules with Simultaneously Optimized Properties

Journal

JOURNAL OF PHYSICAL CHEMISTRY A
Volume 126, Issue 2, Pages 333-340

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/acs.jpca.1c08191

Funding

  1. Electro-chemical Systems Program [2045887-CBET]
  2. Dreyfus Program for Machine Learning in the Chemical Sciences and Engineering

Combining quantum chemistry characterizations with generative machine learning models can accelerate molecular discovery. An active learning approach is used to improve the performance of multi-target generative chemical models by gradually refining the model's understanding of targeted areas of chemical space. This approach does not require modifications to existing generative approaches and can be applied to optimize multiple properties simultaneously.
Combining quantum chemistry characterizations with generative machine learning models has the potential to accelerate molecular discovery. In this paradigm, quantum chemistry acts as a relatively cost-effective oracle for evaluating the properties of particular molecules, while generative models provide a means of sampling chemical space based on learned structure-function relationships. For practical applications, multiple potentially orthogonal properties must be optimized in tandem during a discovery workflow. This carries additional difficulties associated with the specificity of the targets and the ability of the model to reconcile all properties simultaneously. Here, we demonstrate an active learning approach to improve the performance of multi-target generative chemical models. We first demonstrate the effectiveness of a set of baseline models trained on single property prediction tasks in generating novel compounds (i.e., not present in the training data) with various property targets, including both interpolative and extrapolative generation scenarios. For property ranges where accurate targeting proves difficult, the novel compounds suggested by the model are characterized using quantum chemistry, and the new molecules closest to expressing the desired properties are fed back into the generative model for additional training. This gradually improves the generative model's understanding of targeted areas of chemical space and shifts the distribution of the generated compounds toward the targeted values. We then demonstrate the effectiveness of this active learning approach in generating compounds with multiple chemical constraints, including vertical ionization potential, electron affinity, and dipole moment targets, and validate the results at the ωB97X-D3/def2-TZVP level. This method requires no modifications to extant generative approaches, but rather utilizes their inherent generative and predictive aspects for self-refinement, and can be applied to situations where any number of properties with varying degrees of correlation must be optimized simultaneously.
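
The generate, characterize, select, and retrain cycle described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: a "molecule" is reduced to a single number, `oracle` is a hypothetical closed-form stand-in for a quantum chemistry calculation returning two properties, and `ToyGenerator` is a Gaussian refitted to its training set rather than a real generative chemical model. In practice, properties with different units would likely be rescaled before distances are computed.

```python
import math
import random
import statistics


def oracle(x):
    """Hypothetical stand-in for a quantum chemistry calculation.

    Maps a toy design variable x to two made-up properties.
    """
    return (2.0 * x + 1.0, x * x)


class ToyGenerator:
    """Toy 'generative model': a Gaussian fitted to the training designs."""

    def __init__(self, training, seed=0):
        self.training = list(training)
        self.rng = random.Random(seed)
        self.fit()

    def fit(self):
        # "Retraining" here just refits the mean and spread of the data.
        self.mean = statistics.fmean(self.training)
        self.std = statistics.pstdev(self.training) or 1e-3

    def sample(self, n):
        return [self.rng.gauss(self.mean, self.std) for _ in range(n)]


def active_learning(gen, target, rounds=10, n_samples=200, n_keep=20):
    """Run the loop; return the property-space distance to target per round."""
    history = []
    for _ in range(rounds):
        # 1. Generate candidate designs from the current model.
        candidates = gen.sample(n_samples)
        # 2. Characterize each candidate with the oracle and rank by
        #    Euclidean distance to the multi-property target.
        ranked = sorted(candidates, key=lambda x: math.dist(oracle(x), target))
        # 3. Feed the closest candidates back in and retrain.
        gen.training.extend(ranked[:n_keep])
        gen.fit()
        # Track how far the model's central design is from the target.
        history.append(math.dist(oracle(gen.mean), target))
    return history


# Extrapolative scenario: training designs lie in [0, 1], while the
# target properties (7, 9) correspond to x = 3, outside that range.
generator = ToyGenerator([i / 49 for i in range(50)], seed=0)
distances = active_learning(generator, target=(7.0, 9.0))
```

In this sketch the list `distances` records, after each round, how far the properties of the generator's mean design are from the target, so a downward trend corresponds to the shift of the generated distribution toward the targeted values described above. Nothing in the loop depends on the generator's internals beyond sampling and refitting, which mirrors the claim that existing generative approaches need no modification.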
