Article

Predicting reaction conditions from limited data through active transfer learning

Journal

CHEMICAL SCIENCE
Volume 13, Issue 22, Pages 6655-6668

Publisher

ROYAL SOC CHEMISTRY
DOI: 10.1039/d1sc06932b

Funding

  1. NSF [NIH-R35GM128830]
  2. University of Michigan College of Pharmacy [IIS-2007055]

This article demonstrates the use of transfer learning and active learning to accelerate the development of new chemical reactions. Specifically tuned machine learning models based on random forest classifiers are used to expand the applicability of Pd-catalyzed cross-coupling reactions to new types of nucleophiles. The results show that model transfer is effective even when trained on relatively small amounts of data. Additionally, a model simplification scheme and an active transfer learning strategy are introduced to improve the predictive capability of the models.
Transfer and active learning have the potential to accelerate the development of new chemical reactions, using prior data and new experiments to inform models that adapt to the target area of interest. This article shows how specifically tuned machine learning models, based on random forest classifiers, can expand the applicability of Pd-catalyzed cross-coupling reactions to types of nucleophiles unknown to the model. First, model transfer is shown to be effective when reaction mechanisms and substrates are closely related, even when models are trained on relatively small numbers of data points. Then, a model simplification scheme is tested and found to provide comparable predictivity on reactions of new nucleophiles that include unseen reagent combinations. Lastly, for a challenging target where model transfer provides only a modest benefit over random selection, an active transfer learning strategy is introduced to improve model predictions. Simple models, composed of a small number of decision trees with limited depths, are crucial for securing the generalizability, interpretability, and performance of active transfer learning.
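
The general idea of combining a small, shallow tree ensemble with an active selection loop can be illustrated with a minimal Python sketch. It is not the authors' implementation. Everything in it is an assumption made for illustration: the three binary condition features, the toy source data, the hypothetical `oracle` standing in for a real experiment, the depth-1 trees (stumps) used in place of a full random forest classifier, and the greedy rule of testing the conditions with the highest predicted success.

```python
import random
from itertools import product

def fit_stump(X, y, rng):
    """Fit one depth-1 tree on a bootstrap sample, splitting on one random binary feature."""
    n = len(X)
    rows = [rng.randrange(n) for _ in range(n)]   # bootstrap sample of row indices
    f = rng.randrange(len(X[0]))                  # random feature choice
    prior = sum(y[i] for i in rows) / n           # fallback if one side of the split is empty
    labels = {0: [], 1: []}
    for i in rows:
        labels[X[i][f]].append(y[i])
    rate = {v: (sum(ls) / len(ls) if ls else prior) for v, ls in labels.items()}
    return f, rate

def predict_proba(forest, x):
    """Average the per-tree success rates to get a predicted probability of success."""
    return sum(rate[x[f]] for f, rate in forest) / len(forest)

def active_transfer(source_X, source_y, pool_X, oracle,
                    n_trees=5, batch=2, rounds=3, seed=0):
    """Start from source data, then repeatedly query the most promising target conditions."""
    rng = random.Random(seed)
    train_X, train_y = list(source_X), list(source_y)
    remaining = list(range(len(pool_X)))
    queried = []
    for _ in range(rounds):
        # Refit a small ensemble on source data plus all target results so far.
        forest = [fit_stump(train_X, train_y, rng) for _ in range(n_trees)]
        # Greedy acquisition (an assumption): highest predicted success first.
        remaining.sort(key=lambda i: predict_proba(forest, pool_X[i]), reverse=True)
        picked, remaining = remaining[:batch], remaining[batch:]
        for i in picked:
            train_X.append(pool_X[i])
            train_y.append(oracle(i))             # "run" the experiment
            queried.append(i)
    return queried, train_X, train_y

# Toy data: features are hypothetical flags [ligand_is_A, base_is_strong, solvent_is_polar].
source_X = [list(c) for c in product([0, 1], repeat=3)]
source_y = [1 if x[0] and x[1] else 0 for x in source_X]   # made-up source outcomes
pool_X = [list(c) for c in product([0, 1], repeat=3)]      # candidate target conditions

def oracle(i):
    """Hypothetical target outcome: success needs ligand A and a polar solvent."""
    return 1 if pool_X[i][0] and pool_X[i][2] else 0

queried, train_X, train_y = active_transfer(source_X, source_y, pool_X, oracle)
print("queried target conditions:", [pool_X[i] for i in queried])
```

In this sketch the ensemble is kept deliberately small and shallow, echoing the abstract's point that simple models matter for generalizability and interpretability; a real study would replace the stumps with a tuned random forest classifier and the oracle with measured reaction outcomes.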
