4.7 Article

Machine learning methods to predict the crystallization propensity of small organic molecules

Journal

CRYSTENGCOMM
Volume 22, Issue 16, Pages 2817-2826

Publisher

ROYAL SOC CHEMISTRY
DOI: 10.1039/d0ce00070a

Funding

  1. Fundacao para a Ciencia e Tecnologia (FCT) Portugal [UID/QUI/50006/2019]
  2. Fundacao para a Ciencia e a Tecnologia, MCTES [DL 57/2016]

Machine learning (ML) algorithms were explored for predicting crystallization propensity from molecular descriptors and fingerprints generated from 2D chemical structures, and from 3D molecular descriptors computed on 3D chemical structures optimized with empirical methods. In total, 57 815 molecules were retrieved from the Reaxys® database, of which 53 998 are recorded as crystalline (class A), 3097 as polymorphic (class B), and 720 as amorphous (class C). A training set of 40 462 organic molecules was used to build the models, which were validated with an external test set of 17 353 organic molecules. Several ML algorithms were screened, including random forest (RF), support vector machines (SVM), and deep learning multilayer perceptron networks (MLP). The best performance was achieved with a consensus classification model combining the RF, SVM, and MLP models, which predicted the external test set with an overall predictive accuracy (Q) of up to 80%.
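The abstract does not say how the RF, SVM, and MLP outputs were combined into a consensus, so the following is only a minimal sketch under stated assumptions: a simple majority vote over the three class labels, with a fallback to the first model when all three disagree, and Q taken as the fraction of correctly classified molecules. The function names (`consensus_vote`, `overall_accuracy`) and the toy labels are hypothetical and are not data or code from the paper.

```python
from collections import Counter


def consensus_vote(predictions_per_model):
    """Combine per-model class labels by majority vote (an assumed scheme).

    predictions_per_model: list of equal-length label lists, one per model.
    If no label is predicted by more than one model, the first model's
    label is used as a tie-break (also an assumption).
    """
    consensus = []
    for votes in zip(*predictions_per_model):
        label, count = Counter(votes).most_common(1)[0]
        consensus.append(label if count > 1 else votes[0])
    return consensus


def overall_accuracy(y_true, y_pred):
    """Overall predictive accuracy Q: correct predictions / all predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy labels for six hypothetical molecules
# (A = crystalline, B = polymorphic, C = amorphous).
y_true = ["A", "A", "B", "C", "A", "A"]
rf_pred = ["A", "A", "A", "C", "B", "C"]
svm_pred = ["A", "B", "B", "C", "B", "A"]
mlp_pred = ["A", "A", "B", "A", "C", "B"]

consensus = consensus_vote([rf_pred, svm_pred, mlp_pred])
q = overall_accuracy(y_true, consensus)
print(consensus, round(q, 3))
```

In the toy data, the last molecule gets three different labels, so the sketch falls back to the first model's prediction. A real pipeline might instead average class probabilities or weight the models by validation performance.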
