Article

Uniting Cheminformatics and Chemical Theory To Predict the Intrinsic Aqueous Solubility of Crystalline Druglike Molecules

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/ci4005805

Funding

  1. Biotechnology and Biological Sciences Research Council (BBSRC) [BB/I00596X/1]
  2. Scottish Funding Council (SFC)
  3. Scottish Universities Life Sciences Alliance (SULSA)
  4. Scottish Overseas Research Student Awards Scheme of the Scottish Funding Council (SFC)
  5. Biotechnology and Biological Sciences Research Council [BB/I00596X/1] Funding Source: researchfish
  6. BBSRC [BB/I00596X/1] Funding Source: UKRI

We present four models of solution free-energy prediction for druglike molecules utilizing cheminformatics descriptors and theoretically calculated thermodynamic values. We make predictions of solution free energy using physics-based theory alone and using machine learning/quantitative structure-property relationship (QSPR) models. We also develop machine learning models where the theoretical energies and cheminformatics descriptors are used as combined input. These models are used to predict solvation free energy. While direct theoretical calculation does not give accurate results in this approach, machine learning is able to give predictions with a root mean squared error (RMSE) of approximately 1.1 log S units in a 10-fold cross-validation for our Drug-Like-Solubility 100 (DLS-100) dataset of 100 druglike molecules. We find that a model built using energy terms from our theoretical methodology as descriptors is marginally less predictive than one built on Chemistry Development Kit (CDK) descriptors. Combining both sets of descriptors allows a further but very modest improvement in the predictions, although in some cases this enhancement is statistically significant. These results suggest that there is little complementarity between the chemical information provided by these two sets of descriptors, despite their different sources and methods of calculation. Our machine learning models are also able to predict the well-known Solubility Challenge dataset with an RMSE value of 0.9 to 1.0 log S units.
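The abstract reports accuracy as an RMSE in log S units under 10-fold cross-validation. As a rough illustration of what that metric measures, the sketch below pools held-out predictions across ten folds and computes a single RMSE. It is not the authors' workflow: the one-descriptor linear model, the function names (`rmse`, `fit_line`, `kfold_rmse`), and the randomly generated data are all assumptions made for illustration, and the synthetic values are not taken from the DLS-100 dataset.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def kfold_rmse(xs, ys, k=10, seed=0):
    """Pooled RMSE over all held-out predictions from k-fold cross-validation."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint, near-equal folds
    y_true, y_pred = [], []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            y_true.append(ys[i])
            y_pred.append(slope * xs[i] + intercept)
    return rmse(y_true, y_pred)


# Synthetic placeholder data (hypothetical): one descriptor and a noisy "log S".
rng = random.Random(1)
xs = [rng.uniform(-2.0, 5.0) for _ in range(100)]
ys = [-0.8 * x - 0.5 + rng.gauss(0.0, 1.0) for x in xs]

cv_rmse = kfold_rmse(xs, ys, k=10)
print(f"10-fold CV RMSE: {cv_rmse:.2f} log S units")
```

Pooling the held-out predictions before taking the RMSE is one common convention; averaging per-fold RMSE values is another, and the abstract does not say which was used.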
