Article

Delta Machine Learning to Improve Scoring-Ranking-Screening Performances of Protein-Ligand Scoring Functions

Journal

JOURNAL OF CHEMICAL INFORMATION AND MODELING
Volume 62, Issue 11, Pages 2696-2712

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/acs.jcim.2c00485

Keywords

-

Funding

  1. U.S. National Institutes of Health [R35-GM127040]

In this study, the robustness and applicability of machine-learning scoring functions were further improved by expanding the training set, developing meaningful features, using a linear empirical scoring function as the baseline, and applying extreme gradient boosting (XGBoost) with Delta-machine learning. The new scoring function demonstrated superior performance in scoring and ranking in various structure types and showed reliability and robustness in virtual screening applications.
Protein-ligand scoring functions are widely used in structure-based drug design for fast evaluation of protein-ligand interactions, and there is strong interest in developing scoring functions with machine-learning approaches. In this work, by expanding the training set, developing physically meaningful features, employing our recently developed linear empirical scoring function Lin_F9 (Yang, C. et al. J. Chem. Inf. Model. 2021, 61, 4630-4644) as the baseline, and applying extreme gradient boosting (XGBoost) with Delta-machine learning, we have further improved the robustness and applicability of machine-learning scoring functions. In addition to top performance in the scoring-ranking-screening power tests of the CASF-2016 benchmark, the new scoring function Delta(Lin_F9)XGB also achieves superior scoring and ranking performance on different structure types that mimic real docking applications. The scoring powers of Delta(Lin_F9)XGB for locally optimized poses, flexible redocked poses, and ensemble docked poses of the CASF-2016 core set achieve Pearson's correlation coefficient (R) values of 0.853, 0.839, and 0.813, respectively. In addition, the large-scale docking-based virtual screening test on the LIT-PCBA data set demonstrates the reliability and robustness of Delta(Lin_F9)XGB in virtual screening applications. The Delta(Lin_F9)XGB scoring function and its code are freely available on the web at https://yzhang.hpc.nyu.edu/Delta_LinF9_XGB.
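
The Delta-machine learning idea described in the abstract, in which a model is trained to predict the correction to a baseline score rather than the binding affinity itself, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy affinities, the baseline scores, the single feature, and the helper names are hypothetical, and a one-feature least-squares fit stands in for the XGBoost regressor used in the paper. It is not the authors' implementation. The sketch also shows how scoring power is summarized as Pearson's R between predicted and experimental values.

```python
# Minimal sketch of Delta-machine learning: final score = baseline + learned correction.
# All data and helper names are hypothetical. The paper uses Lin_F9 as the baseline
# and XGBoost as the corrector; a one-feature least-squares fit stands in here.
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def fit_linear_correction(features, residuals):
    """Least-squares fit of residual ~ slope * feature + intercept."""
    n = len(features)
    mf, mr = sum(features) / n, sum(residuals) / n
    cov = sum((f - mf) * (r - mr) for f, r in zip(features, residuals))
    var_f = sum((f - mf) ** 2 for f in features)
    slope = cov / var_f
    return slope, mr - slope * mf


# Toy data (hypothetical): experimental affinities, baseline scores, one extra feature.
experimental = [4.0, 5.5, 6.0, 7.5, 9.0]
baseline = [5.0, 5.0, 6.5, 6.5, 7.0]
feature = [-0.8, 0.4, -0.6, 1.2, 1.8]

# Delta-learning target: the part of the affinity the baseline gets wrong.
residuals = [e - b for e, b in zip(experimental, baseline)]
slope, intercept = fit_linear_correction(feature, residuals)

# Final prediction = baseline score + predicted correction.
corrected = [b + slope * f + intercept for b, f in zip(baseline, feature)]

# Scoring power as Pearson's R (computed in-sample here, only for illustration).
r_baseline = pearson_r(baseline, experimental)
r_corrected = pearson_r(corrected, experimental)
print(f"baseline R = {r_baseline:.3f}, corrected R = {r_corrected:.3f}")
```

In a realistic setting the correction model would be fitted on a training set and R would be reported on held-out complexes, such as the CASF-2016 core set mentioned above, rather than on the data used for fitting.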
