Journal
WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL MOLECULAR SCIENCE
Volume 12, Issue 2, Pages -
Publisher
WILEY
DOI: 10.1002/wcms.1567
Keywords
artificial intelligence; feature engineering; machine learning; protein-ligand interaction; scoring functions
Funding
- Changsha Municipal Natural Science Foundation [kq2014144]
- Changsha Science and Technology Bureau project [kq2001034]
- HKBU Strategic Development Fund project [SDF19-0402-P02]
- Key R&D Program of Zhejiang Province [2020C03010]
- Leading Talent of Ten Thousand Plan-National High-Level Talents Special Support Plan
- National Natural Science Foundation of China [21575128, 81773632]
- Zhejiang Provincial Natural Science Foundation of China [LZ19H300001]
Classical scoring functions have reached a plateau in predictive performance, while machine learning scoring functions relying on sophisticated techniques show great potential in binding affinity prediction. Automated-extraction features are emerging as a new trend in featurization for protein-ligand interactions, helping capture important physical processes.
The predictive performance of classical scoring functions (SFs) seems to have reached a plateau. Currently, SFs relying on sophisticated machine learning techniques have shown great potential in binding affinity prediction and virtual screening. As one of the most indispensable components in the workflow of training a machine learning scoring function (MLSF), the featurization or representation process enables us to catch certain physical processes that are important for protein-ligand interactions and to obtain machine-readable descriptors. Currently, according to how they are derived, the descriptors used in MLSFs for both continuous and binary binding affinity estimates can be grouped into two broad categories: handcrafted features and automated-extraction features. Moreover, the automated-extraction features emerge as a new featurization trend along with the application of deep learning algorithms. Here, we make a thorough summary of the advances in the featurization strategies for protein-ligand interactions in the context of MLSFs, with emphasis on the recently rising automated-extraction features. We also discuss the similarity between protein-ligand interaction representations and small-molecule representations, and the challenges confronted by the scientific community in characterizing protein-ligand interactions. We expect that this review could inspire the development of novel featurization approaches and boosted MLSFs.
This article is categorized under:
- Data Science > Artificial Intelligence/Machine Learning
- Software > Molecular Modeling
- Molecular and Statistical Mechanics > Molecular Interactions