Article

Improving virtual screening predictive accuracy of Human kallikrein 5 inhibitors using machine learning models

COMPUTATIONAL BIOLOGY AND CHEMISTRY (2017)

Journal

COMPUTATIONAL BIOLOGY AND CHEMISTRY

Volume 69, Pages 110-119

Publisher

ELSEVIER SCI LTD

DOI: 10.1016/j.compbiolchem.2017.05.007

Keywords

Virtual screening; Machine learning; QSAR; PubChem; Logistic regression; Signature descriptor

Categories

Biology; Computer Science, Interdisciplinary Applications

Abstract

The readily available high-throughput screening (HTS) data in the PubChem database provide an opportunity to mine small molecules across a variety of biological systems using machine learning techniques. Thousands of molecular descriptors have been developed to encode chemical information about the characteristics of molecules, so descriptor selection is an essential step in building an optimal quantitative structure-activity relationship (QSAR) model. To develop a systematic descriptor selection strategy, we need to understand the relationship between: (i) the descriptor selection; (ii) the choice of the machine learning model; and (iii) the characteristics of the target biomolecule. In this work, we employed the Signature descriptor to generate a dataset from the Human kallikrein 5 (hK 5) inhibition confirmatory assay data and compared multiple classification models, including logistic regression, support vector machine, random forest and k-nearest neighbors. Under optimal conditions, the logistic regression model provided very high overall accuracy (98%) and precision (90%), with good sensitivity (65%), in the cross-validation test. When tested on the primary HTS data, which contain more than 200,000 molecular structures, the logistic regression model was able to eliminate more than 99.9% of the inactive structures. As part of our exploration of the descriptor-model-target relationship, the excellent predictive performance of the Signature descriptor combined with the logistic regression model on the hK 5 assay data suggests a feasible descriptor/model selection strategy for similar targets. © 2017 Elsevier Ltd. All rights reserved.
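The three figures reported in the abstract (accuracy, precision, sensitivity) can sit far apart when inactive compounds vastly outnumber actives, as is typical for HTS data. The sketch below shows how each is computed from confusion-matrix counts. The function name and the counts are hypothetical, chosen only to illustrate the arithmetic; they are not taken from the paper.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute accuracy, precision and sensitivity from confusion-matrix counts.

    tp/fn: active compounds predicted active/inactive.
    fp/tn: inactive compounds predicted active/inactive.
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    predicted_active = tp + fp
    actual_active = tp + fn
    return {
        # Fraction of all compounds classified correctly.
        "accuracy": (tp + tn) / total,
        # Fraction of predicted actives that are truly active.
        "precision": tp / predicted_active if predicted_active else 0.0,
        # Fraction of true actives that the model recovers (recall).
        "sensitivity": tp / actual_active if actual_active else 0.0,
    }


# Hypothetical, imbalanced counts (assumption): 100 actives among 2000 compounds.
metrics = classification_metrics(tp=65, fp=7, tn=1893, fn=35)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With counts like these, accuracy is dominated by the many correctly rejected inactives, which is why it can approach 98% even when roughly a third of the actives are missed.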
