4.2 Article

Relevance of Machine Learning to Predict the Inhibitory Activity of Small Thiazole Chemicals on Estrogen Receptor

Journal

CURRENT COMPUTER-AIDED DRUG DESIGN
Volume 19, Issue 1, Pages 37-50

Publisher

BENTHAM SCIENCE PUBL LTD
DOI: 10.2174/1573409919666221121141646

Keywords

QSAR; python; supervised machine learning; thiazole derivatives; MCF-7; breast cancer; cytotoxicity

In this study, a new open-source data analysis Python script was used to discover lead compounds for anticancer drugs by building a QSAR model using 53 thiazole derivatives. Machine learning approaches were employed, and the performance of the model was evaluated using three different algorithms.
Background: Drug discovery relies on hybrid technologies to identify new chemical substances. One such strategy is QSAR modeling with artificial intelligence, which predicts in silico how chemical alterations affect biological activity.

Aim: This study aimed to apply a machine learning approach, using a new open-source Python data analysis script, to the discovery of anticancer leads by building a QSAR model from 53 thiazole derivatives.

Methods: The Python script was executed on 53 small thiazole chemicals in the Google Colaboratory interface. A total of 82 CDK molecular descriptors were downloaded from the ChemDes web server and used in the study. After training the model, we checked its performance via cross-validation of the external test set.

Results: The ordinary least squares (OLS) regression of the generated QSAR model gave R2 = 0.542, F = 8.773, adjusted R2 (Q2) = 0.481, and std. error = 0.061. The regression coefficients (reg.coef_) were -0.00064 (PC1), -0.07753 (PC2), -0.09078 (PC3), -0.08986 (PC4), and 0.05044 (PC5), with an intercept (reg.intercept_) of 4.79279, obtained through the statsmodels formula module. Test set prediction was performed with the multiple linear regression, support vector machine, and partial least squares regression models of the sklearn module, which gave model scores of 0.5424, 0.6422, and 0.6422, respectively.

Conclusion: The R2 values (i.e., the model scores) obtained with this script from three diverse algorithms agree closely with one another, and the script may be useful in the design of similar thiazole derivatives as anticancer agents.
