Article

Identification of subtypes of anticancer peptides based on sequential features and physicochemical properties

Journal

SCIENTIFIC REPORTS
Volume 11, Issue 1

Publisher

NATURE PORTFOLIO
DOI: 10.1038/s41598-021-93124-9

Funding

  1. Ministry of Science and Technology [MOST109-2320-B-195-001]
  2. Hsinchu Mackay Memorial Hospital

A two-step machine learning model was developed to effectively identify subtypes of anticancer peptides by examining amino acid composition and physicochemical properties. The model demonstrated high accuracy and specificity in cross-validation and independent testing, indicating its predictive power. The hybrid feature sets considered in the training process contributed to the model's effectiveness in discriminating between ACPs and non-ACPs.
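As a rough illustration of what combining amino acid composition with physicochemical properties can look like, the sketch below builds a small hybrid feature vector for one peptide. It is a minimal example under stated assumptions, not the feature set used by the authors: the function name `hybrid_features`, the choice of K/R as positively charged and D/E as negatively charged residues, and the three charge-related descriptors are all hypothetical.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so feature positions are stable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumption: K and R count as positively charged, D and E as negatively charged.
POSITIVE = set("KR")
NEGATIVE = set("DE")


def hybrid_features(peptide):
    """Return 20 composition fractions followed by 3 simple charge descriptors."""
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide sequence must not be empty")
    length = len(peptide)
    counts = Counter(peptide)

    # Sequence-based part: fraction of each amino acid in the peptide.
    composition = [counts[aa] / length for aa in AMINO_ACIDS]

    # Physicochemical part (illustrative only): charge-related descriptors.
    n_pos = sum(counts[aa] for aa in POSITIVE)
    n_neg = sum(counts[aa] for aa in NEGATIVE)
    physicochemical = [n_pos / length, n_neg / length, float(n_pos - n_neg)]

    return composition + physicochemical


features = hybrid_features("KKLLAGGWRK")  # made-up sequence for demonstration
print(len(features))
```

A vector of this kind could be passed to any standard classifier; which descriptors actually help is an empirical question that the abstract addresses only at the level of "hybrid feature sets".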
Anticancer peptides (ACPs) are bioactive peptides that could serve as a novel type of anticancer drug with several advantages over chemistry-based drugs, including high specificity, strong tumor penetration capacity, and low toxicity to normal cells. As the number of experimentally verified bioactive peptides has increased significantly, in silico approaches have become essential for investigating the characteristics of ACPs. However, methods for investigating the differences in physicochemical properties among ACPs are lacking. In this study, we compared the N- and C-terminal amino acid composition of each peptide and defined three major subtypes of ACPs based on the distribution of positively charged residues. This motivated us to develop, for the first time, a two-step machine learning model for identifying the subtypes of ACPs, which first assigns the input data to the corresponding group and then applies that group's classifier. Further, to improve the predictive power, hybrid feature sets were considered for prediction. Evaluation by five-fold cross-validation showed that the two-step model trained with sequence-based features and physicochemical properties was the most effective in discriminating between ACPs and non-ACPs. The two-step model trained with the hybrid features performed well, with a sensitivity of 86.75%, a specificity of 85.75%, an accuracy of 86.08%, and a Matthews correlation coefficient (MCC) of 0.703. The model also performed consistently on the independent testing set, with a sensitivity of 77.6%, a specificity of 94.74%, an accuracy of 88.99%, and an MCC of 0.75. Finally, the two-step model has been implemented as a web-based tool, iDACP, which is freely available at http://mer.hc.mmh.org.tw/iDACP/.
