4.4 Article

Advancing the prediction accuracy of protein-protein interactions by utilizing evolutionary information from position-specific scoring matrix and ensemble classifier

Journal

JOURNAL OF THEORETICAL BIOLOGY
Volume 418, Issue -, Pages 105-110

Publisher

ACADEMIC PRESS LTD- ELSEVIER SCIENCE LTD
DOI: 10.1016/j.jtbi.2017.01.003

Keywords

Position-specific scoring matrix; Multiple sequences alignments; Rotation forest; Cancer

Funding

  1. National Science Foundation of China [61373086, 61572506, 11301517, 11631014]
  2. Pioneer Hundred Talents Program of Chinese Academy of Sciences
  3. National Key Research and Development Plan [2016YFC0600908]
  4. Graduate Education Innovation project of Jiangsu Province [ICYLX16_0535]

Protein-protein interactions (PPIs) are essential to most biological processes and play a critical role in most cellular functions. With the development of high-throughput biological techniques and in silico methods, a large amount of PPI data has been generated for various organisms, but many problems remain unsolved. These factors have promoted the development of in silico methods based on machine learning to predict PPIs. In this study, we propose a novel method that combines an ensemble Rotation Forest (RF) classifier with the Discrete Cosine Transform (DCT) algorithm to predict interactions among proteins. Specifically, each protein's amino acid sequence is transformed into a Position-Specific Scoring Matrix (PSSM) containing biological evolutionary information; a feature vector representing that evolutionary information is then extracted using the DCT algorithm; finally, the ensemble rotation forest model predicts whether a given protein pair interacts. On the Yeast and H. pylori data sets, the proposed method achieved average accuracies of 98.54% and 88.27%, respectively. In addition, it achieved prediction accuracies of 98.08%, 92.75%, 98.87% and 98.72% on the independent data sets C. elegans, E. coli, H. sapiens and M. musculus, respectively. To further evaluate the performance of our method, we compared it with the state-of-the-art Support Vector Machine (SVM) classifier and obtained good results.
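
The feature-extraction step described in the abstract (PSSM to DCT coefficients) might be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how many DCT coefficients are retained or how the two proteins' vectors are combined, so the `keep_rows` parameter, the zero padding for short sequences, and the concatenation in `pair_features` are all hypothetical choices. The sketch uses only the Python standard library, and the PSSM is assumed to be supplied as a list of rows (one row per residue), however it was produced.

```python
import math


def dct_1d(values):
    """Orthonormal DCT-II of a sequence of numbers."""
    n = len(values)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        total = sum(v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                    for i, v in enumerate(values))
        out.append(scale * total)
    return out


def dct_2d(matrix):
    """Separable 2D DCT-II: transform every row, then every column."""
    rows = [dct_1d(row) for row in matrix]
    cols = [dct_1d(col) for col in zip(*rows)]
    # cols is the transposed result; transpose back to row-major order.
    return [list(r) for r in zip(*cols)]


def pssm_dct_features(pssm, keep_rows=4):
    """Flatten the low-frequency rows of the PSSM's 2D DCT.

    keep_rows is a hypothetical setting; the abstract does not state
    how many coefficients the authors keep.
    """
    coeffs = dct_2d(pssm)
    block = [list(row) for row in coeffs[:keep_rows]]
    width = len(pssm[0])
    # Assumption: pad with zeros so short proteins give equal-length vectors.
    while len(block) < keep_rows:
        block.append([0.0] * width)
    return [c for row in block for c in row]


def pair_features(pssm_a, pssm_b, keep_rows=4):
    """Assumed pair representation: concatenate both proteins' vectors."""
    return (pssm_dct_features(pssm_a, keep_rows)
            + pssm_dct_features(pssm_b, keep_rows))
```

Because the low-frequency DCT coefficients concentrate most of a matrix's energy, truncating to a fixed block yields a fixed-length vector regardless of sequence length, which is what a downstream classifier such as a rotation forest would require as input.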
