Journal
PROTEIN AND PEPTIDE LETTERS
Volume 19, Issue 4, Pages 375-387
Publisher
BENTHAM SCIENCE PUBL LTD
DOI: 10.2174/092986612799789369
Keywords
Amphiphilic pseudo amino acid composition; ensemble classifier; gene ontology; jackknife test; k-nearest neighbor; multiple subcellular localization; support vector machine
Funding
- National Natural Science Foundation of China [30901512, 31100953]
- Shanghai Leading Academic Discipline Project [S30405]
Many proteins reside in more than one subcellular location, and this multi-location behaviour is closely related to their biological function. However, most existing prediction methods can only handle single-location proteins, so an automatic and reliable classifier for protein subcellular multi-localization is needed. We propose a new ensemble classifier that combines the KNN (K-nearest neighbor) and SVM (support vector machine) algorithms to predict the subcellular localization of eukaryotic, Gram-negative bacterial and viral proteins. Proteins are represented using the general form of Chou's pseudo amino acid composition, which here comprises GO (gene ontology) annotations, dipeptide composition and AmPseAAC (amphiphilic pseudo amino acid composition). The ensemble classifier was built by fusing many basic individual classifiers through a voting system. The overall prediction accuracies obtained by the KNN-SVM ensemble classifier are 95.22%, 93.47% and 80.72% for the eukaryotic, Gram-negative bacterial and viral proteins, respectively. These accuracies are significantly higher than those of previous methods, indicating that our strategy better predicts the subcellular locations of multi-location proteins.
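The abstract names two ingredients that are easy to illustrate: dipeptide composition as a sequence feature, and fusion of individual classifiers by voting. The following is a minimal sketch of both, not the paper's implementation. The function names `dipeptide_composition` and `vote`, the equal default weights, and the rule of returning every label tied for the top score are assumptions made for illustration; the abstract does not specify how votes are weighted or how multiple locations are selected.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids and the 400 ordered dipeptides built from them.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]
DIPEPTIDE_SET = set(DIPEPTIDES)


def dipeptide_composition(sequence):
    """Return the 400-dimensional dipeptide frequency vector of a sequence.

    Adjacent residue pairs containing a non-standard letter are skipped.
    """
    pairs = (sequence[i:i + 2] for i in range(len(sequence) - 1))
    counts = Counter(p for p in pairs if p in DIPEPTIDE_SET)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def vote(predictions, weights=None):
    """Fuse the label sets predicted by several classifiers.

    predictions: one iterable of location labels per individual classifier.
    weights: optional per-classifier weights (assumed equal if omitted).
    Returns every label tied for the highest score, sorted alphabetically.
    This tie rule is an illustrative assumption, not the paper's rule.
    """
    if weights is None:
        weights = [1.0] * len(predictions)
    scores = Counter()
    for labels, weight in zip(predictions, weights):
        for label in set(labels):
            scores[label] += weight
    if not scores:
        return []
    top = max(scores.values())
    return sorted(label for label, score in scores.items() if score == top)
```

In this sketch each individual classifier (for example a KNN or SVM model trained on one feature set) would contribute one entry to `predictions`, and a protein can receive several locations when labels tie for the top score.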