Journal
PROTEIN AND PEPTIDE LETTERS
Volume 17, Issue 1, Pages 32-37
Publisher
BENTHAM SCIENCE PUBL LTD
DOI: 10.2174/092986610789909494
Keywords
Subcellular localization, PSSM, PseAAC, Linear dimensionality reduction, PCA, LDA
Funding
- National Natural Science Foundation of China [60704047, 60805001]
- Shanghai Pujiang Program
With the rapid increase of protein sequences in the post-genomic age, the need for an automated and accurate tool to predict protein subcellular localization is becoming increasingly important. Many approaches have been tried. Most of them aim to find the optimal classification scheme, and few of them take into consideration reducing the complexity of the biological system. This work shows how to decrease that complexity with a linear DR (Dimensionality Reduction) method that transforms the original high-dimensional feature vectors into low-dimensional ones. A powerful sequence encoding scheme that fuses PSSM (Position-Specific Score Matrix) and Chou's PseAA (Pseudo Amino Acid) composition is proposed to represent the protein samples. The K-NN (K-Nearest Neighbor) classifier is then employed to identify the subcellular localization of the proteins based on their reduced low-dimensional feature vectors. The experimental results are quite encouraging, indicating that the linear DR method is promising for complicated biological problems such as predicting the subcellular localization of Gram-negative bacterial proteins.
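The reduce-then-classify pipeline described in the abstract can be illustrated with a minimal sketch. It assumes PCA (one of the linear DR methods named in the keywords) followed by a plain majority-vote K-NN classifier. The data are synthetic stand-ins, not the fused PSSM and PseAA features or the Gram-negative dataset used in the paper, and the function names, the number of retained components, and the value of k are all illustrative assumptions rather than the authors' settings.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the training mean and the top principal axes (as columns)."""
    mean = X.mean(axis=0)
    # SVD of the centered data; the rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components].T

def pca_transform(X, mean, components):
    """Project samples onto the principal axes learned from the training set."""
    return (X - mean) @ components

def knn_predict(train_Z, train_y, test_Z, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    preds = []
    for z in test_Z:
        dist = np.linalg.norm(train_Z - z, axis=1)
        nearest = train_y[np.argsort(dist)[:k]]
        labels, counts = np.unique(nearest, return_counts=True)
        preds.append(labels[np.argmax(counts)])
    return np.array(preds)

# Synthetic stand-in for high-dimensional protein feature vectors
# (assumption: 3 classes, 40 features; not the paper's data).
rng = np.random.default_rng(0)
n_per_class, n_features = 30, 40
centers = rng.normal(scale=5.0, size=(3, n_features))
X = np.vstack([c + rng.normal(size=(n_per_class, n_features)) for c in centers])
y = np.repeat(np.arange(3), n_per_class)

# Hold out every fifth sample for testing; fit PCA on the training part only.
test_mask = np.arange(len(y)) % 5 == 0
mean, components = pca_fit(X[~test_mask], n_components=5)
Z_train = pca_transform(X[~test_mask], mean, components)
Z_test = pca_transform(X[test_mask], mean, components)

y_pred = knn_predict(Z_train, y[~test_mask], Z_test, k=3)
accuracy = float((y_pred == y[test_mask]).mean())
print(f"reduced {n_features} -> {Z_train.shape[1]} dims, test accuracy {accuracy:.2f}")
```

Fitting the projection on the training samples alone and reusing it for the held-out samples keeps test information out of the reduction step. LDA, the other method listed in the keywords, would replace `pca_fit` with a supervised projection that also uses the class labels.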