4.1 Article

MSLVP: prediction of multiple subcellular localization of viral proteins using a support vector machine

Journal

MOLECULAR BIOSYSTEMS
Volume 12, Issue 8, Pages 2572-2586

Publisher

ROYAL SOC CHEMISTRY
DOI: 10.1039/c6mb00241b

Funding

  1. Department of Biotechnology, Government of India [GAP0001]
  2. Council of Scientific and Industrial Research (CSIR) [BSC0121]

Knowledge of the subcellular location (SCL) of viral proteins in the host cell is important for understanding their function in depth. Therefore, we have developed "MSLVP", a two-tier prediction algorithm for predicting multiple SCLs of viral proteins. For this study, comprehensive data sets of viral proteins with experimentally validated SCL annotation were collected from UniProt. A non-redundant data set (90% sequence identity) of 3480 viral proteins, belonging to single (2715), double (391) and multiple (374) sites, was employed. Additionally, 1687 viral proteins (30% sequence identity) were categorised into single (1366), double (167) and multiple (154) sites. Single, double and multiple locations further comprised eight, four and six categories, respectively. Viral protein locations include the nucleus, cytoplasm, endoplasmic reticulum, extracellular, single-pass membrane, multi-pass membrane, capsid, remaining others and combinations thereof. Support vector machine based models were developed using sequence features such as amino acid composition, dipeptide composition, physicochemical properties and their hybrids. We employed "one-versus-one" as well as "one-versus-other" strategies for multiclass classification. The "one-versus-one" approach performed better than the "one-versus-other" approach during 10-fold cross-validation. For the 90% data set, we achieved an accuracy, a Matthews correlation coefficient (MCC) and a receiver operating characteristic (ROC) value of 99.99%, 1.00, 1.00; 100.00%, 1.00, 1.00; and 99.90%, 1.00, 1.00 for single, double and multiple locations, respectively. Similar results were achieved for the 30% sequence identity data set. Predictive models for each SCL performed equally well on the independent data set. The MSLVP web server (http://bioinfo.imtech.res.in/manojk/mslvpred/) can predict subcellular locations, i.e. single (8; including single-pass and multi-pass membrane), double (4) and multiple (6). This would help in the functional annotation of viral proteins and in identifying potential drug targets.
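
The abstract names amino acid composition and dipeptide composition as sequence features but does not give their formulas. The sketch below assumes the commonly used definitions (percentage of each of the 20 standard residues, and percentage of each of the 400 ordered residue pairs). The function names, the handling of non-standard residues and the simple concatenation used for the "hybrid" vector are illustrative assumptions, not the authors' implementation.

```python
from itertools import product

# Assumption: only the 20 standard amino acids are used as features.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = [a + b for a, b in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def amino_acid_composition(seq):
    """Return 20 values: percentage of each standard residue in seq."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    # Non-standard residues (e.g. X) stay in the denominator, so the
    # values sum to less than 100 when such residues are present.
    return [100.0 * seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return 400 values: percentage of each ordered residue pair in seq."""
    seq = seq.upper()
    n_pairs = len(seq) - 1
    if n_pairs < 1:
        raise ValueError("sequence must contain at least two residues")
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(n_pairs):
        pair = seq[i:i + 2]
        if pair in counts:  # pairs containing non-standard residues are skipped
            counts[pair] += 1
    return [100.0 * counts[dp] / n_pairs for dp in DIPEPTIDES]


def hybrid_features(seq):
    """Assumed hybrid: the two composition vectors concatenated (420 values)."""
    return amino_acid_composition(seq) + dipeptide_composition(seq)
```

A vector of this kind, one per protein, could then be passed to any support vector machine implementation. The kernel, parameters and multiclass handling used by MSLVP are not stated in the abstract and are not reproduced here.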
