4.7 Article

Bastion6: a bioinformatics approach for accurate prediction of type VI secreted effectors

Journal

BIOINFORMATICS
Volume 34, Issue 15, Pages 2546-2555

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/bty155

Funding

  1. National Health and Medical Research Council of Australia (NHMRC) [1092262]
  2. Australian Research Council (ARC)
  3. National Institute of Allergy and Infectious Diseases of the National Institutes of Health [R01 AI111965]
  4. Natural Science Foundation of Guangxi [2016GXNSFCA380005]
  5. NATIONAL INSTITUTE OF ALLERGY AND INFECTIOUS DISEASES [R01AI111965] Funding Source: NIH RePORTER

Motivation: Many Gram-negative bacteria use type VI secretion systems (T6SS) to export effector proteins into adjacent target cells. These secreted effectors (T6SEs) play vital roles in competitive survival within bacterial populations, as well as in bacterial pathogenesis. Although various computational analyses have previously been applied to identify effectors secreted by certain bacterial species, there is no universal method available to accurately predict T6SS effector proteins from the growing tide of bacterial genome sequence data.

Results: We extracted a wide range of features from T6SE protein sequences and comprehensively analyzed the prediction performance of these features through unsupervised and supervised learning. By integrating these features, we subsequently developed a two-layer SVM-based ensemble model with fine-grain optimized parameters to identify potential T6SEs. We further validated the predictive model using an independent dataset, on which the proposed model achieved an ACC of 0.943, an F-value of 0.946, an MCC of 0.892 and an AUC of 0.976. To demonstrate applicability, we employed this method to correctly identify two very recently validated T6SE proteins, which represent challenging prediction targets because they differed significantly from previously known T6SEs in terms of their sequence similarity and cellular function. Furthermore, a genome-wide prediction across 12 bacterial species, involving a total of 54 212 protein sequences, was carried out and identified 94 putative T6SE candidates. We envisage that both this information and our publicly accessible web server will facilitate future discoveries of novel T6SEs.
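For readers unfamiliar with the four measures quoted above, the following is a minimal sketch of how ACC, F-value, MCC and AUC are commonly computed for a binary classifier. It assumes that "F-value" means the standard F1 score and that AUC is the area under the ROC curve; the abstract does not spell out either definition. The function names `binary_metrics` and `auc_from_scores` and all input numbers are hypothetical illustrations, not the authors' code or data.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return ACC, F-value (assumed to be F1) and MCC from confusion-matrix counts."""
    total = tp + fp + tn + fn
    acc = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_value = (2 * precision * recall / (precision + recall)
               if (precision + recall) else 0.0)
    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "F-value": f_value, "MCC": mcc}


def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical counts and scores, chosen only to exercise the functions.
    print(binary_metrics(tp=45, fp=5, tn=40, fn=10))
    print(auc_from_scores([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

The pairwise AUC above is quadratic in the number of samples, which is fine for a small independent test set; a rank-based formulation would be preferable for genome-scale score lists.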
