Article

PhyloCorrelate: inferring bacterial gene-gene functional associations through large-scale phylogenetic profiling

Journal

BIOINFORMATICS
Volume 37, Issue 1, Pages 17-22

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btaa1105

Funding

  1. Natural Sciences and Engineering Research Council of Canada (NSERC) [RGPIN-201904266, RGPAS-2019-00004]

PhyloCorrelate is a computational framework for gene co-occurrence analysis that combines various co-occurrence metrics to optimize the analysis of gene associations in large phylogenomic databases, enabling gene function prediction.
Motivation: Statistical detection of co-occurring genes across genomes, known as 'phylogenetic profiling', is a powerful bioinformatic technique for inferring gene-gene functional associations. However, this can be a challenging task given the size and complexity of phylogenomic databases, difficulty in accounting for phylogenetic structure, inconsistencies in genome annotation and substantial computational requirements.

Results: We introduce PhyloCorrelate, a computational framework for gene co-occurrence analysis across large phylogenomic datasets. PhyloCorrelate implements a variety of co-occurrence metrics including standard correlation metrics and model-based metrics that account for phylogenetic history. By combining multiple metrics, we developed an optimized score that exhibits a superior ability to link genes with overlapping GO terms and KEGG pathways, enabling gene function prediction. Using genomic and functional annotation data from the Genome Taxonomy Database and AnnoTree, we performed all-by-all comparisons of gene occurrence profiles across the bacterial tree of life, totaling 154 217 052 comparisons for 28 315 genes across 27 372 bacterial genomes. All predictions are available in an online database, which instantaneously returns the top correlated genes for any PFAM, TIGRFAM or KEGG query. In total, PhyloCorrelate detected 29 762 high confidence associations between bacterial gene/protein pairs, and generated functional predictions for 834 DUFs and proteins of unknown function.
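To make the idea of phylogenetic profiling concrete, the following is a minimal sketch of an all-by-all comparison of gene presence/absence profiles using two standard metrics (Jaccard index and Pearson correlation). It is not PhyloCorrelate's implementation: the gene names, the toy profiles and the function names `jaccard`, `pearson` and `all_by_all` are assumptions made for illustration, and the model-based metrics that account for phylogenetic history are not covered here.

```python
from itertools import combinations
from math import sqrt

# Toy presence/absence profiles, one entry per genome (1 = gene present).
# Gene names and values are hypothetical, invented for illustration only.
profiles = {
    "geneA": [1, 1, 0, 0, 1, 0, 1, 0],
    "geneB": [1, 1, 0, 0, 1, 0, 0, 0],
    "geneC": [0, 0, 1, 1, 0, 1, 0, 1],
}


def jaccard(x, y):
    """Genomes containing both genes divided by genomes containing either."""
    both = sum(1 for a, b in zip(x, y) if a and b)
    either = sum(1 for a, b in zip(x, y) if a or b)
    return both / either if either else 0.0


def pearson(x, y):
    """Pearson correlation of two binary profiles (the phi coefficient)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a gene present in all or no genomes carries no signal
    return cov / sqrt(var_x * var_y)


def all_by_all(gene_profiles):
    """Score every unordered gene pair with both metrics."""
    results = []
    for g1, g2 in combinations(sorted(gene_profiles), 2):
        x, y = gene_profiles[g1], gene_profiles[g2]
        results.append((g1, g2, jaccard(x, y), pearson(x, y)))
    return results


if __name__ == "__main__":
    # Rank pairs by Pearson correlation, strongest co-occurrence first.
    for g1, g2, j, r in sorted(all_by_all(profiles), key=lambda t: -t[3]):
        print(f"{g1}\t{g2}\tjaccard={j:.3f}\tpearson={r:.3f}")
```

In this toy data, geneA and geneB share most of their genomes and score highly on both metrics, while geneC is the exact complement of geneA and scores as anti-correlated. Plain correlation metrics like these treat genomes as independent observations, which is the limitation the phylogeny-aware metrics mentioned in the abstract are meant to address.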
