Article

Rapid annotation of nifH gene sequences using classification and regression trees facilitates environmental functional gene analysis

Journal

ENVIRONMENTAL MICROBIOLOGY REPORTS
Volume 8, Issue 5, Pages 905-916

Publisher

WILEY
DOI: 10.1111/1758-2229.12455

Funding

  1. Gordon and Betty Moore Foundation Marine Investigator award
  2. National Science Foundation Science Center for Microbial Oceanography Research and Education (C-MORE) [EF-0424599]

The nifH gene is a widely used molecular proxy for studying nitrogen fixation. Phylogenetic classification of nifH gene sequences is an essential step in diazotroph community analysis, and it requires a fast, automated solution because environmental sequence libraries are growing and high-throughput technologies yield ever more nifH sequences. A novel approach is presented that rapidly classifies nifH amino acid sequences into well-defined phylogenetic clusters and provides a common platform for comparative analysis across studies. Phylogenetic group membership can be accurately predicted with decision tree-type statistical models that identify and utilize signature residues in the amino acid sequences. Our classification models were trained and evaluated with a publicly available and manually curated nifH gene database containing cluster annotations. Model-independent sequence sets from diverse ecosystems were used for further assessment of the models' prediction accuracy. The utility of this novel sequence binning approach was demonstrated in a comparative study in which joint treatment of diazotroph assemblages from a wide range of habitats identified habitat-specific and widely distributed diazotrophs and revealed a marine-terrestrial distinction in community composition. Our rapid and automated phylogenetic cluster assignment circumvents extensive phylogenetic analysis of nifH sequences; hence, it saves substantial time and resources in nitrogen fixation studies.
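The idea of predicting cluster membership from signature residues can be illustrated with a minimal sketch. This is not the authors' implementation: it is a small CART-style tree written from scratch, assuming pre-aligned sequences of equal length, splits of the form "residue at position p equals r", and Gini impurity as the split criterion. The six-residue fragments and the cluster labels `A`, `B`, `C` are synthetic placeholders invented for the example, and every function name is hypothetical.

```python
from collections import Counter

# Synthetic, pre-aligned toy fragments (NOT real nifH data); labels are placeholders.
TRAIN = [
    ("ACDKGT", "A"), ("ACDKGS", "A"), ("ACEKGT", "A"),
    ("GCDRGT", "B"), ("GCDRGS", "B"), ("GCERGT", "B"),
    ("ACDRLT", "C"), ("ACERLS", "C"), ("ACDRLS", "C"),
]

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(seqs, labels):
    """Return the (position, residue) test with the lowest weighted Gini
    impurity, or None if no test improves on the unsplit node."""
    best, best_score = None, gini(labels)
    n = len(seqs)
    for pos in range(len(seqs[0])):
        for res in sorted({s[pos] for s in seqs}):
            yes = [l for s, l in zip(seqs, labels) if s[pos] == res]
            no = [l for s, l in zip(seqs, labels) if s[pos] != res]
            if not yes or not no:
                continue  # test does not separate anything
            score = (len(yes) * gini(yes) + len(no) * gini(no)) / n
            if score < best_score - 1e-12:
                best, best_score = (pos, res), score
    return best

def build_tree(seqs, labels):
    """Recursively grow a binary tree; leaves hold the majority label."""
    split = best_split(seqs, labels)
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    pos, res = split
    yes = [(s, l) for s, l in zip(seqs, labels) if s[pos] == res]
    no = [(s, l) for s, l in zip(seqs, labels) if s[pos] != res]
    return {
        "pos": pos,
        "res": res,
        "yes": build_tree([s for s, _ in yes], [l for _, l in yes]),
        "no": build_tree([s for s, _ in no], [l for _, l in no]),
    }

def predict(tree, seq):
    """Walk the tree, following the residue tests, until a leaf label is reached."""
    while isinstance(tree, dict):
        tree = tree["yes"] if seq[tree["pos"]] == tree["res"] else tree["no"]
    return tree

tree = build_tree([s for s, _ in TRAIN], [l for _, l in TRAIN])
for query in ("ACEKGS", "GCERGS", "ACERLT"):  # fragments not in TRAIN
    print(query, "->", predict(tree, query))
```

In this toy setting the tree needs only two residue tests to separate the three placeholder clusters, which is the sense in which a few signature positions can stand in for a full phylogenetic analysis. A real application would also need held-out evaluation, as the abstract describes.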
