4.6 Article

Microbial Diversity Biased Estimation Caused by Intragenomic Heterogeneity and Interspecific Conservation of 16S rRNA Genes

Journal

Publisher

AMER SOC MICROBIOLOGY
DOI: 10.1128/aem.02108-22

Keywords

16S rRNA gene; interspecific conservation; intragenomic heterogeneity; microbial diversity


The 16S rRNA gene is widely used as a marker to study evolutionary relationships and microbial composition. However, its variable copy number, intragenomic heterogeneity, and low taxonomic resolution bias estimates of microbial diversity. This study analyzed prokaryotic genomes and found that the 16S rRNA gene copy number ranged from 1 to 37 in bacteria and 1 to 5 in archaea, with intragenomic heterogeneity in 60% of genomes. The study also quantified how much microbial diversity is overestimated or underestimated when different regions of the 16S rRNA gene are used.
The 16S rRNA gene has been extensively used as a molecular marker to explore evolutionary relationships and profile microbial composition throughout various environments. Despite its convenience and prevalence, limitations are inevitable. Variable copy numbers, intragenomic heterogeneity, and low taxonomic resolution have caused biases in estimating microbial diversity. Here, analysis of 24,248 complete prokaryotic genomes indicated that the 16S rRNA gene copy number ranged from 1 to 37 in bacteria and 1 to 5 in archaea, and intragenomic heterogeneity was observed in 60% of prokaryotic genomes, most of which were below 1%. The overestimation of microbial diversity caused by intragenomic variation and the underestimation introduced by interspecific conservation were calculated when using full-length or partial 16S rRNA genes. Results showed that, at the 100% threshold, microbial diversity could be overestimated by as much as 156.5% when using the full-length gene. The V4 to V5 region-based analyses introduced the lowest overestimation rate (4.4%) but exhibited slightly lower species resolution than other variable regions under the 97% threshold. For different variable regions, appropriate thresholds rather than the canonical value 97% were proposed for minimizing the risk of splitting a single genome into multiple clusters and lumping together different species into the same cluster. This study has not only updated the 16S rRNA gene copy number and intragenomic variation information for the currently available prokaryotic genomes, but also elucidated the biases in estimating prokaryotic diversity with quantitative data, providing references for choosing amplified regions and clustering thresholds in microbial community surveys.
IMPORTANCE Microbial diversity is typically analyzed using marker gene-based methods, of which 16S rRNA gene sequencing is the most widely used approach. However, obtaining an accurate estimation of microbial diversity remains a challenge, due to the intragenomic variation and low taxonomic resolution of 16S rRNA genes. Comprehensive examination of the bias in estimating such prokaryotic diversity using 16S rRNA genes within ever-increasing prokaryotic genomes highlights the importance of the choice of sequencing regions and clustering thresholds based on the specific research objectives.
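The kind of overestimation described in the abstract can be illustrated with a small sketch. It is not the authors' pipeline: the abstract does not give the exact formula, so the rate used here, (clusters − genomes) / genomes × 100, is an assumption, as are the positional identity on pre-aligned sequences, the single-linkage clustering, and all function names and toy sequences.

```python
from itertools import combinations


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def count_clusters(seqs: list[str], threshold: float) -> int:
    """Single-linkage clustering with union-find: link pairs with identity >= threshold."""
    parent = list(range(len(seqs)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(seqs)), 2):
        if identity(seqs[i], seqs[j]) >= threshold:
            parent[find(i)] = find(j)
    return len({find(i) for i in range(len(seqs))})


def overestimation_rate(genomes: dict[str, list[str]], threshold: float) -> float:
    """Assumed definition: percentage of extra clusters produced when the 16S copies
    of each genome are clustered separately, relative to the number of genomes."""
    n_genomes = len(genomes)
    n_clusters = sum(count_clusters(copies, threshold) for copies in genomes.values())
    return 100.0 * (n_clusters - n_genomes) / n_genomes


# Toy data (hypothetical): genome g1 has two identical copies,
# genome g2 has three copies that differ at one or two positions.
genomes = {
    "g1": ["ACGTACGTAC", "ACGTACGTAC"],
    "g2": ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAT"],
}
print(overestimation_rate(genomes, 1.0))  # strict threshold splits g2 into three clusters
print(overestimation_rate(genomes, 0.9))  # looser threshold merges each genome's copies
```

In this toy case the strict threshold counts four clusters for two genomes, while the looser one recovers one cluster per genome, which mirrors the trade-off between splitting a genome and lumping species that the abstract describes.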
