Journal
BIOINFORMATICS
Volume 34, Issue 14, Pages 2371-2375
Publisher
OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/bty113
Motivation: The 16S ribosomal RNA (rRNA) gene is widely used to survey microbial communities. Sequences are often clustered into Operational Taxonomic Units (OTUs) as proxies for species. The canonical clustering threshold is 97% identity, which was proposed in 1994 when few 16S rRNA sequences were available, motivating a reassessment on current data.
Results: Using a large set of high-quality 16S rRNA sequences from finished genomes, I assessed the correspondence of OTUs to species for five representative clustering algorithms using four accuracy metrics. All algorithms had comparable accuracy when tuned to a given metric. Optimal identity thresholds were approximately 99% for full-length sequences and approximately 100% for the V4 hypervariable region.
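To make the role of the identity threshold concrete, here is a minimal sketch of greedy centroid-based OTU clustering on toy data. It is not one of the five algorithms assessed in the paper, which the abstract does not name. It assumes sequences are already aligned to equal length and defines identity as the fraction of matching positions. The function names `identity` and `greedy_otus` and the toy sequences are hypothetical.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions (assumes pre-aligned, equal-length sequences)."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(seqs: list[str], threshold: float) -> list[list[str]]:
    """Put each sequence in the first OTU whose centroid (first member)
    it matches at >= threshold; otherwise start a new OTU."""
    otus: list[list[str]] = []
    for s in seqs:
        for otu in otus:
            if identity(s, otu[0]) >= threshold:
                otu.append(s)
                break
        else:
            # No existing centroid was close enough, so s founds a new OTU.
            otus.append([s])
    return otus


# Toy 100-nt sequences: b differs from a at 2 positions, c at 5 positions.
a = "A" * 100
b = "C" * 2 + "A" * 98
c = "C" * 5 + "A" * 95

for t in (0.97, 0.99):
    print(t, len(greedy_otus([a, b, c], t)))
```

On this toy input the 97% threshold merges a and b into one OTU (98% identical), while the 99% threshold keeps all three sequences apart, which illustrates how raising the threshold splits clusters more finely.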