Journal
MSYSTEMS
Volume 6, Issue 2
Publisher
AMER SOC MICROBIOLOGY
DOI: 10.1128/mSystems.00082-21
Keywords
microbiota; bees; 16S rRNA gene sequencing; microbial ecology; bioinformatics; host-microbe interactions; polymicrobial communities; microbial phylogenetics; endosymbionts; environmental microbiology; invertebrate-microbe interactions; microbial communities; taxonomy; metataxonomics
Funding
- W. Garfield Weston Foundation of Canada
- Natural Sciences and Engineering Research Council of Canada (NSERC)
- Ontario Ministry of Agriculture, Food and Rural Affairs (OMAFRA)
High-throughput 16S rRNA gene sequencing technologies have the potential to enhance understanding of bee-associated microbial communities. A new database, BEExact, was developed to allow accurate and standardized microbiota profiling. Data-driven recommendations were formulated to improve the utility and ecological relevance of routine 16S rRNA gene-based sequencing endeavors.
High-throughput 16S rRNA gene sequencing technologies have robust potential to improve our understanding of bee (Hymenoptera: Apoidea)-associated microbial communities and their impact on hive health and disease. Although recent computational algorithms now permit exact inference of high-resolution amplicon sequence variants (ASVs), the taxonomic classification of these ASVs remains a challenge due to inadequate reference databases. To address this, we assemble a comprehensive data set of all publicly available bee-associated 16S rRNA gene sequences, systematically annotate poorly resolved identities via inclusion of 618 placeholder labels for uncultivated microbial dark matter, and correct for phylogenetic inconsistencies using a complementary set of distance-based and maximum likelihood correction strategies. To benchmark the resultant database (BEExact), we compare its performance against all existing reference databases in silico using a variety of classifier algorithms to produce probabilistic confidence scores. We also validate realistic classification rates on an independent set of ~234 million short-read sequences derived from 32 studies encompassing 50 different bee types (36 eusocial and 14 solitary). Species-level classification rates on short-read ASVs range from 80 to 90% using BEExact (with ~20% due to bxid placeholder names), whereas only ~30% at best can be resolved with current universal databases. A series of data-driven recommendations is developed for future studies. We conclude that BEExact (https://github.com/bdaisley/BEExact) enables accurate and standardized microbiota profiling across a broad range of bee species. These two factors are of key importance to reproducibility and meaningful knowledge exchange within the scientific community and together can enhance the overall utility and ecological relevance of routine 16S rRNA gene-based sequencing endeavors.
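To make the reported rates concrete, the following is a minimal sketch of how a species-level classification rate, and the share of it owed to placeholder names, could be tallied from a list of per-ASV species labels. It is not the authors' pipeline. The function name, the input format, and the rule that a placeholder is any label containing the substring "bxid" are assumptions made for illustration, and the example labels are invented.

```python
from typing import Dict, Optional, Sequence


def species_level_rates(
    species_labels: Sequence[Optional[str]],
    placeholder_tag: str = "bxid",  # assumed marker for placeholder names
) -> Dict[str, float]:
    """Return the fraction of ASVs with any species label, the fraction
    labelled with a placeholder name, and the fraction with a formal name.

    An ASV counts as unclassified at species level when its label is
    None or an empty string.
    """
    total = len(species_labels)
    if total == 0:
        raise ValueError("no ASVs supplied")

    classified = [label for label in species_labels if label]
    placeholders = [
        label for label in classified if placeholder_tag in label.lower()
    ]

    return {
        "species_rate": len(classified) / total,
        "placeholder_rate": len(placeholders) / total,
        "named_rate": (len(classified) - len(placeholders)) / total,
    }


if __name__ == "__main__":
    # Invented labels, one per ASV; the placeholder format is hypothetical.
    labels = [
        "Snodgrassella alvi",
        "Gilliamella bxid0001",
        None,
        "",
        "Lactobacillus apis",
    ]
    print(species_level_rates(labels))
```

Reporting the placeholder share separately keeps the headline rate comparable with databases that contain no placeholder names.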
IMPORTANCE The failure of current universal taxonomic databases to support the rapidly expanding field of bee microbiota research has led many investigators to rely on in-house reference sets or manual classification of sequence reads (usually based on BLAST searches), often with vague identity thresholds and subjective taxonomy choices. This time-consuming, error- and bias-prone process lacks standardization, cripples the potential for comparative cross-study analysis, and in many cases is likely to incorrectly sway study conclusions. BEExact is structured on and leverages several complementary bioinformatic techniques to enable refined inference of bee host-associated microbial communities without requiring any other methodological modifications. It also bridges the gap between current practical outcomes (i.e., phylotype-to-genus level constraints with 97% operational taxonomic units [OTUs]) and the theoretical resolution (i.e., species-to-strain level classification with 100% ASVs) attainable in future microbiota investigations. Other niche habitats could also likely benefit from customized database curation via implementation of the novel approaches introduced in this study.
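The difference between 97% OTUs and 100% ASVs can be illustrated with a toy comparison. The sketch below assumes two sequences that are already aligned and of equal length, and it uses a simple ungapped percent identity. Real OTU clustering and ASV inference tools work differently, so the function names and the example sequences are illustrative assumptions only.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def same_unit(seq_a: str, seq_b: str, threshold: float) -> bool:
    """True if the two sequences fall into one unit at this identity
    threshold (97.0 mimics an OTU cutoff, 100.0 an exact variant)."""
    return percent_identity(seq_a, seq_b) >= threshold


if __name__ == "__main__":
    # Invented 100-base sequences that differ at a single position.
    reference = "ACGT" * 25
    variant = "T" + reference[1:]

    print(percent_identity(reference, variant))
    print(same_unit(reference, variant, 97.0))   # merged at the OTU cutoff
    print(same_unit(reference, variant, 100.0))  # kept apart as exact variants
```

A single-base difference is invisible at the 97% cutoff but preserved at 100%, which is why exact variants can in principle support species-to-strain level classification when the reference database is detailed enough.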