4.7 Article

BEExact: a Metataxonomic Database Tool for High-Resolution Inference of Bee-Associated Microbial Communities

Journal

MSYSTEMS
Volume 6, Issue 2

Publisher

AMER SOC MICROBIOLOGY
DOI: 10.1128/mSystems.00082-21

Keywords

microbiota; bees; 16S rRNA gene sequencing; microbial ecology; bioinformatics; host-microbe interactions; polymicrobial communities; microbial phylogenetics; endosymbionts; environmental microbiology; invertebrate-microbe interactions; microbial communities; taxonomy; metataxonomics

Funding

  1. W. Garfield Weston Foundation of Canada
  2. Natural Sciences and Engineering Research Council of Canada (NSERC)
  3. Ontario Ministry of Agriculture, Food and Rural Affairs (OMAFRA)

High-throughput 16S rRNA gene sequencing technologies have the potential to enhance understanding of bee-associated microbial communities. A new database, BEExact, was developed to allow accurate and standardized microbiota profiling. Data-driven recommendations were formulated to improve the utility and ecological relevance of routine 16S rRNA gene-based sequencing endeavors.
High-throughput 16S rRNA gene sequencing technologies have robust potential to improve our understanding of bee (Hymenoptera: Apoidea)-associated microbial communities and their impact on hive health and disease. Although recent computational algorithms now permit inference of high-resolution exact amplicon sequence variants (ASVs), the taxonomic classification of these ASVs remains a challenge due to inadequate reference databases. To address this, we assemble a comprehensive data set of all publicly available bee-associated 16S rRNA gene sequences, systematically annotate poorly resolved identities via inclusion of 618 placeholder labels for uncultivated microbial dark matter, and correct for phylogenetic inconsistencies using a complementary set of distance-based and maximum likelihood correction strategies. To benchmark the resultant database (BEExact), we compare its performance against all existing reference databases in silico using a variety of classifier algorithms that produce probabilistic confidence scores. We also validate realistic classification rates on an independent set of ~234 million short-read sequences derived from 32 studies encompassing 50 different bee types (36 eusocial and 14 solitary). Species-level classification rates on short-read ASVs range from 80 to 90% using BEExact (with ~20% due to bxid placeholder names), whereas only ~30% at best can be resolved with current universal databases. A series of data-driven recommendations is developed for future studies. We conclude that BEExact (https://github.com/bdaisley/BEExact) enables accurate and standardized microbiota profiling across a broad range of bee species. These two factors are of key importance to reproducibility and meaningful knowledge exchange within the scientific community, and together they can enhance the overall utility and ecological relevance of routine 16S rRNA gene-based sequencing endeavors.
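As an illustration of how a species-level classification rate of the kind reported above might be tallied, the following is a minimal Python sketch and not the authors' pipeline. The `Assignment` record, the `species_level_rates` function, the 0.8 confidence cutoff, the example taxa, and the assumption that placeholder labels can be recognized by the substring "bxid" are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Assignment:
    """One ASV with its best taxonomic hit (hypothetical record layout)."""
    asv_id: str
    species: Optional[str]  # None when no species-level label was assigned
    confidence: float       # classifier confidence score in [0, 1]


def species_level_rates(assignments: List[Assignment],
                        min_confidence: float = 0.8,
                        placeholder_tag: str = "bxid") -> dict:
    """Return the fraction of ASVs classified to species level and the
    fraction whose species label is a placeholder name.

    Both fractions use the total number of ASVs as the denominator.
    The cutoff and the placeholder tag are assumptions of this sketch.
    """
    total = len(assignments)
    if total == 0:
        return {"classified": 0.0, "placeholder": 0.0}

    # An ASV counts as classified only if it has a species label
    # and its confidence meets the cutoff.
    classified = [a for a in assignments
                  if a.species is not None and a.confidence >= min_confidence]

    # Placeholder labels are detected by substring match (assumed format).
    placeholder = [a for a in classified
                   if placeholder_tag in a.species.lower()]

    return {"classified": len(classified) / total,
            "placeholder": len(placeholder) / total}


# Toy input; the labels and scores are invented for illustration only.
example = [
    Assignment("asv1", "Snodgrassella alvi", 0.99),
    Assignment("asv2", "Gilliamella apicola", 0.95),
    Assignment("asv3", "Lactobacillus bxid0042", 0.90),  # hypothetical placeholder label
    Assignment("asv4", None, 0.0),                       # unresolved at species level
    Assignment("asv5", "Bartonella apis", 0.50),         # below the confidence cutoff
]

rates = species_level_rates(example)
print(rates)
```

Reporting the placeholder fraction separately, as the abstract does, makes it clear how much of the species-level resolution comes from named species versus provisional labels.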
IMPORTANCE The failure of current universal taxonomic databases to support the rapidly expanding field of bee microbiota research has led many investigators to rely on in-house reference sets or manual classification of sequence reads (usually based on BLAST searches), often with vague identity thresholds and subjective taxonomy choices. This time-consuming, error- and bias-prone process lacks standardization, cripples the potential for comparative cross-study analysis, and in many cases is likely to incorrectly sway study conclusions. BEExact is structured on and leverages several complementary bioinformatic techniques to enable refined inference of bee host-associated microbial communities without requiring any other methodological modifications. It also bridges the gap between current practical outcomes (i.e., phylotype-to-genus level constraints with 97% operational taxonomic units [OTUs]) and the theoretical resolution (i.e., species-to-strain level classification with 100% ASVs) attainable in future microbiota investigations. Other niche habitats could also likely benefit from customized database curation via implementation of the novel approaches introduced in this study.
