Article

A Bayesian method for identifying associations between response variables and bacterial community composition

Journal

PLOS COMPUTATIONAL BIOLOGY
Volume 18, Issue 7

Publisher

PUBLIC LIBRARY OF SCIENCE
DOI: 10.1371/journal.pcbi.1010108

We developed a Bayesian regression model (BRACoD) to determine associations between intestinal bacteria and physiological measurements. The algorithm corrects for the compositional nature of the data and provides a list of associations. Simulation experiments showed that adopting an inclusion probability cut point of >= 0.3 optimized the true positive rate.
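
As a rough illustration of what applying such a cut point involves, the sketch below thresholds a vector of inclusion probabilities and, when the true contributors are known (as in a simulation), computes the true and false positive rates. This is not BRACoD's own interface: the function names `select_taxa` and `tpr_fpr` are hypothetical, and the probabilities are made-up values, not results from the paper.

```python
import numpy as np


def select_taxa(inclusion_probs, cut_point=0.3):
    """Return indices of taxa whose inclusion probability is >= cut_point."""
    probs = np.asarray(inclusion_probs, dtype=float)
    return np.flatnonzero(probs >= cut_point)


def tpr_fpr(selected, true_contributors, n_taxa):
    """Compute (TPR, FPR) for a selection against known contributors.

    Assumes at least one true contributor and one non-contributor,
    so neither denominator is zero.
    """
    selected_mask = np.zeros(n_taxa, dtype=bool)
    selected_mask[np.asarray(selected, dtype=int)] = True
    truth_mask = np.zeros(n_taxa, dtype=bool)
    truth_mask[np.asarray(true_contributors, dtype=int)] = True

    true_pos = np.sum(selected_mask & truth_mask)
    false_pos = np.sum(selected_mask & ~truth_mask)
    return true_pos / truth_mask.sum(), false_pos / (~truth_mask).sum()


# Hypothetical inclusion probabilities for six taxa (illustrative only).
p_hat = [0.92, 0.05, 0.31, 0.28, 0.70, 0.02]
selected = select_taxa(p_hat, cut_point=0.3)

# Hypothetical ground truth, as would be available in a simulation.
tpr, fpr = tpr_fpr(selected, true_contributors=[0, 3, 4], n_taxa=len(p_hat))
print(selected, tpr, fpr)
```

Raising the cut point trades sensitivity for fewer false positives. Sweeping `cut_point` over a grid of values in a simulation is one way to see where that trade-off settles.
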
Determining associations between intestinal bacteria and continuously measured physiological outcomes is important for understanding the bacteria-host relationship, but it is not straightforward because abundance data (compositional data) are not normally distributed. To address this issue, we developed a fully Bayesian linear regression model (BRACoD; Bayesian Regression Analysis of Compositional Data) that models physiological measurements (continuous data) as a function of a matrix of relative bacterial abundances. Bacteria can be classified as operational taxonomic units or by taxonomy (genus, family, etc.). Bacteria associated with the physiological measurement were identified using a Bayesian variable selection method, Stochastic Search Variable Selection. The output is a list of inclusion probabilities ($\hat{p}$) and coefficients that indicate the strength of the association ($\hat{\beta}_{\text{included}}$) for each bacterial taxon.

Tests with simulated communities showed that adopting a cut point of $\hat{p}$ >= 0.3 for identifying included bacteria optimized the true positive rate (TPR) while maintaining a false positive rate (FPR) of <= 5%. At this cut point, the chance of identifying non-contributing bacteria was low and all well-established contributors were included. Comparison with other methods showed that BRACoD (at $\hat{p}$ >= 0.3) had higher precision and a higher TPR than a commonly used centered log-ratio transformed LASSO procedure (clr-LASSO), as well as a higher TPR than an off-the-shelf Spike and Slab method after centered log-ratio transformation (clr-SS). BRACoD was also less likely to include non-contributing bacteria that merely correlate with contributing bacteria. Analysis of a rat microbiome experiment identified 47 operational taxonomic units that contributed to fecal butyrate levels. Of these, 31 were positively and 16 negatively associated with butyrate. Consistent with their known role in butyrate metabolism, most of these fell within the Lachnospiraceae and Ruminococcaceae.

We conclude that BRACoD provides a more precise and accurate method for determining bacteria associated with a continuous physiological outcome than clr-LASSO. It is more sensitive than a generalized clr-SS algorithm, although it has a higher FPR. Its ability to distinguish genuine contributors from correlated bacteria makes it better suited to discriminating bacteria that directly contribute to an outcome. The algorithm corrects for the distortions arising from compositional data, making it appropriate for analysis of microbiome data.

We present a fully Bayesian linear regression model to identify associations between physiological measurements (continuous data) and intestinal bacteria (relative bacterial abundances; compositional data). The BRACoD (Bayesian Regression Analysis of Compositional Data) algorithm corrects for the compositional nature of the bacterial data to provide a list of inclusion probabilities and regression coefficients that indicate the strength of the association. If desired, the user can specify a cut point to select only the bacteria that meet predetermined performance characteristics. Analysis of a simulated dataset based on data from a rat microbiome study indicated that an inclusion probability cut point value >= 0.3 minimized the false positive rate while maintaining a reasonably high sensitivity (true positive rate). The identified associations form a starting point for generating hypotheses about the relationship between the gut microbial community and physiological outcomes.
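
The clr-based baselines mentioned above start from a centered log-ratio transform of the abundance table. The following is a minimal sketch of that transform only, assuming a samples-by-taxa count matrix. The function name `clr` and the default pseudocount of 1.0 (used to avoid taking the logarithm of zero) are assumptions made here for illustration and are not taken from the paper or from BRACoD.

```python
import numpy as np


def clr(counts, pseudocount=1.0):
    """Centered log-ratio transform of a samples-by-taxa matrix.

    Each row is converted to relative abundances, log-transformed,
    and centered by subtracting the row mean of the logs.
    The pseudocount is an assumed way of handling zero counts.
    """
    x = np.asarray(counts, dtype=float) + pseudocount
    rel = x / x.sum(axis=1, keepdims=True)  # relative abundances per sample
    log_rel = np.log(rel)
    return log_rel - log_rel.mean(axis=1, keepdims=True)


# Hypothetical count table: two samples, three taxa.
example_counts = np.array([[10.0, 20.0, 70.0],
                           [5.0, 5.0, 90.0]])
transformed = clr(example_counts, pseudocount=0.0)
print(transformed.sum(axis=1))  # each row sums to zero up to rounding error
```

Because every row of the transformed matrix sums to zero, the columns are linearly dependent. This is one reason regression on clr-transformed data needs regularization or variable selection, as in the LASSO and Spike and Slab baselines.
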