4.4 Article

TESTING FOR DIFFERENTIAL ABUNDANCE IN COMPOSITIONAL COUNTS DATA, WITH APPLICATION TO MICROBIOME STUDIES

Journal

ANNALS OF APPLIED STATISTICS
Volume 16, Issue 4, Pages 2648-2671

Publisher

INST MATHEMATICAL STATISTICS-IMS
DOI: 10.1214/22-AOAS1607

Keywords

Compositional bias; analysis of composition; normalization; rarefaction; nonparametric tests

Funding

  1. ISF [1049/16]


Identifying the microbiota taxa associated with traits of interest is crucial for advancing science and health. However, this task is challenging due to the compositional nature of the taxa counts and the sparsity of the data. This study focuses on Crohn's disease and shows that existing methods may produce a high number of false positives when identifying differentially abundant taxa. A novel nonparametric approach is introduced, which provides valid inference even with a substantial fraction of zero counts.
Identifying which taxa in our microbiota are associated with traits of interest is important for advancing science and health. However, the identification is challenging because the measured vector of taxa counts (by amplicon sequencing) is compositional, so a change in the abundance of one taxon in the microbiota induces a change in the number of sequenced counts across all taxa. The data are typically sparse, with many zero counts present either due to biological variance or limited sequencing depth. We examine the case of Crohn's disease, where the microbial load changes substantially with the disease. For this representative example of a highly compositional setting, we show existing methods designed to identify differentially abundant taxa may have an inflated number of false positives. We introduce a novel nonparametric approach that provides valid inference, even when the fraction of zero counts is substantial. Our approach uses a set of reference taxa that are non-differentially abundant which can be estimated from the data or from outside information. Our approach also allows for a novel type of testing: multivariate tests of differential abundance over a focused subset of the taxa. Genera-level multivariate testing discovers additional genera as differentially abundant by avoiding agglomeration of taxa.
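
The compositional effect described in the abstract can be illustrated with a small toy calculation. The sketch below is not the authors' procedure: the abundances, the sequencing depth, the choice of reference taxa, and the helper names `expected_counts` and `reference_ratio` are all assumptions made for illustration. It shows that when only one taxon changes in absolute abundance, the expected counts of every other taxon shift as well, while ratios to a set of unchanged reference taxa stay constant for the taxa that did not change.

```python
import numpy as np

def expected_counts(absolute_abundance, depth):
    """Expected sequenced counts at a fixed depth (hypothetical helper).

    Each taxon receives a share of the reads equal to its share of the
    total microbial load, which is what makes the counts compositional.
    """
    absolute_abundance = np.asarray(absolute_abundance, dtype=float)
    return depth * absolute_abundance / absolute_abundance.sum()

def reference_ratio(counts, reference_idx):
    """Each taxon's counts divided by the summed counts of the reference taxa."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts[reference_idx].sum()

# Assumed absolute abundances for four taxa in two conditions.
# Only taxon 0 differs between the conditions (a five-fold increase).
healthy = np.array([100.0, 200.0, 300.0, 400.0])
disease = healthy.copy()
disease[0] *= 5

depth = 10_000  # assumed fixed sequencing depth
counts_healthy = expected_counts(healthy, depth)
counts_disease = expected_counts(disease, depth)

# Taxa 1-3 did not change in absolute terms, yet their expected counts drop.
count_fold_change = counts_disease / counts_healthy

# Assume taxa 2 and 3 are known to be non-differentially abundant.
reference_idx = [2, 3]
ratio_healthy = reference_ratio(counts_healthy, reference_idx)
ratio_disease = reference_ratio(counts_disease, reference_idx)

# Relative to the reference set, only taxon 0 shows a change.
ratio_fold_change = ratio_disease / ratio_healthy

print("fold change in raw expected counts:", np.round(count_fold_change, 3))
print("fold change relative to references:", np.round(ratio_fold_change, 3))
```

In this toy setting a comparison of raw counts would flag all four taxa as changed, whereas the reference-based ratios isolate the single taxon whose absolute abundance differs. Real data add sampling noise and zero counts, which this deterministic sketch deliberately leaves out.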
