Three approaches to supervised learning for compositional data with pairwise logratios

JOURNAL OF APPLIED STATISTICS (2023)

Journal

JOURNAL OF APPLIED STATISTICS

Volume 50, Issue 16, Pages 3272-3293

Publisher

TAYLOR & FRANCIS LTD

DOI: 10.1080/02664763.2022.2108007

Keywords

Compositional data; logratios; generalized linear modelling; variable selection; stepwise regression

Category

Statistics & Probability

Summary

This article presents three alternative stepwise supervised learning methods to select the pairwise logratios that best explain a dependent variable in a generalized linear model. The first method allows unrestricted search, leading to the most accurate predictions. The second method restricts each part to occur only once, making the corresponding logratios intuitively interpretable. The third method uses additive logratios, so that K-1 selected logratios involve a K-part subcomposition.

Abstract

Logratios between pairs of compositional parts (pairwise logratios) are the easiest to interpret in compositional data analysis, and include the well-known additive logratios as particular cases. When the number of parts is large (sometimes even larger than the number of cases), some form of logratio selection is needed. In this article, we present three alternative stepwise supervised learning methods to select the pairwise logratios that best explain a dependent variable in a generalized linear model, each geared for a specific problem. The first method features unrestricted search, where any pairwise logratio can be selected. This method has a complex interpretation if some pairs of parts in the logratios overlap, but it leads to the most accurate predictions. The second method restricts parts to occur only once, which makes the corresponding logratios intuitively interpretable. The third method uses additive logratios, so that K-1 selected logratios involve a K-part subcomposition. Our approach allows logratios or non-compositional covariates to be forced into the models based on theoretical knowledge, and various stopping criteria are available based on information measures or statistical significance with the Bonferroni correction. We present an application on a dataset from a study predicting Crohn's disease.
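
The three search strategies described in the abstract can be illustrated with a minimal sketch of forward stepwise selection. This is not the authors' implementation. It assumes a Gaussian response with identity link (ordinary least squares) in place of a general generalized linear model, uses BIC as the only stopping criterion, and requires strictly positive parts. The names `bic_gaussian`, `allowed`, and `stepwise_logratios` and the mode labels are hypothetical, and forcing logratios or covariates into the model is not covered.

```python
import itertools
import numpy as np


def bic_gaussian(y, X):
    """BIC (up to an additive constant) of a least-squares fit with intercept."""
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    k = design.shape[1] + 1  # regression coefficients plus error variance
    return n * np.log(rss / n) + k * np.log(n)


def allowed(pair, chosen, mode):
    """Decide whether a candidate pair of parts may enter under the given mode."""
    if pair in chosen:
        return False
    if mode == "unrestricted":
        # Any pairwise logratio not yet in the model is eligible.
        return True
    if mode == "once":
        # Each part may appear in at most one selected logratio.
        used = {part for c in chosen for part in c}
        return used.isdisjoint(pair)
    if mode == "alr":
        # All selected logratios must share one common reference part.
        if not chosen:
            return True
        common = set.intersection(*map(set, chosen))
        return bool(common & set(pair))
    raise ValueError(f"unknown mode: {mode}")


def stepwise_logratios(parts, y, mode="unrestricted", max_steps=None):
    """Greedy forward selection of pairwise logratios log(x_i / x_j).

    parts: (n, D) array of strictly positive compositional parts.
    y:     (n,) response vector.
    Returns the list of selected (i, j) index pairs and the final BIC.
    """
    logp = np.log(parts)
    n, d = logp.shape
    candidates = list(itertools.combinations(range(d), 2))
    chosen = []
    X = np.empty((n, 0))
    best = bic_gaussian(y, X)  # intercept-only model
    while max_steps is None or len(chosen) < max_steps:
        scored = []
        for i, j in candidates:
            if allowed((i, j), chosen, mode):
                col = logp[:, i] - logp[:, j]
                scored.append((bic_gaussian(y, np.column_stack([X, col])), (i, j)))
        if not scored:
            break
        score, pair = min(scored)
        if score >= best:  # stop when no candidate improves BIC
            break
        best = score
        chosen.append(pair)
        X = np.column_stack([X, logp[:, pair[0]] - logp[:, pair[1]]])
    return chosen, best
```

Calling `stepwise_logratios(parts, y, mode="once")` returns logratios with non-overlapping parts, while `mode="alr"` returns K-1 logratios that share a reference part and therefore span a K-part subcomposition. Because the sign of a logratio only flips the sign of its coefficient, the sketch treats each pair as unordered.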
