Article

Boosting multivariate structured additive distributional regression models

Journal

STATISTICS IN MEDICINE

Publisher

WILEY
DOI: 10.1002/sim.9699

Keywords

generalized additive models for location, scale and shape; model-based boosting; multivariate Gaussian distribution; multivariate logit model; multivariate Poisson distribution; semiparametric regression

Within the framework of generalized additive models for location, scale, and shape, we have developed a model-based boosting approach for multivariate distributional regression, which allows for simultaneous modeling of all distribution parameters of a multivariate response conditional on explanatory variables. It is applicable to potentially high-dimensional data and incorporates data-driven variable selection. The approach also enables modeling the association between multiple continuous or discrete outcomes through relevant covariates.
We develop a model-based boosting approach for multivariate distributional regression within the framework of generalized additive models for location, scale, and shape. Our approach enables the simultaneous modeling of all distribution parameters of an arbitrary parametric distribution of a multivariate response conditional on explanatory variables, while being applicable to potentially high-dimensional data. Moreover, the boosting algorithm incorporates data-driven variable selection, taking various types of effects into account. A special merit of our approach is that it allows for modeling the association between multiple continuous or discrete outcomes through the relevant covariates. After a detailed simulation study investigating estimation and prediction performance, we demonstrate the full flexibility of our approach in three diverse biomedical applications. The first is based on high-dimensional genomic cohort data from the UK Biobank, considering a bivariate binary response (chronic ischemic heart disease and high cholesterol). Here, we are able to identify genetic variants that are informative for the association between cholesterol and heart disease. The second application considers the demand for health care in Australia, with the number of consultations and the number of prescribed medications as a bivariate count response. The third application analyzes two dimensions of childhood undernutrition in Nigeria as a bivariate response; we find that the correlation between the two undernutrition scores differs considerably depending on the child's age and the region the child lives in.
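
The data-driven variable selection mentioned above comes from the component-wise nature of model-based boosting: in each iteration only the single best-fitting base-learner is updated, so covariates that are never chosen stay out of the model. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a much simpler setting than the paper: one univariate response, a squared-error loss, only the mean is modeled, and each base-learner is a simple linear effect. The function name `componentwise_l2_boost`, the step length `nu`, the number of iterations `n_iter`, and the toy data are all hypothetical choices made for illustration. In the multivariate distributional setting the same selection step would be applied to every distribution parameter, including the association parameter.

```python
import random


def componentwise_l2_boost(X, y, n_iter=100, nu=0.1):
    """Component-wise L2 boosting with one linear base-learner per covariate.

    X is a list of rows, y a list of responses. Returns (intercept, coef),
    where coef refers to the column-centered covariates.
    Illustrative sketch only; names and defaults are assumptions.
    """
    n, p = len(X), len(X[0])
    # Center each column so the offset can stay fixed at mean(y).
    means = [sum(row[j] for row in X) / n for j in range(p)]
    cols = [[row[j] - means[j] for row in X] for j in range(p)]
    sumsq = [sum(v * v for v in col) for col in cols]

    intercept = sum(y) / n
    coef = [0.0] * p
    fit = [intercept] * n

    for _ in range(n_iter):
        # Negative gradient of the squared-error loss = current residuals.
        resid = [yi - fi for yi, fi in zip(y, fit)]
        best_j, best_beta, best_rss = -1, 0.0, float("inf")
        for j in range(p):
            if sumsq[j] == 0.0:
                continue  # constant column, nothing to fit
            # Least-squares fit of the residuals on covariate j alone.
            beta = sum(c * r for c, r in zip(cols[j], resid)) / sumsq[j]
            rss = sum((r - beta * c) ** 2 for c, r in zip(cols[j], resid))
            if rss < best_rss:
                best_j, best_beta, best_rss = j, beta, rss
        if best_j < 0:
            break
        # Update only the best-fitting component, damped by the step length.
        coef[best_j] += nu * best_beta
        fit = [f + nu * best_beta * c for f, c in zip(fit, cols[best_j])]
    return intercept, coef


# Toy data (invented for illustration): only columns 0 and 2 are informative.
rng = random.Random(1)
X = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(200)]
y = [2.0 * row[0] - 1.5 * row[2] + rng.gauss(0.0, 0.5) for row in X]

intercept, coef = componentwise_l2_boost(X, y)
selected = [j for j, b in enumerate(coef) if b != 0.0]
print("coefficients:", [round(b, 3) for b in coef])
print("selected covariates:", selected)
```

In this sketch the number of iterations acts as the main tuning parameter: stopping early keeps the coefficients shrunken and the set of selected covariates small, which is why boosting remains usable when there are many more candidate covariates than in this toy example. In practice the stopping iteration would typically be chosen by cross-validation or a similar resampling scheme.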
