4.6 Article

Random forests for high-dimensional longitudinal data

Journal

STATISTICAL METHODS IN MEDICAL RESEARCH
Volume 30, Issue 1, Pages 166-184

Publisher

SAGE PUBLICATIONS LTD
DOI: 10.1177/0962280220946080

Keywords

Stochastic mixed effects model; tree-based methods; high-dimensional data; repeated measurements

Random forests are a state-of-the-art supervised machine learning method that performs well in high-dimensional settings. The proposed method, which accounts for intra-individual covariance when building the forest, gave consistent results when applied to an HIV vaccine trial.
Random forests are one of the state-of-the-art supervised machine learning methods and achieve good performance in high-dimensional settings where p, the number of predictors, is much larger than n, the number of observations. Repeated measurements generally provide additional information, so they are worth accounting for, especially when analyzing high-dimensional data. Tree-based methods have already been adapted to clustered and longitudinal data by using a semi-parametric mixed effects model, in which the non-parametric part is estimated using regression trees or random forests. We propose a general approach of random forests for high-dimensional longitudinal data. It includes a flexible stochastic model which allows the covariance structure to vary over time. Furthermore, we introduce a new method which takes intra-individual covariance into consideration to build random forests. Through simulation experiments, we then study the behavior of different estimation methods, especially in the context of high-dimensional data. Finally, the proposed method has been applied to an HIV vaccine trial including 17 HIV-infected patients with 10 repeated measurements of 20,000 gene transcripts and blood concentration of human immunodeficiency virus RNA. The approach selected 21 gene transcripts for which the association with HIV viral load was fully relevant and consistent with results observed during primary infection.
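
The semi-parametric stochastic mixed effects model mentioned in the abstract can be written out as a sketch. The abstract gives no formula, so the notation below is an assumption chosen for illustration: i indexes individuals, j indexes repeated measurements, f is the non-parametric mean function estimated by trees or forests, Z_{ij} b_i is the random-effects part, and \omega_i is a zero-mean stochastic process that lets the covariance vary over time.

```latex
% Illustrative notation only; the symbols are assumptions, not taken from the abstract.
\[
  Y_{ij} = f(X_{ij}) + Z_{ij}\, b_i + \omega_i(t_{ij}) + \varepsilon_{ij},
  \qquad i = 1, \dots, n, \quad j = 1, \dots, n_i ,
\]
\[
  b_i \sim \mathcal{N}(0, B), \qquad
  \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2), \qquad
  \omega_i \ \text{a zero-mean Gaussian process, independent of } b_i \text{ and } \varepsilon_{ij}.
\]
```

Under this reading, dropping \omega_i gives the usual mixed effects forest for clustered data, and the high-dimensional setting corresponds to X_{ij} having p components with p much larger than n.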
