4.2 Article

Bag of little bootstraps for massive and distributed longitudinal data

Journal

STATISTICAL ANALYSIS AND DATA MINING
Volume 15, Issue 3, Pages 314-321

Publisher

WILEY
DOI: 10.1002/sam.11563

Keywords

bags of little bootstraps; big data; EMR; linear mixed models; longitudinal data; parallel and distributed computing

Funding

  1. Division of Mathematical Sciences [DMS-2054253]
  2. National Heart, Lung, and Blood Institute [HL150374]
  3. National Human Genome Research Institute [HG006139]
  4. National Institute of Diabetes and Digestive and Kidney Diseases [DK106116]
  5. National Institute of General Medical Sciences [GM141798]

The study introduces a highly efficient statistical method for analyzing very large longitudinal datasets and reports statistical and computational advantages over the traditional bootstrap.

Linear mixed models are widely used for analyzing longitudinal datasets, and inference for the variance component parameters relies on the bootstrap method. However, health systems and technology companies routinely generate massive longitudinal datasets that make the traditional bootstrap method infeasible. To solve this problem, we extend the highly scalable bag of little bootstraps method for independent data to longitudinal data and develop a highly efficient Julia package, MixedModelsBLB.jl. Simulation experiments and real data analysis demonstrate the favorable statistical performance and computational advantages of our method compared to the traditional bootstrap method. For statistical inference on variance components, our method achieves a 200-fold speedup at the scale of 1 million subjects (20 million total observations), and it is the only currently available tool that can handle more than 10 million subjects (200 million total observations) using desktop computers.
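
To make the idea behind the bag of little bootstraps more concrete, here is a minimal Python sketch of the method in its original setting of independent data, estimating the standard error of a mean. It is not the paper's longitudinal extension and not the MixedModelsBLB.jl API. The function names (`blb_standard_error`, `weighted_mean`) and the defaults (`gamma=0.6`, 5 subsets, 50 resamples) are illustrative assumptions. In a longitudinal setting, the resampled unit would presumably be the subject rather than the single observation, and the estimator would be a linear mixed model fit; both are assumptions here.

```python
import random
import statistics
from collections import Counter


def weighted_mean(values, weights):
    """Mean of `values` where each value is repeated `weights[i]` times."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def blb_standard_error(data, estimator, gamma=0.6, n_subsets=5,
                       n_resamples=50, seed=0):
    """Illustrative bag of little bootstraps for independent data.

    Each small subset of size b = n**gamma is resampled up to the full
    size n via multinomial counts, so the estimator only ever touches
    b distinct points. All parameter defaults are assumptions.
    """
    rng = random.Random(seed)
    n = len(data)
    b = max(2, int(n ** gamma))
    subset_ses = []
    for _ in range(n_subsets):
        # Draw a small subset without replacement.
        subset = rng.sample(data, b)
        estimates = []
        for _ in range(n_resamples):
            # Multinomial counts: n draws spread over the b subset points.
            counts = Counter(rng.choices(range(b), k=n))
            weights = [counts.get(i, 0) for i in range(b)]
            estimates.append(estimator(subset, weights))
        # Bootstrap standard error within this subset.
        subset_ses.append(statistics.stdev(estimates))
    # Average the per-subset assessments.
    return statistics.fmean(subset_ses)


if __name__ == "__main__":
    rng = random.Random(1)
    sample = [rng.gauss(0.0, 1.0) for _ in range(2000)]
    print(blb_standard_error(sample, weighted_mean))
```

Because each resample is represented by counts over only b distinct points, the subsets can be processed independently, which is what makes the approach suited to parallel and distributed computing.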
