Comparing methods for estimation of heterogeneous treatment effects using observational data from health care databases

STATISTICS IN MEDICINE (2018)

Journal

STATISTICS IN MEDICINE

Volume 37, Issue 23, Pages 3309-3324

Publisher

WILEY

DOI: 10.1002/sim.7820

Keywords

health care databases; heterogeneous treatment effects; machine learning; propensity score; simulation

Categories

Mathematical & Computational Biology; Public, Environmental & Occupational Health; Medical Informatics; Medicine, Research & Experimental; Statistics & Probability

Funding

National Health and Medical Research Council of Australia (NHMRC) [1125414]


Abstract

There is growing interest in using routinely collected data from health care databases to study the safety and effectiveness of therapies in real-world conditions, as it can provide complementary evidence to that of randomized controlled trials. Causal inference from health care databases is challenging because the data are typically noisy, high dimensional, and most importantly, observational. It requires methods that can estimate heterogeneous treatment effects while controlling for confounding in high dimensions. Bayesian additive regression trees, causal forests, causal boosting, and causal multivariate adaptive regression splines are off-the-shelf methods that have shown good performance for estimation of heterogeneous treatment effects in observational studies of continuous outcomes. However, it is not clear how these methods would perform in health care database studies where outcomes are often binary and rare and data structures are complex. In this study, we evaluate these methods in simulation studies that recapitulate key characteristics of comparative effectiveness studies. We focus on the conditional average effect of a binary treatment on a binary outcome using the conditional risk difference as an estimand. To emulate health care database studies, we propose a simulation design where real covariate and treatment assignment data are used and only outcomes are simulated based on nonparametric models of the real outcomes. We apply this design to 4 published observational studies that used records from 2 major health care databases in the United States. Our results suggest that Bayesian additive regression trees and causal boosting consistently provide low bias in conditional risk difference estimates in the context of health care database studies.
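
The simulation idea in the abstract (keep covariates and treatment assignment fixed, simulate only the outcomes, then compare estimates with a known conditional risk difference) can be illustrated with a minimal sketch. This is not the authors' implementation: the covariates and treatment below are synthetic stand-ins for real database records, the outcome models `p0_fn` and `p1_fn` are hypothetical parametric functions rather than nonparametric fits to real outcomes, and the stratified estimator is a deliberately naive substitute for the methods the paper evaluates.

```python
import numpy as np

def simulate_outcomes(X, w, p0_fn, p1_fn, rng):
    """Keep covariates X and treatment w fixed; draw only the binary outcomes."""
    p0 = p0_fn(X)                      # P(Y=1 | W=0, X)
    p1 = p1_fn(X)                      # P(Y=1 | W=1, X)
    true_crd = p1 - p0                 # conditional risk difference, known by construction
    p_obs = np.where(w == 1, p1, p0)   # risk under the treatment actually received
    y = rng.binomial(1, p_obs)
    return y, true_crd

def stratified_crd(x_bin, w, y):
    """Naive CRD estimate within each level of one binary covariate."""
    est = {}
    for level in (0, 1):
        in_level = x_bin == level
        treated = y[in_level & (w == 1)].mean()
        control = y[in_level & (w == 0)].mean()
        est[level] = treated - control
    return est

rng = np.random.default_rng(0)
n = 50_000

# Synthetic stand-ins for real covariate and treatment data (assumption).
X = rng.integers(0, 2, size=(n, 2))
w = rng.binomial(1, 0.3 + 0.3 * X[:, 0])   # treatment probability depends on X[:, 0]

# Hypothetical outcome models: a fairly rare binary outcome whose
# treatment effect differs by X[:, 0] (0.02 vs 0.06 on the risk scale).
p0_fn = lambda X: 0.05 + 0.03 * X[:, 1]
p1_fn = lambda X: 0.05 + 0.03 * X[:, 1] + 0.02 + 0.04 * X[:, 0]

y, true_crd = simulate_outcomes(X, w, p0_fn, p1_fn, rng)
est = stratified_crd(X[:, 0], w, y)

for level in (0, 1):
    truth = true_crd[X[:, 0] == level].mean()
    print(f"X0={level}: true CRD={truth:.3f}, estimated CRD={est[level]:.3f}")
```

Because the true conditional risk difference is fixed by construction, any estimator can be scored for bias against it. In this toy setup treatment depends only on `X[:, 0]`, so stratifying on that covariate is enough to remove confounding; real database studies involve far more covariates, which is why the flexible methods named in the abstract are needed.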
