Article

Comparing methods for estimation of heterogeneous treatment effects using observational data from health care databases

Journal

STATISTICS IN MEDICINE
Volume 37, Issue 23, Pages 3309-3324

Publisher

WILEY
DOI: 10.1002/sim.7820

Keywords

health care databases; heterogeneous treatment effects; machine learning; propensity score; simulation

Funding

  1. National Health and Medical Research Council of Australia (NHMRC) [1125414]


There is growing interest in using routinely collected data from health care databases to study the safety and effectiveness of therapies in real-world conditions, as it can provide complementary evidence to that of randomized controlled trials. Causal inference from health care databases is challenging because the data are typically noisy, high dimensional, and most importantly, observational. It requires methods that can estimate heterogeneous treatment effects while controlling for confounding in high dimensions. Bayesian additive regression trees, causal forests, causal boosting, and causal multivariate adaptive regression splines are off-the-shelf methods that have shown good performance for estimation of heterogeneous treatment effects in observational studies of continuous outcomes. However, it is not clear how these methods would perform in health care database studies where outcomes are often binary and rare and data structures are complex. In this study, we evaluate these methods in simulation studies that recapitulate key characteristics of comparative effectiveness studies. We focus on the conditional average effect of a binary treatment on a binary outcome using the conditional risk difference as an estimand. To emulate health care database studies, we propose a simulation design where real covariate and treatment assignment data are used and only outcomes are simulated based on nonparametric models of the real outcomes. We apply this design to 4 published observational studies that used records from 2 major health care databases in the United States. Our results suggest that Bayesian additive regression trees and causal boosting consistently provide low bias in conditional risk difference estimates in the context of health care database studies.
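The estimand and the outcome-only simulation idea described in the abstract can be illustrated with a small sketch. For covariates x, the conditional risk difference is CRD(x) = P(Y=1 | A=1, X=x) - P(Y=1 | A=0, X=x). Everything below is an assumption made for illustration: the covariates and treatment are synthetic stand-ins for real database records, the `risk` function is a hypothetical parametric outcome model (the abstract describes nonparametric models of the real outcomes), and the stratified estimator is a simple baseline, not one of the four methods compared in the paper.

```python
import numpy as np


def expit(z):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + np.exp(-z))


def risk(x, a):
    """Hypothetical outcome model: P(Y=1 | A=a, X=x) for a fairly rare binary outcome.

    The treatment effect depends on x[:, 0], so the effect is heterogeneous.
    """
    return expit(-3.0 + 0.5 * x[:, 1] + a * (0.4 + 0.8 * x[:, 0]))


def simulate_outcomes(x, a, rng):
    """Hold covariates and treatment fixed; simulate only the outcomes."""
    p1 = risk(x, np.ones(len(x)))   # risk if everyone were treated
    p0 = risk(x, np.zeros(len(x)))  # risk if no one were treated
    true_crd = p1 - p0              # known conditional risk difference
    y = rng.binomial(1, np.where(a == 1, p1, p0))
    return y, true_crd


def stratified_crd(x, a, y):
    """Baseline estimator: difference in observed risk within each covariate stratum."""
    est = np.full(len(x), np.nan)
    for stratum in np.unique(x, axis=0):
        in_stratum = np.all(x == stratum, axis=1)
        treated = in_stratum & (a == 1)
        control = in_stratum & (a == 0)
        if treated.any() and control.any():
            est[in_stratum] = y[treated].mean() - y[control].mean()
    return est


rng = np.random.default_rng(0)
n = 20000
# Synthetic stand-ins for real covariates and real treatment assignment.
x = rng.binomial(1, 0.5, size=(n, 2)).astype(float)
a = rng.binomial(1, expit(-0.5 + 1.0 * x[:, 1]))  # treatment confounded by x[:, 1]

y, true_crd = simulate_outcomes(x, a, rng)
est_crd = stratified_crd(x, a, y)

# Because the outcomes were simulated, the true CRD is known and bias can be measured.
bias = np.nanmean(est_crd - true_crd)
print(f"outcome prevalence: {y.mean():.3f}, mean bias in CRD: {bias:.4f}")
```

Because the outcome model is chosen by the analyst, the true conditional risk difference is known for every record, which is what makes it possible to measure the bias of any estimator applied to the simulated data.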
