Article

Introduction to computational causal inference using reproducible Stata, R, and Python code: A tutorial

Journal

STATISTICS IN MEDICINE
Volume 41, Issue 2, Pages 407-432

Publisher

WILEY
DOI: 10.1002/sim.9234

Keywords

causal inference; double-robust methods; g-formula; G-methods; inverse probability weighting; machine learning; propensity score; regression adjustment; targeted maximum likelihood estimation

Funding

  1. Cancer Research UK
  2. Instituto de Salud Carlos III


The main purpose of many medical studies is to estimate treatment effects, but randomization is not always possible, and observational studies are used in such cases. One major challenge in observational studies is confounding, which is typically controlled by adjusting for measured confounders; recent advances in causal inference address confounding by building on classical standardization methods. However, a lack of computational tutorials has caused some confusion among applied researchers using these methods.
The main purpose of many medical studies is to estimate the effects of a treatment or exposure on an outcome. However, it is not always possible to randomize study participants to a particular treatment, so observational study designs may be used instead. Observational studies pose major challenges, one of which is confounding. Confounding is commonly controlled by direct adjustment for measured confounders, although this approach is sometimes suboptimal because of modeling assumptions and misspecification. Recent advances in the field of causal inference have dealt with confounding by building on classical standardization methods. However, these advances have progressed quickly, with a relative paucity of computationally oriented applied tutorials, which has contributed to some confusion about the use of these methods among applied researchers. In this tutorial, we show the computational implementation of different causal inference estimators from a historical perspective, in which new estimators were developed to overcome the limitations of earlier ones (ie, nonparametric and parametric g-formula, inverse probability weighting, double-robust, and data-adaptive estimators). We illustrate the implementation of the different methods using an empirical example from the Connors study, based on intensive care medicine, and, most importantly, we provide reproducible and commented code in Stata, R, and Python for researchers to adapt in their own observational studies. The code can be accessed at .
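As a rough illustration of the two simplest estimators named in the abstract, the following Python sketch computes a nonparametric g-formula (standardization) estimate and an inverse probability weighting estimate of the average treatment effect. It is not the authors' code: the data are a made-up toy example (not the Connors study), there is a single binary confounder `w`, and the helper names `expand`, `naive_difference`, `g_formula_ate`, and `ipw_ate` are hypothetical. It also assumes positivity, that is, every confounder stratum contains both treated and untreated records.

```python
from collections import defaultdict

# Hypothetical toy data (NOT from the Connors study).
# Each record is (w, a, y): binary confounder, binary treatment, binary outcome.
def expand(w, a, n, events):
    """Return n records for stratum (w, a), of which `events` have y = 1."""
    return [(w, a, 1)] * events + [(w, a, 0)] * (n - events)

data = (
    expand(0, 1, 4, 1) + expand(0, 0, 6, 3)
    + expand(1, 1, 8, 4) + expand(1, 0, 2, 2)
)

def naive_difference(data):
    """Unadjusted difference in mean outcome, treated minus untreated."""
    treated = [y for _, a, y in data if a == 1]
    untreated = [y for _, a, y in data if a == 0]
    return sum(treated) / len(treated) - sum(untreated) / len(untreated)

def g_formula_ate(data):
    """Nonparametric g-formula: stratum-specific mean differences,
    averaged over the marginal distribution of the confounder."""
    n = len(data)
    n_w = defaultdict(int)    # records per confounder stratum
    n_wa = defaultdict(int)   # records per (stratum, treatment) cell
    sum_y = defaultdict(int)  # outcome events per (stratum, treatment) cell
    for w, a, y in data:
        n_w[w] += 1
        n_wa[(w, a)] += 1
        sum_y[(w, a)] += y
    ate = 0.0
    for w, count in n_w.items():
        mean_treated = sum_y[(w, 1)] / n_wa[(w, 1)]    # needs positivity
        mean_untreated = sum_y[(w, 0)] / n_wa[(w, 0)]  # needs positivity
        ate += (mean_treated - mean_untreated) * count / n
    return ate

def ipw_ate(data):
    """Inverse probability weighting with a stratum-specific
    (saturated) propensity score P(A = 1 | W = w)."""
    n = len(data)
    n_w = defaultdict(int)
    n_w_treated = defaultdict(int)
    for w, a, _ in data:
        n_w[w] += 1
        n_w_treated[w] += a
    total = 0.0
    for w, a, y in data:
        e = n_w_treated[w] / n_w[w]  # estimated propensity score
        total += a * y / e - (1 - a) * y / (1 - e)
    return total / n

print("naive:    ", naive_difference(data))
print("g-formula:", g_formula_ate(data))
print("IPW:      ", ipw_ate(data))
```

On this toy data both adjusted estimators give a risk difference of -0.375, whereas the unadjusted difference is about -0.208. The two adjusted estimates coincide here only because the outcome and propensity models are both saturated in a single binary confounder; with parametric models and several confounders they generally differ, which is one motivation for the double-robust estimators mentioned in the abstract.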
