Causal Inference with Multilevel Data: A Comparison of Different Propensity Score Weighting Approaches

MULTIVARIATE BEHAVIORAL RESEARCH (2022)

Journal

MULTIVARIATE BEHAVIORAL RESEARCH

Volume 57, Issue 6, Pages 916-939

Publisher

ROUTLEDGE JOURNALS, TAYLOR & FRANCIS LTD

DOI: 10.1080/00273171.2021.1925521

Keywords

Causal inference; propensity scores; multilevel data; weighting; calibration weights

Automated Summary

Propensity score methods are widely recommended to adjust for confounding and recover treatment effects. This article reviews propensity score weighting estimators for multilevel data and shows that estimates based on calibration weights should be preferred under many scenarios. Large cluster sizes are needed for accurate estimates of treatment effect when covariate effects vary strongly across clusters.

Abstract

Propensity score methods are a widely recommended approach to adjust for confounding and to recover treatment effects with non-experimental, single-level data. This article reviews propensity score weighting estimators for multilevel data in which individuals (level 1) are nested in clusters (level 2) and nonrandomly assigned to either a treatment or control condition at level 1. We address the choice of a weighting strategy (inverse probability weights, trimming, overlap weights, calibration weights) and discuss key issues related to the specification of the propensity score model (fixed-effects model, multilevel random-effects model) in the context of multilevel data. In three simulation studies, we show that estimates based on calibration weights, which prioritize balancing the sample distribution of level-1 and (unmeasured) level-2 covariates, should be preferred under many scenarios (i.e., treatment effect heterogeneity, presence of strong level-2 confounding) and can accommodate covariate-by-cluster interactions. However, when level-1 covariate effects vary strongly across clusters (i.e., under random slopes), and this variation is present in both the treatment and outcome data-generating mechanisms, large cluster sizes are needed to obtain accurate estimates of the treatment effect. We also discuss the implementation of survey weights and present a real-data example that illustrates the different methods.
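
As a rough illustration of three of the weighting strategies named in the abstract, the sketch below computes inverse probability weights, a trimmed variant, and overlap weights from propensity scores that are assumed to be already estimated. The function names, the toy data, and the trimming rule (dropping units with scores outside [0.1, 0.9]) are assumptions made for illustration, not the article's implementation. Calibration weights are left out because they require solving a constrained optimization problem.

```python
def ipw_weights(z, e):
    """Inverse probability weights: 1/e for treated units, 1/(1 - e) for controls."""
    return [1.0 / p if t == 1 else 1.0 / (1.0 - p) for t, p in zip(z, e)]


def trimmed_ipw_weights(z, e, alpha=0.1):
    """IPW where units with a score outside [alpha, 1 - alpha] get weight 0.

    Assumption: this is one common trimming rule; others truncate large weights instead.
    """
    return [w if alpha <= p <= 1.0 - alpha else 0.0
            for w, p in zip(ipw_weights(z, e), e)]


def overlap_weights(z, e):
    """Overlap weights: 1 - e for treated units, e for controls."""
    return [1.0 - p if t == 1 else p for t, p in zip(z, e)]


def weighted_mean_difference(y, z, w):
    """Difference between the weighted outcome means of treated and control units."""
    treated_sum = sum(wi * yi for yi, zi, wi in zip(y, z, w) if zi == 1)
    treated_w = sum(wi for zi, wi in zip(z, w) if zi == 1)
    control_sum = sum(wi * yi for yi, zi, wi in zip(y, z, w) if zi == 0)
    control_w = sum(wi for zi, wi in zip(z, w) if zi == 0)
    return treated_sum / treated_w - control_sum / control_w


if __name__ == "__main__":
    # Hypothetical toy data: treatment indicator, propensity score, outcome.
    z = [1, 1, 1, 0, 0, 0]
    e = [0.95, 0.6, 0.4, 0.6, 0.3, 0.05]
    y = [3.0, 2.5, 2.0, 1.5, 1.0, 0.5]

    for name, weights in [
        ("ipw", ipw_weights(z, e)),
        ("trimmed", trimmed_ipw_weights(z, e)),
        ("overlap", overlap_weights(z, e)),
    ]:
        print(name, weighted_mean_difference(y, z, weights))
```

In multilevel data the scores themselves would come from a fixed-effects or random-effects propensity model fitted beforehand. The sketch only shows how the weights differ once the scores are available.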
