Article

Strategies for imputing missing covariates in accelerated failure time models

Journal

STATISTICS IN MEDICINE
Volume 37, Issue 24, Pages 3417-3436

Publisher

WILEY
DOI: 10.1002/sim.7809

Keywords

conditional modeling framework; general location model; Gibbs sampling; interaction; log-normal distribution

Funding

  1. National Institutes of Health [R01HL127491]
  2. U.S. Department of Health and Human Services [HHSN268201600018C, HHSN268201600001C, HHSN268201600002C, HHSN268201600003C, HHSN268201600004C]
  3. National Heart, Lung, and Blood Institute

Missing covariates often occur in biomedical studies with survival outcomes. Multiple imputation via chained equations (MICE) is a semi-parametric and flexible approach that imputes multivariate data by a series of conditional models, one for each incomplete variable. When applying MICE, practitioners tend to specify the conditional models in a simple fashion largely dictated by the software, which could lead to suboptimal results. Practical guidelines for specifying appropriate conditional models in MICE are lacking. Motivated by a study of time to hip fractures in the Women's Health Initiative Observational Study using accelerated failure time models, we propose and experiment with some rationales leading to appropriate MICE specifications. This strategy starts with specifying a joint model for the variables involved. We first derive the conditional distribution of each variable under the joint model, then approximate these conditional distributions to the extent that they can be characterized by commonly used regression models. We propose to fit separate models, stratified by failure status, to impute incomplete variables, which is key to generating appropriate MICE specifications for survival outcomes. The proposed strategy can be conveniently implemented with all available imputation software that uses fully conditional specifications. Our simulation results show that some commonly used simple MICE specifications can produce suboptimal results, while those based on the proposed strategy appear to perform well and to be robust to model misspecification. Hence, we warn against a mechanical use of MICE and suggest careful modeling of the conditional distributions of variables to ensure proper performance.
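
The idea of fitting separate imputation models by failure status can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `impute_by_failure_status`, the simulated log-normal data, the choice of log observed time as the only predictor, and the use of a single stochastic draw (rather than a full chained-equations cycle with parameter uncertainty and multiple imputed data sets) are all assumptions made here for illustration.

```python
import numpy as np

def impute_by_failure_status(x, log_t, delta, rng):
    """Fill NaN entries of x using a separate normal linear regression of
    x on log observed time within each failure-status group (hypothetical
    helper; a single stochastic imputation, not full MICE)."""
    x_imp = x.copy()
    for status in (0, 1):                      # 0 = censored, 1 = event observed
        in_group = delta == status
        observed = in_group & ~np.isnan(x)
        missing = in_group & np.isnan(x)
        if missing.sum() == 0 or observed.sum() < 3:
            continue                           # nothing to impute or too few cases to fit
        design = np.column_stack([np.ones(observed.sum()), log_t[observed]])
        beta, *_ = np.linalg.lstsq(design, x[observed], rcond=None)
        resid = x[observed] - design @ beta
        sigma = np.sqrt(resid @ resid / (observed.sum() - 2))
        design_mis = np.column_stack([np.ones(missing.sum()), log_t[missing]])
        # Add residual noise so imputed values are draws, not fitted means.
        x_imp[missing] = design_mis @ beta + rng.normal(0.0, sigma, missing.sum())
    return x_imp

# Simulated example (assumed setup): log-normal AFT model with one covariate.
rng = np.random.default_rng(0)
n = 500
x_full = rng.normal(size=n)
log_event = 1.0 + 0.5 * x_full + rng.normal(scale=0.5, size=n)
log_cens = rng.normal(loc=1.5, scale=0.5, size=n)
delta = (log_event <= log_cens).astype(int)    # failure status
log_t = np.minimum(log_event, log_cens)        # log observed time
x = x_full.copy()
x[rng.random(n) < 0.3] = np.nan                # covariate missing completely at random
x_imp = impute_by_failure_status(x, log_t, delta, rng)
```

In practice the same stratification would be applied inside each conditional model of a fully conditional specification routine, and the imputation would be repeated to produce several completed data sets.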
