Article

Imputing missing covariate values for the Cox model

Journal

STATISTICS IN MEDICINE
Volume 28, Issue 15, Pages 1982-1998

Publisher

JOHN WILEY & SONS LTD
DOI: 10.1002/sim.3618

Keywords

missing data; missing covariates; multiple imputation; proportional hazards model

Funding

  1. MRC [U.1052.00.006]
  2. Medical Research Council [MC_U105260558, MC_EX_G0800814] Funding Source: researchfish
  3. MRC [MC_EX_G0800814, MC_U105260558] Funding Source: UKRI

Multiple imputation is commonly used to impute missing data, and is typically more efficient than complete-case analysis in regression analysis when covariates have missing values. Imputation may be performed using a regression model for the incomplete covariates on other covariates and, importantly, on the outcome. With a survival outcome, it is common practice to use the event indicator D and the log of the observed event or censoring time T in the imputation model, but the rationale is not clear. We assume that the survival outcome follows a proportional hazards model given covariates X and Z. We show that a suitable model for imputing binary or Normal X is a logistic or linear regression on the event indicator D, the cumulative baseline hazard H_0(T), and the other covariates Z. This result is exact in the case of a single binary covariate; in other cases, it is approximately valid for small covariate effects and/or small cumulative incidence. If we do not know H_0(T), we approximate it by the Nelson-Aalen estimator of H(T) or estimate it by Cox regression. We compare the methods using simulation studies. We find that using log T biases covariate-outcome associations towards the null, while the new methods have lower bias. Overall, we recommend including the event indicator and the Nelson-Aalen estimator of H(T) in the imputation model. Copyright © 2009 John Wiley & Sons, Ltd.
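
As a rough illustration of the recommendation in the abstract, the following Python sketch computes the Nelson-Aalen estimate of the cumulative hazard H(T) at each subject's observed time and stacks it with the event indicator D and a fully observed covariate Z to form the predictors of an imputation model for X. This is not the authors' code: the function name `nelson_aalen`, the variable names, and the toy data are assumptions made for illustration, and the imputation step itself (a logistic or linear regression of X on these predictors) is left to whichever multiple-imputation software is in use.

```python
import numpy as np

def nelson_aalen(time, event):
    """Nelson-Aalen cumulative hazard, evaluated at each subject's own time.

    time  : observed event or censoring times T
    event : event indicator D (1 = event, 0 = censored)
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=float)
    # Distinct observed times in ascending order; `inverse` maps each
    # subject back to the index of its own time.
    uniq, inverse = np.unique(time, return_inverse=True)
    # Number of events at each distinct time.
    d = np.bincount(inverse, weights=event, minlength=len(uniq))
    # Number at risk at each distinct time: subjects with T >= t.
    n_at_time = np.bincount(inverse, minlength=len(uniq))
    at_risk = n_at_time[::-1].cumsum()[::-1]
    # Cumulative sum of (events / at risk) over the distinct times.
    H = np.cumsum(d / at_risk)
    return H[inverse]

# Toy data (hypothetical): five subjects, one tied time, two censored.
time = np.array([1.0, 2.0, 2.0, 3.0, 4.0])
event = np.array([1, 1, 0, 0, 1])
z = np.array([0.3, -1.2, 0.8, 0.1, -0.5])  # fully observed covariate Z

H = nelson_aalen(time, event)

# Predictors for the imputation model of the incomplete covariate X:
# event indicator D, Nelson-Aalen estimate of H(T), and Z.
predictors = np.column_stack([event, H, z])
```

The point of the design is that H(T) replaces log T as the time-related predictor, while D is kept as a separate column; both enter the imputation regression alongside Z.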
