
Logistic regression vs. predictive mean matching for imputing binary covariates

STATISTICAL METHODS IN MEDICAL RESEARCH (2023)

Journal

STATISTICAL METHODS IN MEDICAL RESEARCH

Volume -, Issue -, Pages -

Publisher

SAGE PUBLICATIONS LTD

DOI: 10.1177/09622802231198795

Keywords

Missing data; multiple imputation; Monte Carlo simulations

Categories

Health Care Sciences & Services; Mathematical & Computational Biology; Medical Informatics; Statistics & Probability

Summary

In this study, the statistical performance of predictive mean matching and logistic regression for imputing missing binary variables was compared through Monte Carlo simulations. The results showed that the two methods had virtually identical statistical performance when the analysis model was a logistic regression model.

Abstract

Multivariate imputation using chained equations (MICE) is a popular algorithm for imputing missing data that entails specifying multivariate models through conditional distributions. For imputing missing continuous variables, two common imputation methods are the use of parametric imputation using a linear model and predictive mean matching. When imputing missing binary variables, the default approach is parametric imputation using a logistic regression model. In the R implementation of MICE, the use of predictive mean matching can be substantially faster than using logistic regression as the imputation model for missing binary variables. However, there is a paucity of research into the statistical performance of predictive mean matching for imputing missing binary variables. Our objective was to compare the statistical performance of predictive mean matching with that of logistic regression for imputing missing binary variables. Monte Carlo simulations were used to compare the statistical performance of predictive mean matching with that of logistic regression for imputing missing binary outcomes when the analysis model of scientific interest was a multivariable logistic regression model. We varied the size of the analysis samples (N = 250, 500, 1,000, 5,000, and 10,000) and the prevalence of missing data (5%-50% in increments of 5%). In general, the statistical performance of predictive mean matching was virtually identical to that of logistic regression for imputing missing binary variables when the analysis model was a logistic regression model. This was true across a wide range of scenarios defined by sample size and the prevalence of missing data. In conclusion, predictive mean matching can be used to impute missing binary variables. The use of predictive mean matching to impute missing binary variables can result in a substantial reduction in computer processing time when conducting simulations of multiple imputation.
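To make the idea of predictive mean matching for a binary variable concrete, the following is a minimal, simplified sketch in Python. It is not the authors' simulation code and not the R `mice` implementation: it uses a single predictor, fits an ordinary least-squares line to the observed cases, and omits the random draw of regression parameters that a proper multiple-imputation procedure would include. The function names (`fit_simple_ols`, `pmm_impute_binary`, `simulate`), the donor count `k=5`, and the simulated data are all assumptions made for illustration. The point it shows is that each imputed value is copied from an observed donor, so imputations of a binary variable are always 0 or 1.

```python
import math
import random


def fit_simple_ols(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def pmm_impute_binary(x, y, k=5, rng=None):
    """Return a copy of y in which None entries are filled by simplified PMM.

    Assumption: parameter uncertainty is ignored (no posterior draw),
    so this is a teaching sketch rather than a full imputation method.
    """
    if rng is None:
        rng = random.Random()
    observed = [i for i, yi in enumerate(y) if yi is not None]
    intercept, slope = fit_simple_ols(
        [x[i] for i in observed], [y[i] for i in observed]
    )
    predicted = [intercept + slope * xi for xi in x]

    completed = list(y)
    for j, yj in enumerate(y):
        if yj is not None:
            continue
        # The k observed cases whose predicted mean is closest to case j.
        donors = sorted(
            observed, key=lambda i: abs(predicted[i] - predicted[j])
        )[:k]
        # Copy the observed value of one randomly chosen donor.
        completed[j] = y[rng.choice(donors)]
    return completed


def simulate(n=200, missing_rate=0.3, seed=1):
    """Hypothetical data: binary y from a logistic model, then set missing
    completely at random with probability missing_rate."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [1 if rng.random() < 1 / (1 + math.exp(-xi)) else 0 for xi in x]
    y_missing = [None if rng.random() < missing_rate else yi for yi in y]
    return x, y_missing


x, y_missing = simulate()
y_completed = pmm_impute_binary(x, y_missing, k=5, rng=random.Random(2))
```

Because the donor's observed value is copied directly, no thresholding or Bernoulli draw from a fitted probability is needed, which is the step a logistic-regression imputation model would otherwise perform.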
