4.6 Article

Missing data matter: an empirical evaluation of the impacts of missing EHR data in comparative effectiveness research

Journal

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/jamia/ocad066

Keywords

electronic health records; empirical study; missing data; multiple imputation


This study quantified the impacts of missing data in comparative effectiveness research (CER) using electronic health records (EHRs) and compared the performance of different imputation methods. When missing data depended on the stochastic progression of disease and on medical practice patterns, the spline smoothing method produced results close to those obtained without missing data. Compared with multiple imputation, spline smoothing generally performed similarly or better, with smaller estimation bias and less power loss. When using EHRs for CER, it is therefore important to leverage the temporal information of the disease trajectory to impute missing values and to consider the missing rate and the effect size when choosing an imputation method.
Objectives: The impacts of missing data in comparative effectiveness research (CER) using electronic health records (EHRs) may vary depending on the type and pattern of missing data. In this study, we aimed to quantify these impacts and compare the performance of different imputation methods.
Materials and Methods: We conducted an empirical (simulation) study to quantify the bias and power loss in estimating treatment effects in CER using EHR data. We considered various missing-data scenarios and used propensity scores to control for confounding. We compared the performance of multiple imputation and spline smoothing in handling missing data.
Results: When missing data depended on the stochastic progression of disease and on medical practice patterns, the spline smoothing method produced results that were close to those obtained when there were no missing data. Compared with multiple imputation, spline smoothing generally performed similarly or better, with smaller estimation bias and less power loss. Multiple imputation could still reduce study bias and power loss in some restrictive scenarios, eg, when missing data did not depend on the stochastic process of disease progression.
Discussion and Conclusion: Missing data in EHRs could lead to biased estimates of treatment effects and false negative findings in CER even after missing data were imputed. When using EHRs as a data resource for CER, it is important to leverage the temporal information of the disease trajectory to impute missing values and to consider the missing rate and the effect size when choosing an imputation method.
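
To illustrate what using the temporal information of a disease trajectory to impute a missing value can look like, the following is a minimal sketch rather than the study's actual method. The abstract does not specify the spline implementation, so this sketch substitutes an interpolating natural cubic spline (no roughness penalty) as a simpler stand-in for a smoothing spline. The function name `natural_cubic_spline`, the visit times, and the lab values are all hypothetical.

```python
from bisect import bisect_right


def natural_cubic_spline(times, values):
    """Return a function evaluating the natural cubic spline through the points.

    Assumption: an interpolating spline stands in for a smoothing spline here.
    """
    n = len(times)
    if n < 3 or any(t1 <= t0 for t0, t1 in zip(times, times[1:])):
        raise ValueError("need at least 3 strictly increasing observation times")

    h = [times[i + 1] - times[i] for i in range(n - 1)]

    # Tridiagonal system for the interior second derivatives; the "natural"
    # boundary condition fixes the second derivative at both ends to zero.
    diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n - 1)]
    rhs = [
        6.0 * ((values[i + 1] - values[i]) / h[i]
               - (values[i] - values[i - 1]) / h[i - 1])
        for i in range(1, n - 1)
    ]

    # Thomas algorithm: forward elimination, then back substitution.
    for j in range(1, n - 2):
        w = h[j] / diag[j - 1]
        diag[j] -= w * h[j]
        rhs[j] -= w * rhs[j - 1]
    interior = [0.0] * (n - 2)
    interior[-1] = rhs[-1] / diag[-1]
    for j in range(n - 4, -1, -1):
        interior[j] = (rhs[j] - h[j + 1] * interior[j + 1]) / diag[j]
    m = [0.0] + interior + [0.0]

    def evaluate(t):
        if t < times[0] or t > times[-1]:
            raise ValueError("t lies outside the observed time range")
        # Locate the interval [times[i], times[i + 1]] containing t.
        i = min(max(bisect_right(times, t) - 1, 0), n - 2)
        left, right = t - times[i], times[i + 1] - t
        return (
            m[i] * right ** 3 / (6.0 * h[i])
            + m[i + 1] * left ** 3 / (6.0 * h[i])
            + (values[i] / h[i] - m[i] * h[i] / 6.0) * right
            + (values[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * left
        )

    return evaluate


# Hypothetical patient: a lab value observed at months 0, 3, 6, and 12,
# with the month-9 measurement missing from the record.
visit_months = [0.0, 3.0, 6.0, 12.0]
lab_values = [8.2, 7.6, 7.1, 6.9]
spline = natural_cubic_spline(visit_months, lab_values)
imputed_month_9 = spline(9.0)
```

The sketch fills the gap from a single patient's own trajectory. Multiple imputation, by contrast, typically draws several plausible values from a model fitted across patients and pools the resulting estimates.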

