☆ 3.8 Article

Imputation of Missing Data in Electronic Health Records Based on Patients' Similarities

JOURNAL OF HEALTHCARE INFORMATICS RESEARCH (2020)

Journal

JOURNAL OF HEALTHCARE INFORMATICS RESEARCH

Volume 4, Issue 3, Pages 295-307

Publisher

SPRINGER NATURE

DOI: 10.1007/s41666-020-00073-5

Keywords

Missing data imputation; Electronic health records; Similarity-based imputation

Categories

Computer Science, Information Systems; Health Care Sciences & Services; Medical Informatics

Funding

National Science Foundation [NSF-1741306, IIS-1650531, DIBBs-1443019]

Abstract

Electronic health records (EHR) are increasingly used as a data source for mining and analyzing health conditions. However, because of irregular observation times and other uncertainties inherent in medical settings, EHR data sets contain a large number of missing values, while most traditional data mining and machine learning approaches are designed to operate on complete data. In this paper, we propose a novel imputation method for missing data that makes these approaches applicable to EHR data. The imputation is based on a set of multivariate similarities among patients. For a missing data point in a patient's lab results during their intensive care unit stay, the method ranks the other patients by their similarity to the ego patient (the patient whose value is missing) in terms of lab values. The missing value is then estimated as a weighted average of the known values of the same laboratory test from the other patients, with the similarities used as weights. A comparison of the values estimated by the proposed method with those estimated by several common and state-of-the-art methods, such as MICE and 3D-MICE, shows that the proposed method outperforms them and produces promising results.
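
The rank-then-average step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the similarity measure or how many patients contribute, so the similarity used here (1 / (1 + root-mean-square difference over co-observed labs)), the cutoff `k`, and the function names `similarity` and `impute_value` are all assumptions made for illustration.

```python
import numpy as np


def similarity(a, b):
    """Similarity of two patients' lab vectors over labs both have observed.

    Assumption: 1 / (1 + RMS difference). The abstract does not name the measure.
    """
    shared = ~np.isnan(a) & ~np.isnan(b)
    if not shared.any():
        return 0.0  # no co-observed labs, so no evidence of similarity
    dist = np.sqrt(np.mean((a[shared] - b[shared]) ** 2))
    return 1.0 / (1.0 + dist)


def impute_value(labs, patient, test, k=5):
    """Estimate labs[patient, test] from the k most similar patients.

    labs is a (patients x tests) array with np.nan marking missing values.
    Only patients with an observed value for `test` are candidates.
    """
    target = labs[patient]
    candidates = [
        (similarity(target, labs[j]), labs[j, test])
        for j in range(labs.shape[0])
        if j != patient and not np.isnan(labs[j, test])
    ]
    # Rank the other patients by similarity to the ego patient, keep the top k.
    candidates.sort(key=lambda c: c[0], reverse=True)
    top = candidates[:k]
    weights = np.array([s for s, _ in top])
    values = np.array([v for _, v in top])
    if len(top) == 0 or weights.sum() == 0:
        return np.nan  # nothing usable to average
    # Similarity-weighted average of the known values of the same test.
    return float(np.dot(weights, values) / weights.sum())
```

In practice the lab values would likely need to be standardized per test before computing distances, since tests are measured on very different scales; that preprocessing is omitted here.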
