Article

An Overview and Evaluation of Recent Machine Learning Imputation Methods Using Cardiac Imaging Data

DATA (2017)

Journal

DATA

Volume 2, Issue 1, Pages -

Publisher

MDPI

DOI: 10.3390/data2010008

Keywords

missing value imputation; machine learning; decision tree imputation; k-nearest neighbors imputation; self-organizing map imputation

Categories

Computer Science, Information Systems; Multidisciplinary Sciences

Funding

National Library of Medicine training grant [T15LM007059]
National Institute of General Medical Sciences [R01GM100387]

Abstract

Many clinical research datasets have a large percentage of missing values that directly impacts their usefulness in yielding high-accuracy classifiers when used for training in supervised machine learning. While missing value imputation methods have been shown to work well with smaller percentages of missing values, their ability to impute sparse clinical research data can be problem-specific. We previously attempted to learn quantitative guidelines for ordering cardiac magnetic resonance imaging during the evaluation for pediatric cardiomyopathy, but missing data significantly reduced our usable sample size. In this work, we sought to determine if increasing the usable sample size through imputation would allow us to learn better guidelines. We first review several machine learning methods for estimating missing data. Then, we apply four popular methods (mean imputation, decision tree, k-nearest neighbors, and self-organizing maps) to a clinical research dataset of pediatric patients undergoing evaluation for cardiomyopathy. Using Bayesian Rule Learning (BRL) to learn ruleset models, we compared the performance of imputation-augmented models versus unaugmented models. We found that all four imputation-augmented models performed similarly to unaugmented models. While imputation did not improve performance, it did provide evidence for the robustness of our learned models.
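
To make two of the four methods named in the abstract concrete, the following is a minimal sketch of mean imputation and k-nearest neighbors imputation in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names `mean_impute` and `knn_impute`, the use of `None` to mark a missing value, the distance definition, and the toy values in `data` are all hypothetical and are not taken from the study's dataset.

```python
import math


def mean_impute(rows):
    """Fill each missing value (None) with the mean of the observed values
    in the same column. Assumes every column has at least one observed value."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]


def knn_impute(rows, k=2):
    """Fill each missing value with the mean of that column over the k most
    similar rows that have the column observed.

    Assumption: similarity is the root-mean-square difference over the
    columns that both rows have observed. Other distance choices are possible.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b)
                  if x is not None and y is not None]
        if not shared:
            return math.inf  # no columns in common, so not comparable
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [list(r) for r in rows]  # copy so the input is left unchanged
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows where column j is observed.
            donors = [(distance(row, other), other[j])
                      for m, other in enumerate(rows)
                      if m != i and other[j] is not None]
            donors = [d for d in donors if d[0] != math.inf]
            donors.sort(key=lambda d: d[0])
            nearest = donors[:k]
            if nearest:  # leave the value missing if no donor is comparable
                imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return imputed


# Made-up toy values for illustration only; row 1 is missing its middle value.
data = [
    [1.0, 2.0, 3.0],
    [1.0, None, 3.0],
    [9.0, 8.0, 7.0],
    [1.5, 2.5, 3.5],
]

mean_filled = mean_impute(data)
knn_filled = knn_impute(data, k=2)
```

On this toy input, mean imputation fills the gap with the column average, which is pulled upward by the dissimilar third row, while the nearest-neighbor version averages only the two rows that resemble the incomplete one. The decision tree and self-organizing map methods mentioned in the abstract are not sketched here.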
