4.7 Article

Missing value imputation using decision trees and decision forests by splitting and merging records: Two novel techniques

Journal

KNOWLEDGE-BASED SYSTEMS
Volume 53, Pages 51-65

Publisher

ELSEVIER
DOI: 10.1016/j.knosys.2013.08.023

Keywords

Data pre-processing; Data cleansing; Missing value imputation; Decision tree algorithm; Decision forest algorithm; EM algorithm

We present two novel techniques for imputing both categorical and numerical missing values. The techniques use decision trees and decision forests to identify horizontal segments of a data set, such that the records within a segment have higher similarity and stronger attribute correlations than the data set as a whole. Missing values are then imputed using these within-segment similarities and correlations. To achieve a higher quality of imputation, some segments are merged using a novel approach. We experimentally compare our techniques with a few existing ones on nine publicly available data sets in terms of four commonly used evaluation criteria. The experimental results indicate a clear superiority of our techniques, supported by statistical analyses such as confidence intervals. © 2013 Elsevier B.V. All rights reserved.
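
The segment-then-impute idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' technique. It grows one small regression tree for a single numerical attribute, treats each leaf as a horizontal segment, and fills a missing value with the mean of its segment. The segment-merging step, the decision forest, categorical attributes, and any EM-based or correlation-based imputation are all omitted. The function names (`build_tree`, `segment_value`, `impute`), the `min_leaf` parameter, and the toy records are hypothetical, and the sketch assumes that only the target attribute has missing values.

```python
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(rows, target, features, min_leaf=2):
    """Grow a small regression tree; each leaf is one horizontal segment."""
    best = None
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [r for r in rows if r[f] <= t]
            right = [r for r in rows if r[f] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue  # reject splits that leave a segment too small
            cost = sse([r[target] for r in left]) + sse([r[target] for r in right])
            if best is None or cost < best[0]:
                best = (cost, f, t, left, right)
    # Stop when no admissible split reduces the squared error.
    if best is None or best[0] >= sse([r[target] for r in rows]):
        return {"leaf": mean(r[target] for r in rows)}
    _, f, t, left, right = best
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree(left, target, features, min_leaf),
        "right": build_tree(right, target, features, min_leaf),
    }


def segment_value(tree, row):
    """Route a record to its leaf and return that segment's mean."""
    while "leaf" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["leaf"]


def impute(records, target, features):
    """Return copies of the records with missing `target` values filled in."""
    complete = [r for r in records if r[target] is not None]
    tree = build_tree(complete, target, features)
    return [
        dict(r, **{target: segment_value(tree, r)}) if r[target] is None else dict(r)
        for r in records
    ]


# Toy data (hypothetical): None marks a missing value.
records = [
    {"age": 25, "income": 30.0},
    {"age": 27, "income": 32.0},
    {"age": 26, "income": None},
    {"age": 50, "income": 80.0},
    {"age": 52, "income": 84.0},
    {"age": 51, "income": None},
]

for row in impute(records, "income", ["age"]):
    print(row)
```

On the toy data the tree splits the complete records into a younger and an older segment, so each missing income is filled from records of similar age rather than from the overall mean.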
