Missing Value Estimation for Mixed-Attribute Data Sets

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING (2011)

Journal

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING

Volume 23, Issue 1, Pages 110-121

Publisher

IEEE COMPUTER SOC

DOI: 10.1109/TKDE.2010.99

Keywords

Classification; data mining; methodologies; machine learning

Categories

Computer Science, Artificial Intelligence; Computer Science, Information Systems; Engineering, Electrical & Electronic

Funding

Australian Research Council (ARC) [DP0985456]
Nature Science Foundation (NSF) of China [90718020, 10661003]
China 973 Program [2008CB317108]
Key Research Institute of Humanities and Social Sciences at Universities [07JJD720044]
Guangxi NSF
Guangxi Colleges' Innovation Group

Abstract

Missing data imputation is a key issue in learning from incomplete data. Various techniques have been developed with great success for handling missing values in data sets with homogeneous attributes (their independent attributes are all either continuous or discrete). This paper studies a new setting of missing data imputation: imputing missing data in data sets with heterogeneous attributes (their independent attributes are of different types), referred to as imputing mixed-attribute data sets. Although many real applications fall into this setting, no estimator has been designed for imputing mixed-attribute data sets. This paper first proposes two consistent estimators, one for discrete and one for continuous missing target values. It then advocates a mixture-kernel-based iterative estimator for imputing mixed-attribute data sets. The proposed method is evaluated in extensive experiments against several typical algorithms, and the results demonstrate that it outperforms these existing imputation methods in terms of classification accuracy and root mean square error (RMSE) at different missing ratios.
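
To make the idea of a mixture kernel concrete, the following is a minimal single-pass sketch, not the paper's estimator. It assumes a product kernel (Gaussian on continuous attributes, a match/mismatch kernel on discrete attributes), a kernel-weighted mean for continuous targets, and a kernel-weighted vote for discrete targets. The function names, the bandwidth `h`, and the smoothing parameter `lam` are all hypothetical choices made for illustration; the iterative refinement described in the abstract is not shown.

```python
import numpy as np

def mixed_kernel_weights(x_cont, x_disc, X_cont, X_disc, h=1.0, lam=0.2):
    """Weight of each complete row relative to the query row (assumed product kernel)."""
    # Gaussian kernel per continuous attribute, multiplied across attributes.
    w_cont = np.exp(-0.5 * ((X_cont - x_cont) / h) ** 2).prod(axis=1)
    # Discrete kernel: 1 when values match, lam (between 0 and 1) when they differ.
    w_disc = np.where(X_disc == x_disc, 1.0, lam).prod(axis=1)
    return w_cont * w_disc

def impute_continuous(x_cont, x_disc, X_cont, X_disc, y, h=1.0, lam=0.2):
    """Kernel-weighted mean of the observed continuous targets."""
    w = mixed_kernel_weights(x_cont, x_disc, X_cont, X_disc, h, lam)
    return float(np.dot(w, y) / w.sum())

def impute_discrete(x_cont, x_disc, X_cont, X_disc, y, h=1.0, lam=0.2):
    """Class label with the largest total kernel weight."""
    w = mixed_kernel_weights(x_cont, x_disc, X_cont, X_disc, h, lam)
    classes = np.unique(y)
    scores = [w[y == c].sum() for c in classes]
    return classes[int(np.argmax(scores))]

def rmse(true, pred):
    """Root mean square error between true and imputed values."""
    true, pred = np.asarray(true, dtype=float), np.asarray(pred, dtype=float)
    return float(np.sqrt(np.mean((true - pred) ** 2)))

# Toy complete rows: one continuous and one discrete attribute (made-up data).
X_cont = np.array([[0.0], [0.1], [5.0]])
X_disc = np.array([[0], [0], [1]])
y_cont = np.array([1.0, 1.2, 10.0])
y_disc = np.array(["a", "a", "b"])

query_cont, query_disc = np.array([0.05]), np.array([0])
est_cont = impute_continuous(query_cont, query_disc, X_cont, X_disc, y_cont)
est_disc = impute_discrete(query_cont, query_disc, X_cont, X_disc, y_disc)
```

In this toy data the query row sits next to the first two complete rows and shares their discrete value, so those rows dominate both estimates. In practice `h` and `lam` would need to be chosen from the data, for example by cross-validation.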
