Article

Missing data imputation by K nearest neighbours based on grey relational structure and mutual information

Journal

APPLIED INTELLIGENCE
Volume 43, Issue 3, Pages 614-632

Publisher

SPRINGER
DOI: 10.1007/s10489-015-0666-x

Keywords

Missing data; Grey theory; Mutual information; Feature relevance; K nearest neighbours

Funding

  1. National Natural Science Foundation of China [71172219, 71302056]
  2. Humanity and Social Science Youth Foundation of Ministry of Education, China [10YJC630352]
  3. Research Foundation of Education Department of Anhui Province of China [SK2012B578]

Abstract

The treatment of missing data has become increasingly important in scientific research and engineering applications. The classic imputation strategy based on K nearest neighbours (KNN) has been widely used to address this persistent problem. However, previous studies have paid little attention to feature relevance, which has a significant impact on the selection of nearest neighbours. As a result, similarity measurements may be biased. In this paper, we propose a novel method for imputing missing data, the feature weighted grey KNN (FWGKNN) imputation algorithm. This approach employs mutual information (MI) to measure feature relevance. We present an experimental evaluation on five UCI datasets under three missingness mechanisms with various missing rates. Experimental results show that feature relevance has a non-negligible influence on missing data estimation based on grey theory, and that our method outperforms the other four estimation strategies. Moreover, using our approach can significantly reduce classification bias in classification tasks.
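
The idea summarised above, weighting features by mutual information before selecting neighbours by grey relational similarity, can be illustrated with a minimal sketch. This is not the authors' FWGKNN implementation; the abstract does not give its details. The sketch assumes a histogram-based MI estimate, Deng's grey relational coefficient with distinguishing coefficient rho = 0.5, donors restricted to fully observed rows, and a grade-weighted mean of the k most similar donors. All function names and parameters (`mutual_information`, `grey_relational_grades`, `impute_weighted_grey_knn`, `bins`, `rho`) are hypothetical.

```python
import numpy as np


def mutual_information(x, y, bins=5):
    """Plug-in histogram estimate of MI (in nats) between two 1-D arrays."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))


def grey_relational_grades(ref, cands, weights, rho=0.5):
    """Weighted grey relational grade of each candidate row relative to ref."""
    diff = np.abs(cands - ref)
    d_min, d_max = diff.min(), diff.max()
    if d_max == 0:                        # every candidate equals the reference
        return np.ones(len(cands))
    coef = (d_min + rho * d_max) / (diff + rho * d_max)
    return coef @ weights                 # weighted mean over features


def impute_weighted_grey_knn(X, k=3, rho=0.5, bins=5):
    """Fill NaNs using MI-weighted grey relational nearest neighbours.

    Assumption: at least one row of X is fully observed; only such rows
    serve as donors.
    """
    X = np.asarray(X, dtype=float)
    out = X.copy()
    complete = X[~np.isnan(X).any(axis=1)]
    # Min-max scale features so absolute differences are comparable.
    lo, hi = complete.min(axis=0), complete.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    scaled = (complete - lo) / span
    for i, j in zip(*np.where(np.isnan(X))):
        obs = np.where(~np.isnan(X[i]))[0]
        if obs.size == 0:                 # nothing observed: fall back to mean
            out[i, j] = complete[:, j].mean()
            continue
        # Relevance of each observed feature to the feature being imputed.
        mi = np.array([mutual_information(complete[:, f], complete[:, j], bins)
                       for f in obs])
        mi = np.maximum(mi, 0.0)
        w = mi / mi.sum() if mi.sum() > 0 else np.full(obs.size, 1.0 / obs.size)
        ref = (X[i, obs] - lo[obs]) / span[obs]
        grades = grey_relational_grades(ref, scaled[:, obs], w, rho)
        nn = np.argsort(grades)[-k:]      # k donors with the highest grade
        out[i, j] = np.average(complete[nn, j], weights=grades[nn])
    return out
```

Calling `impute_weighted_grey_knn(X, k=3)` on a NumPy array with `np.nan` entries returns a copy with those entries filled and observed values left unchanged. Restricting donors to complete rows keeps the sketch short; a fuller treatment would also need to handle incomplete donors and categorical features.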
