Article

An efficient instance selection algorithm for k nearest neighbor regression

Journal

NEUROCOMPUTING
Volume 251, Pages 26-34

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2017.04.018

Keywords

Instance selection; Nearest neighbor; Regression; Data reduction; Significant difference

Funding

  1. National Natural Science Foundation of China [61432011, 61603230, U1435212]
  2. National Key Basic Research and Development Program of China (973) [2013CB329404]

Abstract

The k-nearest neighbor algorithm (kNN) is a very simple algorithm for classification or regression. It is also a lazy algorithm: it does not use the training data points to build any generalization, in other words, it keeps all the training data during the testing phase. Thus, training set size becomes a major concern for kNN, since a large training set may result in slow execution and large memory requirements. Many efforts have been devoted to solving this problem, but they have mainly focused on kNN classification. We propose an algorithm to decrease the size of the training set for kNN regression (DISKR). The algorithm first removes outlier instances that impair the performance of the regressor, and then sorts the remaining instances by the difference between their outputs and those of their nearest neighbors. Finally, the remaining instances with little contribution, as measured by the training error, are successively deleted following this rule. The proposed algorithm is compared with five state-of-the-art algorithms on 19 datasets, and the experimental results show that it achieves similar prediction ability while having the lowest instance storage ratio. (C) 2017 Elsevier B.V. All rights reserved.
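The three-step procedure described in the abstract can be sketched in plain Python. This is only an illustrative reconstruction from the abstract, not the authors' DISKR implementation: the function names, the k value, the outlier threshold, and the greedy deletion rule are all assumptions.

```python
# Illustrative sketch of a DISKR-style instance selection for kNN regression,
# following the abstract: (1) discard outliers whose output deviates sharply
# from their neighbors' outputs, (2) sort remaining instances by that
# deviation, (3) greedily delete instances whose removal does not increase
# the training error. Names and thresholds are hypothetical, not the paper's.

def knn_predict(train, x, k=3):
    """Predict y at x as the mean output of the k nearest training instances."""
    neighbors = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(p[1] for p in neighbors) / len(neighbors)

def training_error(train, data, k=3):
    """Mean absolute error of the kNN regressor over `data`."""
    return sum(abs(knn_predict(train, x, k) - y) for x, y in data) / len(data)

def diskr_sketch(data, k=3, outlier_thresh=5.0):
    # Step 1: deviation of each instance's output from its neighbors' mean;
    # instances exceeding the (illustrative) threshold are treated as outliers.
    def deviation(p):
        others = [q for q in data if q is not p]
        return abs(p[1] - knn_predict(others, p[0], k))

    devs = {id(p): deviation(p) for p in data}
    kept = [p for p in data if devs[id(p)] <= outlier_thresh]

    # Step 2: sort so that low-deviation (least informative) instances
    # are considered for deletion first.
    kept.sort(key=lambda p: devs[id(p)])

    # Step 3: successively delete an instance if its removal does not
    # increase the training error of the reduced regressor.
    selected = list(kept)
    for p in kept:
        if len(selected) <= k + 1:
            break
        trial = [q for q in selected if q is not p]
        if training_error(trial, data, k) <= training_error(selected, data, k):
            selected = trial
    return selected

# Toy 1-D dataset: y = 2x plus one injected outlier at x = 5.
data = [(x, 2.0 * x) for x in range(10)] + [(5.0, 40.0)]
subset = diskr_sketch(data)
```

On this toy set the injected outlier is discarded in step 1, and step 3 then prunes instances whose removal leaves the training error unchanged, yielding a reduced training set with comparable fit, which is the trade-off the paper reports at scale.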
