Journal
EXPERT SYSTEMS WITH APPLICATIONS
Volume 150, Issue -, Pages -Publisher
PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.eswa.2020.113269
Keywords
Instance selection; Ranking; Instance-based learning; k-nearest neighbor; Classification
Categories
Funding
- CAPES (Coordenacao de Aperfeicoamento de Pessoal de Nivel Superior)
- CNPq (Conselho Nacional de Desenvolvimento Cientifico e Tecnologico)
- FACEPE (Fundacao de Amparo aCiencia e Tecnologia de Pernambuco)
Ask authors/readers for more resources
In instance-based learning algorithms, the need to store a large number of examples as the training set results in several drawbacks related to large memory requirements, oversensitivity to noise, and slow execution speed. Instance selection techniques can improve the performance of these algorithms by selecting the best instances from the original data set, removing, for example, redundant information and noisy points. The relationship between an instance and the other patterns in the training set plays an important role and can impact its misclassification by learning algorithms. Such a relationship can be represented as a value that measures how difficult such instance is regarding classification purposes. Based on that, we introduce a novel instance selection algorithm called Ranking-based Instance Selection (RIS) that attributes a score per instance that depends on its relationship with all other instances in the training set. In this sense, instances with higher scores form safe regions (neighborhood of samples with relatively homogeneous class labels) in the feature space, and instances with lower scores form an indecision region (borderline samples of different classes). This information is further used in a selection process to remove instances from both safe and indecision regions that are considered irrelevant to represent their clusters in the feature space. In contrast to previous algorithms, the proposal combines a raking procedure with a selection process aiming to find a promising tradeoff between accuracy and reduction rate. Experiments are conducted on twenty-four real-world classification problems and show the effectiveness of the RIS algorithm when compared against other instance selection algorithms in the literature. (C) 2020 Elsevier Ltd. All rights reserved.
Authors
I am an author on this paper
Click your name to claim this paper and add it to your profile.
Reviews
Recommended
No Data Available