4.7 Article

A generalized weighted distance k-Nearest Neighbor for multi-label problems

Journal

PATTERN RECOGNITION
Volume 114

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.patcog.2020.107526

Keywords

Multi-label classification; Binary relevance; Nearest neighbor; Adaptive distance measure; Prototype weighting

The generalized Prototype Weighting (PW) scheme introduced in this paper supports various objective functions, including F-measure, and uses gradient descent to set its parameters, with the aim of significantly improving performance in multi-label classification.
In multi-label classification, each instance is associated with a set of pre-specified labels. One common approach is the Binary Relevance (BR) paradigm, which learns each label separately with a base classifier. Using k-Nearest Neighbor (kNN) as the base classifier (denoted BRkNN) is a simple, descriptive and powerful approach. However, binary relevance gives each base classifier a highly imbalanced view of the dataset, and kNN is known to perform poorly on imbalanced data. One way to deal with this is to define the distance function in a parametric form and use the training data to adjust the parameters (i.e., adjust the boundaries between classes) by optimizing a performance measure suited to imbalanced data, e.g., F-measure. The Prototype Weighting (PW) scheme presented in the literature (Paredes & Vidal, 2006) uses gradient descent to set the parameters by minimizing the classification error rate on the training data. This paper presents a generalized version of PW. First, instead of minimizing only the error rate as in PW, the generalized PW also supports other objective functions built from elements of the confusion matrix (including F-measure). Second, PW, originally presented for 1NN, is extended to the general case of kNN (i.e., k ≥ 1). For problems with highly overlapped classes, this is expected to perform better, since a value of k > 1 produces smoother decision boundaries, which in turn can improve generalization. In multi-label problems with many labels, or in problems with highly overlapped classes, the proposed generalized PW is expected to significantly improve performance, as such problems involve many decision boundaries. The proposed method has been compared with state-of-the-art multi-label classification methods, including 6 lazy classifiers based on kNN. Experiments show that the proposed method significantly outperforms the other methods. (c) 2020 Published by Elsevier Ltd.
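
The ideas in the abstract can be illustrated with a minimal sketch of binary relevance over a prototype-weighted kNN, plus an F-measure computed from confusion-matrix counts. This is not the paper's implementation: the distance form (Euclidean distance multiplied by a per-prototype, per-label weight), the majority-vote rule, and all function and variable names are assumptions made for illustration. The gradient-descent step that would tune the weights is omitted, so the weights here are set by hand.

```python
import math


def weighted_distance(x, prototype, weight):
    # Assumed parametric form: Euclidean distance scaled by a prototype weight.
    return weight * math.dist(x, prototype)


def knn_predict_label(x, prototypes, targets, weights, k):
    # Binary decision for one label: majority vote among the k prototypes
    # that are nearest under the weighted distance.
    order = sorted(
        range(len(prototypes)),
        key=lambda i: weighted_distance(x, prototypes[i], weights[i]),
    )
    votes = sum(targets[i] for i in order[:k])
    return 1 if 2 * votes > k else 0


def br_knn_predict(x, prototypes, label_matrix, weight_matrix, k):
    # Binary relevance: one independent weighted kNN per label (column).
    n_labels = len(label_matrix[0])
    return [
        knn_predict_label(
            x,
            prototypes,
            [row[j] for row in label_matrix],
            [row[j] for row in weight_matrix],
            k,
        )
        for j in range(n_labels)
    ]


def f_measure(y_true, y_pred):
    # F1 from confusion-matrix counts: 2TP / (2TP + FP + FN).
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data (hypothetical): four prototypes, two labels.
prototypes = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
label_matrix = [[1, 0], [1, 0], [0, 1], [0, 1]]

# Uniform weights reduce to plain BRkNN.
uniform = [[1.0, 1.0] for _ in prototypes]
# Hand-picked weights that pull the second label's positive prototypes closer.
adjusted = [[1.0, 1.0], [1.0, 1.0], [1.0, 0.5], [1.0, 0.01]]

query = (0.2, 0.1)
plain_prediction = br_knn_predict(query, prototypes, label_matrix, uniform, k=3)
adjusted_prediction = br_knn_predict(query, prototypes, label_matrix, adjusted, k=3)
```

In this toy setup, shrinking the weights of the second label's positive prototypes changes that label's decision for the query, which is the kind of boundary adjustment the abstract attributes to the learned parameters. A learning procedure would choose those weights by optimizing an objective such as `f_measure` on the training data instead of fixing them by hand.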
