
A generalized weighted distance k-Nearest Neighbor for multi-label problems

PATTERN RECOGNITION (2021)

Journal

PATTERN RECOGNITION

Volume 114

Publisher

ELSEVIER SCI LTD

DOI: 10.1016/j.patcog.2020.107526

Keywords

Multi-label classification; Binary relevance; Nearest neighbor; Adaptive distance measure; Prototype weighting

Automated Summary

The generalized Prototype Weighting (PW) scheme introduced in this paper supports several objective functions, including the F-measure, and uses gradient descent to set its parameters, with the aim of significantly improving performance in multi-label classification.

Abstract

In multi-label classification, each instance is associated with a set of pre-specified labels. One common approach is to use the Binary Relevance (BR) paradigm to learn each label separately with a base classifier. Using k-Nearest Neighbor (kNN) as the base classifier (denoted BRkNN) is a simple, descriptive and powerful approach. Binary relevance, however, gives each base classifier a highly imbalanced view of the dataset, and kNN is known to perform poorly on imbalanced data. One way to deal with this is to define the distance function in a parametric form and use the training data to adjust the parameters (i.e., to adjust the boundaries between classes) by optimizing a performance measure suited to imbalanced data, e.g., the F-measure. The Prototype Weighting (PW) scheme presented in the literature (Paredes & Vidal, 2006) uses gradient descent to set the parameters by minimizing the classification error rate on the training data. This paper presents a generalized version of PW. First, instead of minimizing only the error rate as in the original PW, the generalized PW also supports other objective functions built from elements of the confusion matrix (including the F-measure). Second, PW, originally presented for 1NN, is extended to the general case of kNN (i.e., k ≥ 1). For problems with highly overlapped classes, this is expected to perform better, since a value of k > 1 produces smoother decision boundaries, which in turn can improve generalization. In multi-label problems with many labels, or in problems with highly overlapped classes, the proposed generalized PW is expected to significantly improve performance, as such problems involve many decision boundaries. The proposed method has been compared with state-of-the-art multi-label classification methods, including 6 lazy classifiers based on kNN. Experiments show that the proposed method significantly outperforms the other methods. © 2020 Published by Elsevier Ltd.
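
To make the idea in the abstract concrete, here is a minimal Python sketch of binary-relevance kNN with a prototype-weighted distance. It is not the authors' implementation. The distance form (Euclidean distance scaled by one weight per label and prototype), the strict-majority vote, and the names `weighted_brknn_predict` and `f_measure` are all assumptions made for illustration. The weights are passed in as fixed values; the gradient-descent step that would learn them by optimizing an objective such as the F-measure is omitted.

```python
import numpy as np


def weighted_brknn_predict(X_train, Y_train, W, X_query, k=3):
    """Binary-relevance kNN with one weight per (label, prototype).

    Assumed distance (illustrative only):
        d_l(x, p_i) = W[l, i] * ||x - p_i||_2
    X_train: (n_train, n_features), Y_train: (n_train, n_labels) in {0, 1},
    W: (n_labels, n_train) positive weights, X_query: (n_query, n_features).
    """
    # Pairwise Euclidean distances, shape (n_query, n_train).
    diff = X_query[:, None, :] - X_train[None, :, :]
    base = np.sqrt((diff ** 2).sum(axis=2))

    n_labels = Y_train.shape[1]
    Y_pred = np.zeros((X_query.shape[0], n_labels), dtype=int)
    for l in range(n_labels):
        # Scale each prototype's distance by its weight for this label.
        d = base * W[l][None, :]
        # Indices of the k nearest prototypes under the weighted distance.
        nn = np.argsort(d, axis=1)[:, :k]
        # Count positive neighbours and apply a strict majority vote.
        votes = Y_train[nn, l].sum(axis=1)
        Y_pred[:, l] = (2 * votes > k).astype(int)
    return Y_pred


def f_measure(y_true, y_pred):
    """F1 score for one binary label, from confusion-matrix counts."""
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data (made up for this sketch): 5 prototypes, 2 labels.
X_train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6]], dtype=float)
Y_train = np.array([[1, 0], [1, 0], [1, 1], [0, 1], [0, 1]])
X_query = np.array([[0.2, 0.2], [5.0, 5.5]])

# Uniform weights reduce the method to plain BRkNN.
W_uniform = np.ones((2, 5))
pred_uniform = weighted_brknn_predict(X_train, Y_train, W_uniform, X_query)

# Shrinking the label-1 weights of prototypes 3 and 4 pulls them "closer",
# which moves the decision boundary for label 1 only.
W_shifted = W_uniform.copy()
W_shifted[1, [3, 4]] = 0.01
pred_shifted = weighted_brknn_predict(X_train, Y_train, W_shifted, X_query)
```

With uniform weights the sketch behaves like ordinary BRkNN. Changing the weights of individual prototypes for a single label shifts that label's decision boundary without affecting the others, which is the degree of freedom a learned weighting scheme would exploit.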
