Article

Multi-label sampling based on local label imbalance

Journal

PATTERN RECOGNITION
Volume 122

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.patcog.2021.108294

Keywords

Multi-label learning; Class imbalance; Oversampling and undersampling; Local label imbalance; Ensemble methods

Funding

  1. China Scholarship Council (CSC) [201708500095]

Class imbalance is a common challenge in multi-label data, and sampling techniques can be an effective strategy to address it. The imbalance level within the local neighbourhood of minority class examples is a key driver of performance degradation. The proposed sampling approaches, MLSOL and MLUL, alleviate local label imbalance and improve performance on multi-label datasets.
Class imbalance is an inherent characteristic of multi-label data that hinders most multi-label learning methods. One efficient and flexible strategy to deal with this problem is to employ sampling techniques before training a multi-label learning model. Although existing multi-label sampling approaches alleviate the global imbalance of multi-label datasets, it is actually the imbalance level within the local neighbourhood of minority class examples that plays a key role in performance degradation. To address this issue, we propose a novel measure to assess the local label imbalance of multi-label datasets, as well as two multi-label sampling approaches, namely Multi-Label Synthetic Oversampling based on Local label imbalance (MLSOL) and Multi-Label Undersampling based on Local label imbalance (MLUL). By considering all informative labels, MLSOL creates more diverse and better labeled synthetic instances for difficult examples, while MLUL eliminates instances that are harmful to their local region. Experimental results on 13 multi-label datasets demonstrate the effectiveness of the proposed measure and sampling approaches for a variety of evaluation metrics, particularly in the case of an ensemble of classifiers trained on repeated samples of the original data. © 2021 Elsevier Ltd. All rights reserved.
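
To make the idea of imbalance "within the local neighbourhood" more concrete, the sketch below computes, for every instance and label, the fraction of its k nearest neighbours that carry the opposite value for that label. This is only an illustrative assumption about how such a quantity could be computed; the abstract does not give the paper's exact measure, and the function name `local_opposite_label_fraction`, the Euclidean distance, and the parameter `k` are all hypothetical choices made for this example.

```python
import numpy as np


def local_opposite_label_fraction(X, Y, k=3):
    """Illustrative only (not the paper's definition).

    For each instance i and label j, return the fraction of i's k nearest
    neighbours (Euclidean distance, excluding i itself) whose value for
    label j differs from i's value.
    """
    X = np.asarray(X, dtype=float)   # shape (n, d): feature vectors
    Y = np.asarray(Y, dtype=int)     # shape (n, q): binary label matrix
    n = X.shape[0]
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of instances")

    # Pairwise squared Euclidean distances, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(dist, np.inf)   # an instance is not its own neighbour

    # Indices of the k nearest neighbours of every instance, shape (n, k).
    nn = np.argsort(dist, axis=1, kind="stable")[:, :k]

    # Labels of those neighbours, shape (n, k, q), compared with the
    # instance's own labels and averaged over the k neighbours.
    neighbour_labels = Y[nn]
    return (neighbour_labels != Y[:, None, :]).mean(axis=1)


if __name__ == "__main__":
    X = [[0.0], [1.0], [2.0], [10.0]]
    Y = [[1, 0], [0, 0], [0, 1], [1, 1]]
    print(local_opposite_label_fraction(X, Y, k=2))
```

Under this assumed formulation, values near 1 flag instances whose label value is locally outnumbered; these are the kind of difficult examples an oversampling step might prioritise, and the kind of neighbours an undersampling step might consider removing.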
