Article

Addressing imbalance in multilabel classification: Measures and random resampling algorithms

NEUROCOMPUTING (2015)

Journal

NEUROCOMPUTING

Volume 163, Pages 3-16

Publisher

ELSEVIER

DOI: 10.1016/j.neucom.2014.08.091

Keywords

Multilabel classification; Imbalanced classification; Resampling algorithms; Undersampling; Oversampling

Funding

Spanish Ministry of Education under the FPU National Program [AP2010-0068]
Spanish Ministry of Science and Technology [TIN2012-33856, TIN2011-28488]
Andalusian Research Plan [P10-TIC-6858, P11-TIC-7765]

Abstract

The purpose of this paper is to analyze the imbalanced learning task in the multilabel scenario, with two goals. The first is to present specialized measures for assessing the imbalance level of multilabel datasets (MLDs). These measures make it possible to determine which MLDs are imbalanced and would therefore need appropriate treatment. The second is to propose several algorithms designed to reduce the imbalance in MLDs in a classifier-independent way, by means of resampling techniques. Two different approaches to dividing the instances into minority and majority groups are studied. One of them treats each label combination as a class identifier, whereas the other evaluates the imbalance level of each label individually. A random undersampling and a random oversampling algorithm are proposed for each approach, resulting in four different algorithms. All of them are experimentally tested and their effectiveness is statistically evaluated. From the results obtained, a set of guidelines indicating when these methods should be applied is also provided. (C) 2015 Elsevier B.V. All rights reserved.
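The two ways of splitting instances into minority and majority groups can be illustrated with a minimal sketch. The abstract gives no formulas, so everything here is an assumption made for illustration: the per-label ratio (count of the most frequent label divided by the count of each label), the rule that a label is "minority" when its ratio exceeds the mean ratio, and the function names themselves. It should not be read as the paper's exact definitions.

```python
from collections import Counter


def label_imbalance_ratios(label_sets):
    """Per-label view: ratio of the most frequent label's count to each label's count.

    Assumed measure for illustration; 1.0 marks the most frequent label,
    larger values mark rarer labels.
    """
    counts = Counter(label for labels in label_sets for label in labels)
    most_frequent = max(counts.values())
    return {label: most_frequent / n for label, n in counts.items()}


def minority_labels(label_sets):
    """Labels whose ratio exceeds the mean ratio (assumed threshold)."""
    ratios = label_imbalance_ratios(label_sets)
    mean_ratio = sum(ratios.values()) / len(ratios)
    return {label for label, r in ratios.items() if r > mean_ratio}


def labelset_counts(label_sets):
    """Label-combination view: each distinct set of labels acts as one class."""
    return Counter(frozenset(labels) for labels in label_sets)


# Hypothetical toy dataset: one set of labels per instance.
toy = [{"a"}, {"a"}, {"a", "b"}, {"a", "b"}, {"a", "c"}, {"b"}]

print(label_imbalance_ratios(toy))  # ratios per label
print(minority_labels(toy))         # labels treated as minority
print(labelset_counts(toy))         # instances per label combination
```

On the toy data, label "a" occurs five times, "b" three times and "c" once, so only "c" exceeds the mean ratio. The label-combination view instead counts two instances each for {a} and {a, b} and one each for {a, c} and {b}. A random oversampling or undersampling step would then clone or drop instances from whichever groups the chosen view marks as minority or majority.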
