Article

Smooth Soft-Balance Discriminative Analysis for imbalanced data

KNOWLEDGE-BASED SYSTEMS (2021)

Journal

KNOWLEDGE-BASED SYSTEMS

Volume 228, Issue -, Pages -

Publisher

ELSEVIER

DOI: 10.1016/j.knosys.2020.106604

Keywords

Imbalance classification; Smoothing; Soft-balanced clustering; Discriminative analysis

Funding

National Natural Science Foundation of China [61822601, 61773050, 61632004]
Beijing Natural Science Foundation, China [Z180006]
National Key Research and Development Program, China [2017YFC1703506]
Fundamental Research Funds for the Central Universities, China [2018JBZ006, 2019JBZ110, 2019YJS040]
Science and technology innovation planning foundation of colleges and Universities under the Ministry of Education

Automated Summary

Imbalance classification is a challenging research topic in machine learning where discriminative features are difficult to acquire. This study introduces a Smooth Soft-Balance Discriminative Analysis method that preprocesses underrepresented data and mines the structure of majority classes, achieving better classification performance than state-of-the-art methods.

Abstract

Imbalance classification is a challenging research topic in the machine learning community, in which discriminative features are difficult to acquire. A series of methods have been proposed to date, but they still suffer from two issues. The first is caused by underrepresented data, where the boundaries between classes are not clear. The second is the complex structure of majority classes. To address these two issues, a Smooth Soft-Balance Discriminative Analysis method (S²BDA) is proposed to deal with imbalanced data. In this method, the underrepresented data is preprocessed with a smoothing technique so that a compact representation of each class can be obtained, making the boundaries between classes more explicit. To mine the structure of majority classes while preserving the pattern hidden in the minority class, a soft-balance clustering model is designed to determine the subclasses of the majority class. Based on the balanced subclasses, S²BDA uses subclass-aware discriminant analysis to extract discriminative features for imbalanced data classification. Extensive experiments are conducted on two synthetic data sets and sixteen real-world data sets with various imbalance ratios (from 4 to 39.18), data sizes (from 132 to 20000), numbers of categories (from 2 to 9) and dimensionalities (from 4 to 178). The experimental results demonstrate that S²BDA outperforms state-of-the-art methods in terms of widely used evaluation metrics. © 2020 Published by Elsevier B.V.
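To make the imbalance ratios quoted in the abstract concrete, the following is a minimal sketch that computes one common definition of the ratio: the size of the largest class divided by the size of the smallest. The abstract does not define the ratio, so this definition is an assumption. The second function, `balanced_subclass_counts`, is a purely hypothetical heuristic showing what "balanced subclasses" could mean in terms of counts. It is not the soft-balance clustering model of the paper, which the abstract does not specify.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def imbalance_ratio(labels: Sequence[Hashable]) -> float:
    """Largest class size divided by smallest class size (assumed definition)."""
    counts = Counter(labels)
    if len(counts) < 2:
        raise ValueError("need at least two classes")
    return max(counts.values()) / min(counts.values())


def balanced_subclass_counts(labels: Sequence[Hashable]) -> Dict[Hashable, int]:
    """Hypothetical heuristic, not the paper's model.

    Suggests how many subclasses each class would be split into so that
    every subclass is roughly as large as the smallest class.
    """
    counts = Counter(labels)
    smallest = min(counts.values())
    return {cls: max(1, round(n / smallest)) for cls, n in counts.items()}


if __name__ == "__main__":
    # Toy labels: 80 majority samples and 20 minority samples.
    labels = ["majority"] * 80 + ["minority"] * 20
    print(imbalance_ratio(labels))
    print(balanced_subclass_counts(labels))
```

With the toy labels above, the ratio is 4, the low end of the range reported in the abstract, and the heuristic suggests four majority subclasses against a single minority class.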
