4.7 Article

Smooth Soft-Balance Discriminative Analysis for imbalanced data

Journal

KNOWLEDGE-BASED SYSTEMS
Volume 228

Publisher

ELSEVIER
DOI: 10.1016/j.knosys.2020.106604

Keywords

Imbalance classification; Smoothing; Soft-balanced clustering; Discriminative analysis

Funding

  1. National Natural Science Foundation of China [61822601, 61773050, 61632004]
  2. Beijing Natural Science Foundation, China [Z180006]
  3. National Key Research and Development Program, China [2017YFC1703506]
  4. Fundamental Research Funds for the Central Universities, China [2018JBZ006, 2019JBZ110, 2019YJS040]
  5. Science and Technology Innovation Planning Foundation of Colleges and Universities under the Ministry of Education

Imbalance classification is a challenging research topic in machine learning where discriminative features are difficult to acquire. This study introduces a Smooth Soft-Balance Discriminative Analysis method to preprocess underrepresented data and mine the structure of majority classes, achieving better classification performance compared to state-of-the-art methods.
Imbalance classification is a challenging research topic in the machine learning community because discriminative features are difficult to acquire. A series of methods have been proposed to date, but they still suffer from two issues. The first is underrepresented data, for which the boundaries between classes are unclear. The second is the complex structure within the majority classes. To address both issues, a Smooth Soft-Balance Discriminative Analysis method (S²BDA) is proposed for imbalanced data. In this method, the underrepresented data are preprocessed with a smoothing technique to obtain a compact representation of each class, which makes the boundaries between classes more explicit. To mine the structure of the majority classes while preserving the pattern hidden in the minority class, a soft-balance clustering model is designed to determine subclasses of the majority class. Based on the balanced subclasses, S²BDA uses subclass-aware discriminant analysis to extract discriminative features for imbalanced data classification. Extensive experiments are conducted on two synthetic data sets and sixteen real-world data sets with various imbalance ratios (from 4 to 39.18), data sizes (from 132 to 20000), numbers of categories (from 2 to 9), and dimensionalities (from 4 to 178). The experimental results demonstrate that S²BDA outperforms state-of-the-art methods on widely used evaluation metrics. © 2020 Published by Elsevier B.V.
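To make the quantities in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the common definition of imbalance ratio (size of the largest class divided by size of the smallest class), which the abstract does not spell out. The helper `balanced_subclass_counts` is a hypothetical heuristic that only illustrates the idea of splitting a majority class into subclasses of roughly minority-class size; the soft-balance clustering model in the paper is a different, learned procedure.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Largest class size divided by smallest class size (assumed definition)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def balanced_subclass_counts(labels):
    """Hypothetical heuristic, not from the paper: choose the number of
    subclasses per class so that each subclass is roughly as large as the
    smallest class."""
    counts = Counter(labels)
    smallest = min(counts.values())
    return {cls: max(1, round(n / smallest)) for cls, n in counts.items()}


# Toy label vector: 80 majority samples and 20 minority samples.
labels = ["neg"] * 80 + ["pos"] * 20
print(imbalance_ratio(labels))           # 4.0
print(balanced_subclass_counts(labels))  # {'neg': 4, 'pos': 1}
```

On this toy example the ratio of 4 matches the lower end of the range reported in the abstract, and the majority class would be divided into four subclasses before any subclass-aware discriminant analysis is applied.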
