Article

Sparse projection infinite selection ensemble for imbalanced classification

Journal

KNOWLEDGE-BASED SYSTEMS
Volume 262, Issue -, Pages -

Publisher

ELSEVIER
DOI: 10.1016/j.knosys.2022.110246

Keywords

Imbalanced classification; Graph-based methods; Random projections; Ensemble learning

Abstract

Imbalanced datasets pose frequent and challenging problems to many real-world applications. Classification models are often biased towards the majority class when learning from class-imbalanced data. Typical imbalanced learning (IL) approaches, e.g., SMOTE, AdaCost, and Cascade, often suffer from poor performance in complex tasks where class overlapping or a high imbalance ratio occurs. In this paper, we systematically investigate the IL problem and propose a novel framework named sparse projection infinite selection ensemble (SPISE). SPISE iteratively resamples balanced subsets and combines the classifiers trained on these subsets for imbalanced classification. The diversity of classifier ensembles and the similarity between the subsets and the whole dataset are considered in this process. Specifically, we present a graph-based approach named infinite subset selection to adaptively sample diverse and similar subsets. Additionally, a random sparse projection is combined with feature selection at the beginning of each iteration to augment the training features and enhance the diversity of the generated subsets. SPISE can be easily adapted to most existing classifiers (e.g., support vector machine and random forest) to boost their performance for IL. Quantitative experiments on 26 imbalanced benchmark datasets substantiate the effectiveness and superiority of the proposed model compared with other popular approaches. (c) 2023 Elsevier B.V. All rights reserved.
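The abstract sketches the main ingredients of SPISE: iteratively resampled balanced subsets, a graph-based infinite subset selection step, and feature augmentation via random sparse projection, all feeding a classifier ensemble. The snippet below is not the authors' implementation; it is a minimal Python sketch of that general recipe under simplifying assumptions: binary labels with the minority class encoded as 1, plain random undersampling standing in for the paper's graph-based infinite subset selection, and scikit-learn's SparseRandomProjection for the feature augmentation. The function names fit_balanced_ensemble and predict_ensemble are hypothetical.

import numpy as np
from sklearn.base import clone
from sklearn.random_projection import SparseRandomProjection
from sklearn.tree import DecisionTreeClassifier

def fit_balanced_ensemble(X, y, base_estimator=None, n_rounds=10, random_state=0):
    # Train one classifier per round on a class-balanced subset whose features
    # are augmented with a sparse random projection of the original features.
    rng = np.random.default_rng(random_state)
    base = base_estimator if base_estimator is not None else DecisionTreeClassifier()
    minority = np.flatnonzero(y == 1)   # assumes the minority class is labeled 1
    majority = np.flatnonzero(y == 0)
    models, projections = [], []
    for _ in range(n_rounds):
        # Balanced subset: all minority samples plus an equal-sized random draw
        # from the majority class (simple undersampling, not the paper's
        # infinite subset selection).
        maj_draw = rng.choice(majority, size=len(minority), replace=False)
        idx = np.concatenate([minority, maj_draw])
        # Feature augmentation with a random sparse projection, refreshed each
        # round to promote diversity among the ensemble members.
        proj = SparseRandomProjection(n_components=X.shape[1],
                                      random_state=int(rng.integers(2**31 - 1)))
        X_aug = np.hstack([X[idx], np.asarray(proj.fit_transform(X[idx]))])
        models.append(clone(base).fit(X_aug, y[idx]))
        projections.append(proj)
    return models, projections

def predict_ensemble(models, projections, X, threshold=0.5):
    # Average each member's predicted probability of the minority class.
    scores = [m.predict_proba(np.hstack([X, np.asarray(p.transform(X))]))[:, 1]
              for m, p in zip(models, projections)]
    return (np.mean(scores, axis=0) >= threshold).astype(int)

As the abstract notes for SPISE itself, any base classifier with probabilistic outputs could be plugged in here, e.g. RandomForestClassifier or SVC(probability=True) in place of the decision tree.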
