4.6 Article

Hybrid Sampling and Dynamic Weighting-Based Classification Method for Multi-Class Imbalanced Data Stream

Journal

APPLIED SCIENCES-BASEL
Volume 13, Issue 10

Publisher

MDPI
DOI: 10.3390/app13105924

Keywords

data stream; multi-class imbalance; concept drift; hybrid sampling; classifier weighting

To address the imbalance and concept drift problems in multi-class data streams, a Hybrid Sampling and Dynamic Weighted-based classification method for Multi-class Imbalanced data streams (HSDW-MI) is proposed. HSDW-MI handles imbalance in a hybrid sampling phase and concept drift in a dynamic weighting phase. Experimental results show that HSDW-MI achieves better classification performance and more consistent results than the compared algorithms.
The imbalance and concept drift problems in data streams become more complex in a multi-class environment, where extreme imbalance and changing class ratios may also occur. To tackle these problems, a Hybrid Sampling and Dynamic Weighted-based classification method for Multi-class Imbalanced data streams (HSDW-MI) is proposed. HSDW-MI deals with imbalance and concept drift through a hybrid sampling phase and a dynamic weighting phase, respectively. In the hybrid sampling phase, adaptive spectral clustering first groups the data so that sampling is performed within clusters, which preserves the original data distribution. A sample safety factor then determines which samples of each class are to be sampled: within each cluster, safe samples are oversampled and unsafe samples are undersampled. If the data stream is extremely imbalanced, samples with a high safety factor are drawn from a sample storage pool and added to the data stream. In the dynamic weighting phase, a dynamic weighting method based on the G-mean is proposed. The G-mean value of each base classifier is used as its weight in the ensemble, and the ensemble is updated dynamically while the data stream is processed so that it can adapt to concept drift. Experiments were conducted with LB, OAUE, ARF, BOLE, MUOB, MOOD, CALMID, and the proposed HSDW-MI on 10 multi-class synthetic data streams with different class ratios and concept drifts and on three real multi-class imbalanced streams with unknown drifts. The results show that HSDW-MI has better classification capabilities and performs more consistently than all of the other algorithms.
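
The G-mean weighting idea can be illustrated with a small sketch. The abstract does not give the exact formulas, so the code below assumes the common multi-class definition of G-mean (the geometric mean of per-class recalls on the most recent data chunk) and a plain weighted majority vote. The function names `g_mean` and `weighted_vote` and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict
from math import prod


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recalls (assumed multi-class G-mean).

    Only classes that occur in y_true are considered. A classifier that
    misses one class completely gets a G-mean of 0.
    """
    if not y_true:
        raise ValueError("y_true must not be empty")
    total = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    recalls = [hits[c] / total[c] for c in total]
    return prod(recalls) ** (1.0 / len(recalls))


def weighted_vote(predictions, weights):
    """Return the label with the largest summed weight.

    predictions: one predicted label per base classifier.
    weights: one weight per base classifier (here, its G-mean).
    """
    score = defaultdict(float)
    for label, w in zip(predictions, weights):
        score[label] += w
    return max(score, key=score.get)


# Toy chunk: class 0 is the majority, classes 1 and 2 are minorities.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]

# Predictions of three hypothetical base classifiers on that chunk.
pred_a = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # always predicts the majority class
pred_b = [0, 0, 0, 0, 1, 2, 1, 1, 2, 2]  # finds every minority sample
pred_c = [0, 0, 0, 0, 0, 0, 1, 0, 2, 0]  # finds half of each minority class

# Recompute the weights on each new chunk so the ensemble follows drift.
weights = [g_mean(y_true, p) for p in (pred_a, pred_b, pred_c)]

# For a new instance, the three classifiers vote 0, 1, and 0.
label = weighted_vote([0, 1, 0], weights)
```

In this toy example the majority-only classifier receives weight 0 because its recall on classes 1 and 2 is 0, so it cannot outvote a classifier that covers the minority classes. Plain accuracy would instead give it a weight of 0.6, which is why a G-mean weight is attractive for imbalanced streams.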
