Article

Preprocessed dynamic classifier ensemble selection for highly imbalanced drifted data streams

Journal

INFORMATION FUSION
Volume 66, Pages 138-154

Publisher

ELSEVIER
DOI: 10.1016/j.inffus.2020.09.004

Keywords

Dynamic ensemble selection; Imbalanced data; Data stream; Data preprocessing; Concept drift

Funding

  1. Polish National Science Centre [2017/27/B/ST6/01325]

This work connects two rarely combined research directions: non-stationary data stream classification and data analysis with skewed class distributions. It proposes a novel framework that trains base classifiers with stratified bagging and integrates data preprocessing with dynamic ensemble selection, with the aim of improving the classification of imbalanced data streams.
This work aims to connect two rarely combined research directions, i.e., non-stationary data stream classification and data analysis with skewed class distributions. We propose a novel framework that employs stratified bagging for training base classifiers and integrates data preprocessing and dynamic ensemble selection methods for imbalanced data stream classification. The proposed approach was evaluated in computer experiments on 135 artificially generated data streams with various imbalance ratios, label noise levels, and types of concept drift, as well as on two selected real streams. Four preprocessing techniques and two dynamic selection methods, applied at both the bagging-classifier level and the base-estimator level, were considered. The experimental results showed that, for highly imbalanced data streams, dynamic ensemble selection coupled with data preprocessing could outperform online and chunk-based state-of-the-art methods.
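
The following is a minimal sketch of how the three ingredients named above (stratified bagging, data preprocessing, and dynamic ensemble selection) could fit together on a single data chunk. It is not the authors' implementation. The abstract does not name the four preprocessing techniques or the two selection methods, so random oversampling and a KNORA-U-style competence weighting are stand-ins chosen for illustration. The nearest-centroid base learner, the toy data, and all function names are hypothetical as well. In a stream setting, the pool would presumably be rebuilt or updated as new chunks arrive; that part is omitted here.

```python
import random
from collections import Counter

def stratified_bootstrap(X, y, rng):
    """Bootstrap each class separately so the minority class is never lost."""
    idx = []
    for label in set(y):
        members = [i for i, t in enumerate(y) if t == label]
        idx += [rng.choice(members) for _ in members]
    return [X[i] for i in idx], [y[i] for i in idx]

def random_oversample(X, y, rng):
    """Assumed preprocessing step: duplicate random samples until classes are equal in size."""
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        members = [x for x, t in zip(X, y) if t == label]
        for _ in range(target - n):
            X_out.append(rng.choice(members))
            y_out.append(label)
    return X_out, y_out

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

class NearestCentroid:
    """Hypothetical stand-in base classifier: predict the label of the closest class mean."""
    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_one(self, x):
        return min(self.centroids, key=lambda c: sq_dist(x, self.centroids[c]))

def train_pool(X, y, n_estimators, rng):
    """Train each pool member on its own stratified, then oversampled, bootstrap."""
    pool = []
    for _ in range(n_estimators):
        Xb, yb = stratified_bootstrap(X, y, rng)
        Xb, yb = random_oversample(Xb, yb, rng)
        pool.append(NearestCentroid().fit(Xb, yb))
    return pool

def des_predict(pool, X_dsel, y_dsel, x, k=3):
    """KNORA-U-style vote: weight each classifier by its hits on the k nearest labelled samples."""
    neighbours = sorted(range(len(X_dsel)), key=lambda i: sq_dist(x, X_dsel[i]))[:k]
    votes = Counter()
    for clf in pool:
        weight = sum(clf.predict_one(X_dsel[i]) == y_dsel[i] for i in neighbours)
        if weight:
            votes[clf.predict_one(x)] += weight
    if not votes:  # no classifier is competent locally: fall back to a plain majority vote
        votes = Counter(clf.predict_one(x) for clf in pool)
    return votes.most_common(1)[0][0]

# Toy chunk with a 95:5 imbalance ratio (illustrative data, not from the paper).
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(95)]
X += [[rng.gauss(6, 1), rng.gauss(6, 1)] for _ in range(5)]
y = [0] * 95 + [1] * 5

pool = train_pool(X, y, n_estimators=5, rng=rng)
pred_minor = des_predict(pool, X, y, [6.0, 6.0])
pred_major = des_predict(pool, X, y, [0.0, 0.0])
```

In this sketch the chunk itself doubles as the labelled set used to estimate local competence. The stratified bootstrap keeps every class represented in each bag, and the oversampling step balances each bag before its base learner is fitted.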
