4.7 Article

Detecting group concept drift from multiple data streams

Journal

PATTERN RECOGNITION
Volume 134, Issue -, Pages -

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.patcog.2022.109113

Keywords

Concept drift; Data streams; Online learning; Hypothesis test

Ask authors/readers for more resources

This paper focuses on concept drift across multiple data streams, particularly in situations where the drift of each data stream cannot be detected in time due to slight underlying distribution drifts. The authors propose a method that constructs a distribution free test statistic and designs an online learning algorithm to detect concept drift.
Concept drift may lead to a sharp downturn in the performance of streaming in data-based algorithms, caused by unforeseeable changes in the underlying distribution of data. In this paper, we are mainly concerned with concept drift across multiple data streams, and in situations where the drift of each data stream cannot be detected in time, due to slight underlying distribution drifts. We call this group concept drift. When compared to the detection of concept drift for a single data stream, the challenges of detecting group concept drift arise from three aspects: first, the training data become more complex; second, the underlying distribution becomes more complex; and third, the correlations between data streams become more complex. To address these challenges, the key idea of our method is to construct a distribution free test statistic, free from any underlying distribution in multiple data streams. Then, for streaming data, we design an online learning algorithm to obtain this test statistic, thereby determining the concept drift caused by the hypothesis test. The experiment evaluations with both synthetic and realworld datasets prove that our method can accurately detect concept drift from multiple data streams.(c) 2022 Elsevier Ltd. All rights reserved.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.7
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available