4.7 Article

Evolving rule-based classifiers with genetic programming on GPUs for drifting data streams

Journal

PATTERN RECOGNITION
Volume 87, Issue -, Pages 248-268

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.patcog.2018.10.024

Keywords

Machine learning; Data streams; Concept drift; Genetic programming; Rule-based classification; GPU; High-performance data mining

Funding

  1. 2018 VCU Presidential Research Quest Fund
  2. Amazon AWS Machine Learning Research award

Ask authors/readers for more resources

Designing efficient algorithms for mining massive high-speed data streams has become one of the contemporary challenges for the machine learning community. Such models must display highest possible accuracy and ability to swiftly adapt to any kind of changes, while at the same time being characterized by low time and memory complexities. However, little attention has been paid to designing learning systems that will allow us to gain a better understanding of incoming data. There are few proposals on how to design interpretable classifiers for drifting data streams, yet most of them are characterized by a significant trade-off between accuracy and interpretability. In this paper, we show that it is possible to have all of these desirable properties in one model. We introduce ERulesD(2)S: evolving rule-based classifier for drifting data Streams. By using grammar-guided genetic programming, we are able to obtain accurate sets of rules per class that are able to adapt to changes in the stream without a need for an explicit drift detector. Additionally, we augment our learning model with new proposals for rule propagation and data stream sampling, in order to maintain a balance between learning and forgetting of concepts. To improve efficiency of mining massive and non-stationary data, we implement ERulesD(2)S parallelized on GPUs. A thorough experimental study on 30 datasets proves that ERulesD(2)S is able to efficiently adapt to any type of concept drift and outperform state-of-the-art rule-based classifiers, while using small number of rules. At the same time ERulesD(S)(2) is highly competitive to other single and ensemble learners in terms of accuracy and computational complexity, while offering fully interpretable classification rules. Additionally, we show that ERulesD(2)S can scale-up efficiently to high-dimensional data streams, while offering very fast update and classification times. Finally, we present the learning capabilities of ERulesD(2)S for sparsely labeled data streams. (C) 2018 Elsevier Ltd. All rights reserved.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.7
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available