4.5 Article

An adaptive XGBoost-based optimized sliding window for concept drift handling in non-stationary spatiotemporal data streams classifications

Journal

JOURNAL OF SUPERCOMPUTING

Publisher

SPRINGER
DOI: 10.1007/s11227-023-05729-8

Keywords

Concept drift; Machine learning; Sliding windows; Spatiotemporal data streams; Bayesian optimization


The use of data science for decision-making has grown in recent years, and with it a significant learning challenge known as concept drift. This article introduces a model called BOASWIN-XGBoost to handle concept drift and reports accuracies between 70.83% and 95.53% on seven streaming datasets.
The use of data science for decision-making has grown significantly in recent years. With the increasing use of spatial and temporal data streaming applications, this growth has brought a significant learning challenge known as concept drift, which can degrade the models used in these applications. This work introduces BOASWIN-XGBoost (Bayesian Optimized Adaptive Sliding Window and XGBoost), a model for handling concept drift. It is designed explicitly for classifying streaming data and comprises three main procedures: pre-processing, concept drift detection, and classification. The model uses a Bayesian-Optimized Adaptive Sliding Window (BOASWIN) method to detect concept drift in the streaming data and an optimized XGBoost (eXtreme Gradient Boosting) model for classification. The XGBoost hyperparameters are tuned with BO-TPE (Bayesian Optimization with Tree-structured Parzen Estimator) to improve the classifier's performance. The approach was evaluated on seven streaming datasets: Agrawal_a, Agrawal_g, SEA_a, SEA_g, Hyperplane, Phishing, and Weather. In simulation, the model achieves accuracies of 70.83% (Agrawal_a), 71.02% (Agrawal_g), 76.76% (SEA_a), 76.96% (SEA_g), 84.26% (Hyperplane), 95.53% (Phishing), and 78.35% (Weather), which the authors present as evidence of superior performance in handling concept drift and classifying streaming data.
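
The drift-detection step can be illustrated with a minimal sketch of a sliding-window detector. This is not the authors' BOASWIN method, whose details the abstract does not give: the class name `SlidingWindowDriftDetector`, the helper `detect_drifts`, and the parameters `window_size` and `drop_ratio` are all hypothetical, and their values are fixed by hand rather than chosen by Bayesian optimization. The sketch assumes the classifier's per-sample correctness is available as a stream of booleans and flags drift when accuracy over the most recent window falls well below the best windowed accuracy seen so far.

```python
from collections import deque


class SlidingWindowDriftDetector:
    """Illustrative detector (hypothetical, not the paper's BOASWIN).

    Tracks classifier correctness over a fixed-size window and reports
    drift when windowed accuracy drops below `drop_ratio` times the best
    windowed accuracy observed since the last reset.
    """

    def __init__(self, window_size=50, drop_ratio=0.8):
        self.window_size = window_size
        self.drop_ratio = drop_ratio
        self.window = deque(maxlen=window_size)
        self.reference_acc = None  # best windowed accuracy since last reset

    def update(self, correct):
        """Record one prediction outcome; return True if drift is flagged."""
        self.window.append(1 if correct else 0)
        if len(self.window) < self.window_size:
            return False  # not enough samples to estimate accuracy yet

        acc = sum(self.window) / self.window_size
        if self.reference_acc is None or acc > self.reference_acc:
            self.reference_acc = acc  # new best: raise the reference level
            return False

        if acc < self.drop_ratio * self.reference_acc:
            # Accuracy fell sharply: flag drift and start a fresh window.
            # In a full pipeline the classifier would be retrained here.
            self.window.clear()
            self.reference_acc = None
            return True
        return False


def detect_drifts(outcomes, window_size=50, drop_ratio=0.8):
    """Return the stream positions (0-based) at which drift is flagged."""
    detector = SlidingWindowDriftDetector(window_size, drop_ratio)
    return [i for i, ok in enumerate(outcomes) if detector.update(ok)]


if __name__ == "__main__":
    # Synthetic stream: the classifier is right for 200 samples, then the
    # concept changes abruptly and it is wrong for the next 200.
    stream = [True] * 200 + [False] * 200
    print(detect_drifts(stream))
```

On a flagged drift, a pipeline of the kind the abstract describes would retrain or update the classifier on recent data. A larger `window_size` gives a steadier accuracy estimate but reacts more slowly to abrupt changes, which is one reason a tuned rather than hand-picked window is attractive.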
