4.6 Article

FlexSketch: Estimation of Probability Density for Stationary and Non-Stationary Data Streams

Journal

SENSORS
Volume 21, Issue 4, Pages -

Publisher

MDPI
DOI: 10.3390/s21041080

Keywords

probability density estimation; streaming data; sensor system

Funding

  1. Samsung Science and Technology Foundation [SSTF-BA1501-52]
  2. Samsung Research Funding & Incubation Center of Samsung Electronics [SRFC-IT1801-10]

Estimating the probability distribution of a non-stationary data stream efficiently and accurately is a crucial problem in sensor systems, and it requires agile adaptation to concept drift. The proposed FlexSketch algorithm combines an ensemble of histograms into a probability distribution, detects and responds to concept drift swiftly, and achieves high update speed and accuracy with limited memory. Experimental results show improved speed and accuracy over existing methods for both stationary and non-stationary data streams.
Efficient and accurate estimation of the probability distribution of a data stream is an important problem in many sensor systems. It is especially challenging when the data stream is non-stationary, i.e., its probability distribution changes over time. Statistical models for non-stationary data streams demand agile adaptation to concept drift while tolerating temporal fluctuations. To this end, a statistical model needs to forget old data samples and to detect concept drift swiftly. In this paper, we propose FlexSketch, an online probability density estimation algorithm for data streams. Our algorithm uses an ensemble of histograms, each of which represents a different length of data history. FlexSketch updates each histogram for every new data sample and generates a probability distribution by combining the ensemble of histograms, while periodically monitoring the discrepancy between recent data and the existing models. When it detects concept drift, a new histogram is added to the ensemble and the oldest histogram is removed. This allows us to estimate the probability density function with high update speed and high accuracy using only limited memory. Experimental results demonstrate that our algorithm shows improved speed and accuracy compared to existing methods for both stationary and non-stationary data streams.
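
The abstract describes the mechanism only at a high level, so the following is a minimal Python sketch of the general idea rather than the FlexSketch algorithm itself. Everything specific in it is an assumption made for illustration: the class names `Histogram` and `HistogramEnsemble`, equal-width bins on a fixed range, equal-weight averaging of the histograms, total variation distance as the discrepancy measure, and the `window`, `threshold`, and `max_models` parameters. The paper's actual histogram layout, combination rule, and drift test may differ.

```python
import random
from collections import deque


class Histogram:
    """Equal-width histogram on a fixed range [lo, hi) (an assumed layout)."""

    def __init__(self, lo, hi, bins):
        self.lo, self.hi, self.bins = lo, hi, bins
        self.counts = [0] * bins
        self.total = 0

    def index(self, x):
        # Out-of-range samples are clamped into the first or last bin.
        i = int((x - self.lo) / (self.hi - self.lo) * self.bins)
        return min(max(i, 0), self.bins - 1)

    def add(self, x):
        self.counts[self.index(x)] += 1
        self.total += 1

    def mass(self):
        # Probability mass per bin; uniform if no sample has been seen yet.
        if self.total == 0:
            return [1.0 / self.bins] * self.bins
        return [c / self.total for c in self.counts]


class HistogramEnsemble:
    """Hypothetical ensemble: each histogram covers a different history length."""

    def __init__(self, lo=0.0, hi=1.0, bins=20,
                 max_models=3, window=200, threshold=0.3):
        self.lo, self.hi, self.bins = lo, hi, bins
        self.max_models = max_models    # assumed memory budget
        self.window = window            # assumed drift-check period
        self.threshold = threshold      # assumed drift threshold
        self.models = deque([Histogram(lo, hi, bins)])
        self.recent = deque(maxlen=window)
        self.seen = 0

    def combined_mass(self):
        # Equal-weight average of the per-histogram bin masses (an assumption).
        masses = [m.mass() for m in self.models]
        return [sum(col) / len(masses) for col in zip(*masses)]

    def update(self, x):
        for m in self.models:           # every histogram sees the new sample
            m.add(x)
        self.recent.append(x)
        self.seen += 1
        if self.seen % self.window == 0:
            self._check_drift()

    def _check_drift(self):
        recent = Histogram(self.lo, self.hi, self.bins)
        for x in self.recent:
            recent.add(x)
        # Total variation distance between recent data and the current model.
        tv = 0.5 * sum(abs(p - q)
                       for p, q in zip(recent.mass(), self.combined_mass()))
        if tv > self.threshold:
            self.models.append(recent)          # new histogram, short history
            if len(self.models) > self.max_models:
                self.models.popleft()           # forget the oldest histogram

    def pdf(self, x):
        width = (self.hi - self.lo) / self.bins
        return self.combined_mass()[self.models[0].index(x)] / width


rng = random.Random(0)
est = HistogramEnsemble()
for _ in range(1000):
    est.update(rng.gauss(0.22, 0.05))   # first regime
for _ in range(2000):
    est.update(rng.gauss(0.78, 0.05))   # regime after an abrupt drift
print(len(est.models), est.pdf(0.22), est.pdf(0.78))
```

In this sketch, memory stays bounded because at most `max_models` histograms of `bins` counters each are kept, plus the window of recent samples used for the drift check. Each update touches one counter per histogram, which is the kind of constant per-sample cost the abstract alludes to.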
