Cluster Kernels: Resource-aware kernel density estimators over streaming data

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING (2008)

Journal

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING

Volume 20, Issue 7, Pages 880-893

Publisher

IEEE COMPUTER SOC

DOI: 10.1109/TKDE.2008.21

Keywords

data streams; stream mining; statistical modeling; kernel density estimation

Abstract

A variety of real-world applications rely heavily on adequate analysis of transient data streams. Because of the rigid processing requirements of data streams, common analysis techniques from data mining are not directly applicable. A fundamental building block of many data mining and analysis approaches is density estimation. It provides a well-defined estimate of a continuous data distribution, which makes its adaptation to data streams desirable. A convenient method for density estimation utilizes kernels. The computational complexity of kernel density estimation, however, makes it infeasible to apply directly to data streams. In this paper, we tackle this problem and propose our Cluster Kernel approach, which provides continuously computed kernel density estimators over streaming data. Cluster Kernels not only meet the rigid processing requirements of data streams but also allocate only a constant amount of memory, which can be adapted dynamically to changing system resources. For this purpose, we develop an intelligent merge scheme for Cluster Kernels and utilize continuously collected local statistics to resample already processed data. We validate the efficacy of Cluster Kernels for a variety of real-world data streams in an extensive experimental study.
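To make the memory argument in the abstract concrete, the following is a minimal sketch, not the paper's algorithm. It contrasts a textbook Gaussian kernel density estimator, which must keep every sample, with a hypothetical bounded summary that merges the two closest entries whenever a fixed budget is exceeded. The names `kde`, `BoundedKernelSummary`, and `max_kernels`, the fixed bandwidth `h`, and the closest-pair merge rule are all assumptions made for illustration. The actual Cluster Kernel merge scheme and its use of local statistics for resampling are described in the paper and are not reproduced here.

```python
import bisect
import math


def gaussian_kernel(u):
    """Standard Gaussian kernel K(u)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(x, samples, h):
    """Textbook KDE: f(x) = 1/(n*h) * sum_i K((x - x_i)/h).

    Memory and per-query cost grow linearly with the number of samples,
    which is what makes the plain estimator unsuitable for unbounded streams.
    """
    n = len(samples)
    return sum(gaussian_kernel((x - s) / h) for s in samples) / (n * h)


class BoundedKernelSummary:
    """Illustrative constant-memory summary (assumed design, not the paper's).

    Keeps at most `max_kernels` weighted entries (mean, count). When the
    budget is exceeded, the two neighbouring entries with the closest means
    are merged into one entry at their weighted mean.
    """

    def __init__(self, max_kernels, h):
        self.max_kernels = max_kernels
        self.h = h            # fixed bandwidth, an assumption of this sketch
        self.entries = []     # sorted list of (mean, count)
        self.n = 0            # total number of stream elements seen

    def insert(self, x):
        bisect.insort(self.entries, (x, 1))
        self.n += 1
        if len(self.entries) > self.max_kernels:
            self._merge_closest()

    def _merge_closest(self):
        # Index of the adjacent pair with the smallest gap between means.
        i = min(
            range(len(self.entries) - 1),
            key=lambda j: self.entries[j + 1][0] - self.entries[j][0],
        )
        (m1, c1), (m2, c2) = self.entries[i], self.entries[i + 1]
        merged = ((m1 * c1 + m2 * c2) / (c1 + c2), c1 + c2)
        self.entries[i:i + 2] = [merged]

    def density(self, x):
        """Weighted KDE over the retained entries."""
        total = sum(c * gaussian_kernel((x - m) / self.h) for m, c in self.entries)
        return total / (self.n * self.h)


if __name__ == "__main__":
    stream = [0.1, 0.4, 0.35, 2.0, 2.1, -1.0]
    summary = BoundedKernelSummary(max_kernels=3, h=0.5)
    for value in stream:
        summary.insert(value)
    print(len(summary.entries), summary.density(0.3), kde(0.3, stream, 0.5))
```

While the budget is not exceeded, the summary reproduces the plain estimator exactly. Once merging starts, it trades accuracy for a fixed number of stored entries, and changing `max_kernels` is one simple way such a structure could respond to changing memory limits.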
