Article

Incremental density-based ensemble clustering over evolving data streams

Journal

NEUROCOMPUTING
Volume 191, Pages 34-43

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2016.01.009

Keywords

Ensemble clustering; Data streams; Density-based clustering; Smart grid

Funding

  1. National Natural Science Foundation of China [61473194]
  2. Science and Technology Planning Project of Guangdong Province of China [2013B091300019]
  3. Peacock Plan [KOCX201208161601439]

Recent advances in smart meter technology have made it possible to collect information about customer power consumption in real time. The measurements are generated continuously, and in some cases, e.g. in industrial smart metering, the data exchange rates fluctuate strongly. Storing, querying, and mining such smart meter streaming data, which contain a large number of missing and sparse values, are computationally challenging tasks. To address these challenges, we propose a new method called incremental density-based ensemble clustering (IDEStream) for incremental segmentation of various kinds of factories based on their electricity consumption data. It exploits a gamma mixture model to suppress the influence of sparse data units in the data streams that arrive sequentially within a time window, and then generates a clustering from the processed data of that window. IDEStream uses a unique incremental ensemble approach to aggregate the clusterings of subsequent time windows. Experimental results on data streams collected by smart meters from manufacturing factories in Guangdong province, China, show that the proposed algorithm outperforms several state-of-the-art data stream clustering algorithms. The resulting segmentation has numerous applications, one example being the flexible definition of customer rates. (C) 2016 Elsevier B.V. All rights reserved.
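The abstract does not spell out how IDEStream aggregates the clusterings of successive windows, so the following is only a rough sketch of one generic way to combine per-window clusterings incrementally: evidence accumulation over co-association counts. It is a swapped-in standard technique, not the authors' algorithm. The gamma mixture filtering and the density-based clustering of each window are not shown; the per-window label maps, the class name `CoAssociationEnsemble`, the factory identifiers, and the `threshold` parameter are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


class CoAssociationEnsemble:
    """Running co-association counts over per-window clusterings.

    Illustrative only: a generic evidence-accumulation ensemble,
    not the IDEStream aggregation scheme.
    """

    def __init__(self):
        self.together = defaultdict(int)  # pair -> windows where both share a cluster
        self.seen = defaultdict(int)      # pair -> windows where both appear

    def update(self, labels):
        """Fold one window's clustering (item -> cluster label) into the counts."""
        for a, b in combinations(sorted(labels), 2):
            self.seen[(a, b)] += 1
            if labels[a] == labels[b]:
                self.together[(a, b)] += 1

    def consensus(self, threshold=0.5):
        """Link items whose co-association fraction is at least `threshold`
        and return the connected components as sorted lists."""
        parent = {}

        def find(x):
            parent.setdefault(x, x)
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path halving
                x = parent[x]
            return x

        for (a, b), n in self.seen.items():
            root_a, root_b = find(a), find(b)
            if self.together.get((a, b), 0) / n >= threshold:
                parent[root_a] = root_b

        groups = defaultdict(set)
        for item in list(parent):
            groups[find(item)].add(item)
        return sorted(sorted(g) for g in groups.values())


# Hypothetical per-window clusterings of four factories; cluster labels
# need not be aligned across windows, and f4 only appears in the last one.
windows = [
    {"f1": 0, "f2": 0, "f3": 1},
    {"f1": 1, "f2": 1, "f3": 0},
    {"f1": 0, "f2": 1, "f3": 2, "f4": 2},
]

ensemble = CoAssociationEnsemble()
for window_labels in windows:
    ensemble.update(window_labels)

print(ensemble.consensus(threshold=0.6))
```

Because only pairwise counts are stored, each new window can be folded in without revisiting earlier windows, and label permutations between windows do not matter. The linking step behaves like single-linkage, so a low threshold can chain unrelated items together; a real system would likely need a more careful consensus step.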
