Article

Dm-KDE: dynamical kernel density estimation by sequences of KDE estimators with fixed number of components over data streams

Journal

FRONTIERS OF COMPUTER SCIENCE
Volume 8, Issue 4, Pages 563-580

Publisher

HIGHER EDUCATION PRESS
DOI: 10.1007/s11704-014-3105-y

Keywords

kernel density estimation; Kullback-Leibler divergence; data streams; kernel width; time and space complexity

Funding

  1. National Natural Science Foundation of China [61170122, 61272210]
  2. Japan Society for the Promotion of Science (JSPS)
  3. Natural Science Foundation of Jiangsu Province [BK2011417, BK2011003]
  4. Jiangsu 333 Expert Engineering Grant [BRA2011142]
  5. Postgraduate Student's Creative Research Funds of Jiangsu Province [CXZZ11-0483, CXZZ12-0759]

Abstract

In many data stream mining applications, traditional density estimation methods such as kernel density estimation and reduced set density estimation cannot be applied to data streams because of their high computational burden, long processing time, and intensive memory requirements. To reduce time and space complexity, we propose Dm-KDE, a novel density estimation method over data streams. It builds on the proposed algorithm m-KDE, which designs a KDE estimator with a fixed number of kernel components for a dataset. In this method, Dm-KDE sequence entries are created by m-KDE rather than from all the kernels obtained by other density estimation methods. To further reduce storage space, Dm-KDE sequence entries can be merged by calculating their KL divergences. Finally, the probability density function over an arbitrary time period or over the entire time can be estimated from the resulting model. Compared with the state-of-the-art algorithm SOMKE, the distinctive advantage of Dm-KDE is that it achieves the same accuracy with a much smaller, fixed number of kernel components, which makes it suitable for scenarios that demand faster online kernel density estimation over data streams. We compare Dm-KDE with SOMKE and M-kernel in terms of density estimation accuracy and running time on various stationary datasets. We also apply Dm-KDE to evolving data streams. Experimental results illustrate the effectiveness of the proposed method.
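
The idea of keeping a fixed number of kernel components and using KL divergence to decide what to merge can be illustrated with a small sketch. It is not the paper's m-KDE or Dm-KDE algorithm, whose details the abstract does not give. It assumes one-dimensional Gaussian kernels and swaps in a simple greedy scheme that repeatedly merges the two components with the smallest symmetrised KL divergence, preserving their combined mean and variance. All function names (`kde_components`, `gaussian_kl`, `merge_pair`, `reduce_to_m`, `density`) are hypothetical.

```python
import math


def kde_components(samples, h):
    """Plain Gaussian KDE as a mixture: one (weight, mean, variance) per sample."""
    n = len(samples)
    return [(1.0 / n, float(x), h * h) for x in samples]


def gaussian_kl(mu_p, var_p, mu_q, var_q):
    """Closed-form KL(p || q) for 1-D Gaussians p = N(mu_p, var_p), q = N(mu_q, var_q)."""
    return 0.5 * (math.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)


def merge_pair(a, b):
    """Merge two weighted components, preserving total weight, mean and variance."""
    w = a[0] + b[0]
    mu = (a[0] * a[1] + b[0] * b[1]) / w
    var = (a[0] * (a[2] + (a[1] - mu) ** 2) + b[0] * (b[2] + (b[1] - mu) ** 2)) / w
    return (w, mu, var)


def reduce_to_m(components, m):
    """Assumed greedy reduction (not the paper's method): merge the pair with the
    smallest symmetrised KL divergence until only m components remain."""
    comps = list(components)
    while len(comps) > m:
        best = None
        for i in range(len(comps)):
            for j in range(i + 1, len(comps)):
                _, mi, vi = comps[i]
                _, mj, vj = comps[j]
                d = gaussian_kl(mi, vi, mj, vj) + gaussian_kl(mj, vj, mi, vi)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = merge_pair(comps[i], comps[j])
        comps = [c for k, c in enumerate(comps) if k not in (i, j)] + [merged]
    return comps


def density(components, x):
    """Evaluate the Gaussian mixture density at x."""
    return sum(
        w * math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)
        for w, mu, var in components
    )


# Usage: compress a 4-kernel KDE down to m = 2 components.
full = kde_components([0.0, 0.1, 5.0, 5.1], h=1.0)
small = reduce_to_m(full, m=2)
```

The pairwise search costs O(n^2) per merge, so this sketch only suits small batches. The point it illustrates is that evaluating the reduced model costs O(m) per query regardless of how many samples were seen.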
