4.5 Article

Identifying and Estimating Persistent Items in Data Streams

期刊

IEEE-ACM TRANSACTIONS ON NETWORKING
卷 26, 期 6, 页码 2429-2442

出版社

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TNET.2018.2865125

关键词

Persistent item; identification; estimation; Bloom filter; Raptor code

资金

  1. National Natural Science Foundation of China [61502229, 61872178, 61373130, 61672276, 61472184, 61321491]
  2. Fundamental Research Funds for the Central Universities [021014380079]
  3. Key Research and Development Project of Jiangsu Province [BE2015154, BE2016120]
  4. Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing University
  5. National Science Foundation [CNS-1616273, CNS-1616317]
  6. Jiangsu Innovation and Entrepreneurship (Shuangchuang) Program

向作者/读者索取更多资源

This paper addresses the fundamental problem of finding persistent items and estimating the number of times each persistent item occurred in a given data stream during a given period of time at any given observation point. We propose a novel scheme, PIE, that can not only accurately identify each persistent item with a probability greater than any desired false negative rate (FNR), but can also accurately estimate the number of occurrences of each persistent item. The key idea of PIE is that it uses Raptor codes to encode the ID of each item that appears at the observation point during a measurement period and stores only a few bits of the encoded ID in the memory. The item that is persistent occurs in enough measurement periods that enough encoded bits for the ID can be retrieved from the observation point to decode them correctly and get the ID of the persistent item. To estimate the number of occurrences of any given persistent item, PIE uses maximum likelihood estimation-based statistical techniques on the information already recorded during the measurement periods. We implemented and evaluated PIE using three real network traffic traces and compared its performance with three prior schemes. Our results show that PIE not only achieves the desire FNR in every scenario, its average FNR can be 19.5 times smaller than the FNR of the adapted prior scheme. Our results also show that PIE achieves any desired success probability in estimating the number of occurrences of persistent items.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.5
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据