☆ 4.5 Article

Identifying and Estimating Persistent Items in Data Streams

IEEE-ACM TRANSACTIONS ON NETWORKING (2018)

期刊

IEEE-ACM TRANSACTIONS ON NETWORKING

卷 26, 期 6, 页码 2429-2442

出版社

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TNET.2018.2865125

关键词

Persistent item; identification; estimation; Bloom filter; Raptor code

类别

Computer Science, Hardware & Architecture Computer Science, Theory & Methods Engineering, Electrical & Electronic Telecommunications

资金

National Natural Science Foundation of China [61502229, 61872178, 61373130, 61672276, 61472184, 61321491]
Fundamental Research Funds for the Central Universities [021014380079]
Key Research and Development Project of Jiangsu Province [BE2015154, BE2016120]
Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing University
National Science Foundation [CNS-1616273, CNS-1616317]
Jiangsu Innovation and Entrepreneurship (Shuangchuang) Program

向作者/读者索取更多资源

Protocol

社区支持

Reagent

社区支持

摘要

This paper addresses the fundamental problem of finding persistent items and estimating the number of times each persistent item occurred in a given data stream during a given period of time at any given observation point. We propose a novel scheme, PIE, that can not only accurately identify each persistent item with a probability greater than any desired false negative rate (FNR), but can also accurately estimate the number of occurrences of each persistent item. The key idea of PIE is that it uses Raptor codes to encode the ID of each item that appears at the observation point during a measurement period and stores only a few bits of the encoded ID in the memory. The item that is persistent occurs in enough measurement periods that enough encoded bits for the ID can be retrieved from the observation point to decode them correctly and get the ID of the persistent item. To estimate the number of occurrences of any given persistent item, PIE uses maximum likelihood estimation-based statistical techniques on the information already recorded during the measurement periods. We implemented and evaluated PIE using three real network traffic traces and compared its performance with three prior schemes. Our results show that PIE not only achieves the desire FNR in every scenario, its average FNR can be 19.5 times smaller than the FNR of the adapted prior scheme. Our results also show that PIE achieves any desired success probability in estimating the number of occurrences of persistent items.

Identifying and Estimating Persistent Items in Data Streams

期刊

IEEE-ACM TRANSACTIONS ON NETWORKING

出版社

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

关键词

类别

资金

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

Identifying and Estimating Persistent Items in Data Streams

期刊

IEEE-ACM TRANSACTIONS ON NETWORKING

出版社

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

关键词

类别

资金

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

导出引文

分享论文