4.7 Article

Mining Top-k Co-Occurrence Patterns across Multiple Streams

Journal

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TKDE.2017.2728537

Keywords

Top-k co-occurrence patterns; multiple streams

Funding

  1. JSPS [JP26240013, JP16K16056]
  2. JST, Strategic International Collaborative Research Program, SICORP
  3. Grants-in-Aid for Scientific Research [26240013, 16K16056] Funding Source: KAKEN

The recent era of Big Data and IoT has produced many applications that generate objects in a streaming fashion. It is well known that real-time mining of important patterns from data streams supports many domains. In retail markets and social network services, for example, such patterns are itemsets and words that frequently appear across many user accounts, i.e., co-occurrence patterns. To monitor co-occurrence patterns efficiently, we address the novel problem of mining top-k closed co-occurrence patterns across multiple streams. We employ a sliding-window setting for this problem, and each pattern is ranked by its count, which is the number of streams that have generated the pattern. Since objects are continuously generated and deleted, the count of a given pattern is dynamic, which may change the rank of the pattern. This makes monitoring the top-k answer in real time challenging. We propose an index-based algorithm that addresses this challenge and provides the exact answer. Specifically, we propose the CP-Graph, a hybrid index of graph and inverted file structures. The CP-Graph can efficiently compute the count of a given pattern and update the answer while pruning unnecessary patterns. Our experimental study on real datasets demonstrates the efficiency and scalability of our solution.
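To make the problem setting concrete, the following is a minimal brute-force sketch in Python. It is not the CP-Graph and is not taken from the paper: the class name `NaiveTopK`, its methods, and the count-based window are all hypothetical. It also assumes that a pattern is a set of two or more items appearing together in one object, and that a pattern is closed when no proper superset has the same count (the usual definition from closed-itemset mining, which may differ in detail from the paper's). It recomputes every count on each query, which is the cost an index is meant to avoid.

```python
from collections import Counter, deque
from itertools import combinations


class NaiveTopK:
    """Hypothetical brute-force baseline; recomputes all counts per query."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.windows = {}  # stream id -> deque of objects (frozensets of items)

    def push(self, stream_id, items):
        """Append a new object to a stream; the oldest one expires if full."""
        window = self.windows.setdefault(
            stream_id, deque(maxlen=self.window_size)
        )
        window.append(frozenset(items))  # deque(maxlen=...) evicts the oldest

    def _patterns_of(self, stream_id):
        """All item subsets of size >= 2 found in the stream's current window.

        Exponential in the object size, so only suitable for tiny examples.
        """
        found = set()
        for obj in self.windows[stream_id]:
            for size in range(2, len(obj) + 1):
                found.update(frozenset(c) for c in combinations(obj, size))
        return found

    def counts(self):
        """Count of a pattern = number of streams currently containing it."""
        counts = Counter()
        for stream_id in self.windows:
            # A set is passed in, so each stream adds at most 1 per pattern.
            counts.update(self._patterns_of(stream_id))
        return counts

    def top_k(self, k):
        """The k closed patterns with the highest counts (assumed closedness)."""
        counts = self.counts()
        closed = [
            (pattern, count)
            for pattern, count in counts.items()
            # `pattern < other` is the proper-subset test on frozensets.
            if not any(
                pattern < other and count == other_count
                for other, other_count in counts.items()
            )
        ]
        # Highest count first; ties broken by larger pattern, then by items.
        closed.sort(key=lambda pc: (-pc[1], -len(pc[0]), sorted(pc[0])))
        return closed[:k]
```

With three streams where two have generated {a, b, c} and one has generated {a, b}, the pattern {a, b} has count 3 and {a, b, c} has count 2, while {a, c} and {b, c} are not closed because {a, b, c} has the same count. When an object expires from a window, the counts shrink accordingly, which is the dynamic behaviour the abstract describes.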
