4.6 Article

Efficient Discovery of Periodic-Frequent Patterns in Columnar Temporal Databases

Journal

ELECTRONICS
Volume 10, Issue 12, Pages -

Publisher

MDPI
DOI: 10.3390/electronics10121478

Keywords

data mining; pattern mining; periodic-frequent patterns; columnar databases

Funding

  1. JSPS Kakenhi [21K12034]
  2. Grants-in-Aid for Scientific Research [21K12034] Funding Source: KAKEN


Discovering periodic-frequent patterns in temporal databases is challenging, and most existing algorithms rely on a horizontal (row) database layout, which makes them inefficient in both time and memory. Mining a vertical (columnar) layout matters because real-world big data is widely stored this way. The proposed PF-ECLAT algorithm is shown to be memory and runtime efficient and highly scalable, and its usefulness is demonstrated in two case studies on air pollution and traffic congestion.
Discovering periodic-frequent patterns in temporal databases is a challenging problem of great importance in many real-world applications. Although several algorithms for periodic-frequent pattern mining have been described in the literature, most of them use the traditional horizontal (or row) database layout and, as a consequence, either need to scan the database several times or do not allow asynchronous computation of periodic-frequent patterns. This layout therefore makes these algorithms inefficient in both time and memory. At the same time, one cannot ignore the importance of mining data stored in a vertical (or columnar) database layout, because real-world big data is widely stored in this form. With this motivation, this paper proposes an efficient algorithm, Periodic Frequent-Equivalence CLass Transformation (PF-ECLAT), to find periodic-frequent patterns in a columnar temporal database. Experimental results on sparse and dense real-world and synthetic databases demonstrate that PF-ECLAT is memory and runtime efficient and highly scalable. Finally, we demonstrate the usefulness of PF-ECLAT with two case studies. In the first case study, we employed our algorithm to identify the geographical areas in which people were periodically exposed to harmful levels of air pollution in Japan. In the second case study, we used our algorithm to discover the set of road segments in which congestion was regularly observed in a transportation network.
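
To make the idea of mining a columnar temporal database more concrete, the following is a minimal Python sketch of an ECLAT-style search over per-item timestamp lists. It is not the paper's PF-ECLAT implementation. It assumes the definition commonly used in the periodic-frequent pattern literature: a pattern qualifies when its support (number of timestamps at which it occurs) is at least `min_sup` and its largest gap between consecutive occurrences, counting the gaps from time 0 and to the final timestamp, is at most `max_per`. The names `to_columnar`, `max_periodicity`, and `mine` and the toy rows are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, List, Set, Tuple

Pattern = Tuple[str, ...]


def to_columnar(rows: Iterable[Tuple[int, Set[str]]]) -> Dict[str, List[int]]:
    """Turn (timestamp, items) rows into item -> sorted timestamp list."""
    columns: Dict[str, List[int]] = {}
    for ts, items in sorted(rows, key=lambda row: row[0]):
        for item in items:
            columns.setdefault(item, []).append(ts)
    return columns


def max_periodicity(ts_list: List[int], ts_final: int) -> int:
    """Largest gap between consecutive occurrences, including the gap from
    time 0 to the first occurrence and from the last one to ts_final."""
    prev, worst = 0, 0
    for ts in ts_list:
        worst = max(worst, ts - prev)
        prev = ts
    return max(worst, ts_final - prev)


def mine(rows, min_sup: int, max_per: int) -> Dict[Pattern, List[int]]:
    """Depth-first search over equivalence classes (patterns sharing a prefix).

    Assumed thresholds: support >= min_sup and max periodicity <= max_per.
    """
    rows = list(rows)
    columns = to_columnar(rows)
    ts_final = max(ts for ts, _ in rows)

    def qualifies(ts_list: List[int]) -> bool:
        return (len(ts_list) >= min_sup
                and max_periodicity(ts_list, ts_final) <= max_per)

    results: Dict[Pattern, List[int]] = {}

    def extend(eq_class: List[Tuple[Pattern, List[int]]]) -> None:
        for i, (pattern, ts_list) in enumerate(eq_class):
            results[pattern] = ts_list
            next_class = []
            for other, other_ts in eq_class[i + 1:]:
                # Intersecting timestamp lists gives the occurrences of the
                # joined pattern without rescanning the database.
                common = sorted(set(ts_list) & set(other_ts))
                if qualifies(common):
                    next_class.append((pattern + (other[-1],), common))
            if next_class:
                extend(next_class)

    seeds = [((item,), ts) for item, ts in sorted(columns.items())
             if qualifies(ts)]
    extend(seeds)
    return results


# Toy temporal database: (timestamp, set of items observed at that time).
rows = [(1, {"a", "b"}), (2, {"a", "c"}), (3, {"a", "b"}),
        (4, {"b"}), (5, {"a", "b"}), (6, {"a", "b", "c"})]
patterns = mine(rows, min_sup=3, max_per=2)
for pattern, ts_list in patterns.items():
    print(pattern, ts_list)
```

With these toy rows and thresholds, only `a`, `b`, and the pair `{a, b}` pass both tests; `c` occurs just twice and is pruned before any extension is tried. Pruning at each level is safe under the assumed definition, because adding an item can only shrink a timestamp list, which cannot raise support or reduce the largest gap.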
