4.7 Article

Incremental frequent itemsets mining based on frequent pattern tree and multi-scale

Journal

EXPERT SYSTEMS WITH APPLICATIONS
Volume 163, Issue -, Pages -

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.eswa.2020.113805

Keywords

Frequent itemsets mining; Multi-scale; Incremental mining; Frequent pattern tree; Association rules

Funding

  1. National Natural Science Foundation of P.R. China [61602335]
  2. Natural Science Foundation of Shanxi Province, P.R. China [201901D211302]
  3. Taiyuan University of Science and Technology Scientific Research Initial Funding of Shanxi Province, P. R. China [20172017]
  4. Scientific and Technological Innovation Team of Shanxi Province, P. R. China [201805D131007]

The article introduces FPMSIM, an incremental frequent itemsets mining algorithm based on multi-scale theory, which constructs a pattern tree using the classic FP-Growth algorithm to improve mining efficiency and reduce I/O costs.
Multi-scale analysis can reveal the structure and hierarchical characteristics of data objects, reflecting their essence from different perspectives and levels. By incorporating multi-scale theory, an incremental frequent itemsets mining algorithm based on the frequent pattern tree is proposed (FP-tree and Multi-Scale based Incremental Mining, abbreviated FPMSIM). FPMSIM uses the classic FP-Growth algorithm to construct a pattern tree and generate frequent itemsets for the finer-grained dataset, which is called the benchmark scale dataset. A newly added dataset is also mined independently as a benchmark scale dataset. The final frequent itemsets for the target scales are derived through a scale-up process, in which the counts of some unknown itemsets need to be estimated by comparing the similarity among benchmark scale datasets. In this way, costly dataset rescanning and tree structure adjustment are avoided during maintenance. The experimental results show that, although support estimation error leads to incomplete frequent itemsets mining, this loss can be offset by the gains in mining efficiency and I/O cost, especially in the field of big data.
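
The scale-up idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' FPMSIM implementation: brute-force enumeration stands in for FP-Growth, and the Jaccard similarity, the estimation formula, and all names (`mine_frequent`, `jaccard`, `scale_up`) are assumptions made for illustration, since the abstract does not specify them. Each partition plays the role of a benchmark scale dataset and is mined on its own. Counts that a partition did not retain are estimated from the most similar partition instead of rescanning the data.

```python
from collections import Counter
from itertools import combinations


def mine_frequent(transactions, min_support):
    """Return {itemset: count} for itemsets with relative support >= min_support.

    Brute-force enumeration is used here as a stand-in for FP-Growth.
    """
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    threshold = min_support * len(transactions)
    return {s: c for s, c in counts.items() if c >= threshold}


def jaccard(a, b):
    """Jaccard similarity between two collections of itemsets (assumed measure)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0


def scale_up(partitions, mined, min_support):
    """Derive frequent itemsets for the coarser scale from per-partition results."""
    total = sum(len(p) for p in partitions)
    candidates = set().union(*mined)
    result = {}
    for itemset in candidates:
        est = 0.0
        for i, part in enumerate(partitions):
            if itemset in mined[i]:
                est += mined[i][itemset]  # known count, no rescan needed
                continue
            # Unknown count: borrow the relative support from the most
            # similar partition that did retain this itemset (assumption).
            donors = [j for j in range(len(partitions)) if itemset in mined[j]]
            best = max(donors, key=lambda j: jaccard(mined[i], mined[j]))
            rel = mined[best][itemset] / len(partitions[best])
            sim = jaccard(mined[i], mined[best])
            # The itemset was infrequent here, so cap the estimate at the threshold.
            est += min(sim * rel, min_support) * len(part)
        if est >= min_support * total:
            result[itemset] = est
    return result


# Two benchmark scale datasets; the second could be a newly added increment.
partitions = [
    [["a", "b"], ["a", "b"], ["a", "c"], ["b"]],
    [["a", "b"], ["a"], ["c"], ["c"]],
]
min_support = 0.5
mined = [mine_frequent(p, min_support) for p in partitions]
result = scale_up(partitions, mined, min_support)
print(result)
```

In this toy data, item `b` occurs in 4 of the 8 transactions and would be frequent under an exact count, but the estimated count falls below the threshold and `b` is dropped. This is the kind of incompleteness caused by support estimation error that the abstract mentions. When an increment arrives, only the new partition is mined before `scale_up` is called again.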
