☆ 4.5 Article

Efficient algorithms for deriving complete frequent itemsets from frequent closed itemsets

APPLIED INTELLIGENCE (2022)

期刊

APPLIED INTELLIGENCE

卷 52, 期 6, 页码 7002-7023

出版社

SPRINGER

DOI: 10.1007/s10489-020-02172-7

关键词

Frequent itemset mining; Frequent closed itemset mining; Lossless and condensed representation; Deriving algorithms

类别

Computer Science, Artificial Intelligence

资金

Ministry of Science and Technology, Taiwan [109-2221-E-197-027, 109-2634-F-009-026]

向作者/读者索取更多资源

Protocol

社区支持

Reagent

社区支持

智能总结 New
摘要

In this study, two efficient algorithms, DFI-List and DFI-Growth, were proposed for efficiently deriving FIs from FCIs. DFI-List utilizes a vertical index structure called Cid List, while DFI-Growth compresses the information into tree structures and applies a pattern-growth strategy for FI derivation.

When mining frequent itemsets (abbr. FIs) from dense datasets, it usually produces too many itemsets and results in the mining task to suffer from a very long execution time and high memory consumption. Frequent closed itemset (abbr. FCI) is a compact and lossless representation of FI. Mining FCIs can not only reduce the execution time and memory usage, but also reserve the complete information of FIs derived from FCIs. Although many studies have been proposed with various efficient methods for mining FCIs, few of them have developed algorithms for efficiently deriving FIs from FCIs. In this work, we propose two efficient algorithms named DFI-List and DFI-Growth for efficiently deriving FIs from FCIs. The both algorithms adopt depth-first search and divide-and-conquer methodology to derive all the FIs. DFI-List efficiently derives all the FIs with a vertical index structure called Cid List. DFI-Growth compresses the information of FCIs into tree structures and applies pattern-growth strategy to derive FIs from the trees. Empirical experiments show that DFI-List is the most efficient and scalable algorithm on the dense datasets. For example, when the minimum support threshold is set to 50% on the Chess dataset, DFI-List runs faster than LevelWise (Pasquier et al. Inf Syst 24(1): 25-46, 1999b) over 100 times. As for DFI-Growth, it is the most stable and memory efficient algorithm on the sparse datasets. Both DFI-Growth and DFI-List are superior to the state-of-the-art algorithm (Pasquier et al. Inf Syst 24(1): 25-46, 199b) in terms of execution time.

Efficient algorithms for deriving complete frequent itemsets from frequent closed itemsets

期刊

APPLIED INTELLIGENCE

出版社

SPRINGER

关键词

类别

资金

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

Efficient algorithms for deriving complete frequent itemsets from frequent closed itemsets

期刊

APPLIED INTELLIGENCE

出版社

SPRINGER

关键词

类别

资金

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

导出引文

分享论文