Article

An efficient utility-list based high-utility itemset mining algorithm

Journal

APPLIED INTELLIGENCE
Volume 53, Issue 6, Pages 6992-7006

Publisher

SPRINGER
DOI: 10.1007/s10489-022-03850-4

Keywords

Data mining; Pattern mining; High-utility itemset mining; Simplified utility-list

High-utility itemset mining is an important data mining task for retrieving meaningful patterns, but existing algorithms incur substantial storage and time overheads. To address this, we propose an efficient algorithm based on a simplified utility-list structure. It reduces the number of candidates, memory usage, and execution time through techniques such as the simplified utility-list, repeated pruning, and extension utility.
High-utility itemset mining (HUIM) is an important task in data mining that can retrieve more meaningful and useful patterns for decision-making. One-phase HUIM algorithms based on the utility-list structure have been shown to be the most efficient, as they can mine high-utility itemsets (HUIs) without generating candidates. However, storing itemset information in the utility-list is costly in both time and memory. To address this problem, we propose an efficient simplified utility-list-based HUIM algorithm (HUIM-SU). In HUIM-SU, a simplified utility-list is used to obtain all HUIs effectively while reducing memory usage during the depth-first search. Based on the simplified utility-list, repeated pruning according to the transaction-weighted utilisation (TWU) reduces the number of items. In addition, a construction tree and compressed storage are introduced to further reduce the search space and memory usage. The extension utility and itemset TWU are then proposed as upper bounds, which reduce the search space considerably. Extensive experimental results on dense and sparse datasets indicate that the proposed HUIM-SU algorithm is highly efficient in terms of the number of candidates, memory usage, and execution time.
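
The abstract does not spell out how repeated TWU pruning works, so the following is only a minimal sketch under the standard HUIM definitions: the utility of a transaction is the sum of its item utilities, and the TWU of an item is the sum of the utilities of all transactions that contain it. The function names, the dictionary-based transaction format, and the "recompute until nothing changes" loop are illustrative assumptions, not the HUIM-SU implementation.

```python
from collections import defaultdict


def twu_per_item(transactions):
    """Return {item: TWU} under the standard definition.

    Each transaction is a dict mapping item -> utility in that transaction
    (a hypothetical input format chosen for this sketch).
    """
    twu = defaultdict(int)
    for tx in transactions:
        tx_utility = sum(tx.values())  # transaction utility (TU)
        for item in tx:
            twu[item] += tx_utility
    return dict(twu)


def repeated_twu_pruning(transactions, min_util):
    """Drop items whose TWU is below min_util, repeating until stable.

    Removing an item lowers the utility of the transactions it appeared in,
    which can push other items below the threshold, hence the loop.
    This is an assumed reading of "repeated pruning", not the paper's code.
    """
    current = [dict(tx) for tx in transactions]  # work on a copy
    while True:
        twu = twu_per_item(current)
        low = {item for item, value in twu.items() if value < min_util}
        if not low:
            return current, twu
        # Remove low-TWU items, then discard transactions that became empty.
        current = [{i: u for i, u in tx.items() if i not in low} for tx in current]
        current = [tx for tx in current if tx]


if __name__ == "__main__":
    # Toy database invented for illustration only.
    db = [
        {"a": 5, "b": 10},
        {"b": 6, "c": 4},
        {"c": 3, "d": 1},
    ]
    pruned, twu = repeated_twu_pruning(db, min_util=14)
    print(pruned)
    print(twu)
```

In this toy database, a single pass would remove only `d`. Recomputing afterwards lowers the TWU of `c` below the threshold as well, which is why a repeated pass can shrink the item set further than one-shot TWU pruning.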
