Article

TUB-HAUPM: Tighter Upper Bound for Mining High Average-Utility Patterns

Journal

IEEE ACCESS
Volume 6, Issue -, Pages 18655-18669

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2018.2820740

Keywords

High average-utility pattern; utility mining; tighter upper-bound; pruning strategy; data mining

Funding

  1. National Natural Science Foundation of China [61503092]
  2. Shenzhen Technical Project [KQJSCX20170726103424709, JCYJ20170307151733005]

High-utility itemset mining (HUIM) has been gaining popularity in the field of data mining. Frequent itemset mining used to be the main tool for revealing high-frequency patterns, but it fails to consider the concept of profit. HUIM, on the other hand, takes profit into account and is practical in commercial applications. One main challenge in HUIM is handling the exponential search space that arises when both the number of distinct items and the size of the database are large. The other challenge is that existing HUIM methods overlook the length of high-utility itemsets; hence, a large itemset gets an unreasonable estimated profit compared with its actual value. Therefore, several algorithms have been proposed to mine high average-utility itemsets. High average-utility itemset mining (HAUIM) is an extension of traditional HUIM that provides a different measure: it discovers utility patterns by considering both their utilities and their lengths. To reduce the search space in HAUIM, three upper bounds have been proposed for pruning: the average-utility upper-bound, the looser upper-bound utility, and a revised tighter upper-bound. These three upper bounds decrease the number of candidate patterns efficiently. However, they still overestimate the average utility of an itemset, so time is wasted assessing unnecessary patterns. This paper proposes two new tighter upper bounds, the maximum following utility upper-bound and the top-k transaction-maximum utility upper-bound, to further shrink the candidate pattern set. Experiments conducted on several benchmark data sets show that the proposed method outperforms previous HAUIM algorithms in terms of runtime, number of join operations, and scalability.
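
As a rough illustration of the measures named in the abstract, the sketch below computes the average utility of an itemset and the classic average-utility upper-bound (auub) on a small toy database. It assumes the commonly used definitions: the average utility of an itemset is its total utility over the transactions that contain it, divided by its length, and auub sums the largest single-item utility of each such transaction. It does not implement the two new bounds proposed in the paper, whose definitions the abstract does not give. The database `DB` and the names `average_utility`, `auub`, and `can_prune` are hypothetical and chosen only for this example.

```python
from typing import Dict, Iterable, List

# Hypothetical toy database. Each transaction maps an item to its utility
# in that transaction (quantity times unit profit, already multiplied out).
Transaction = Dict[str, int]

DB: List[Transaction] = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "c": 6},
    {"b": 4, "c": 3, "d": 8},
    {"a": 5, "b": 2, "d": 2},
]


def average_utility(itemset: Iterable[str], db: List[Transaction]) -> float:
    """Total utility of the itemset over its supporting transactions,
    divided by the number of items in the itemset."""
    items = frozenset(itemset)
    total = sum(
        sum(t[i] for i in items)      # utility of the itemset in transaction t
        for t in db
        if items.issubset(t)          # t must contain every item
    )
    return total / len(items)


def auub(itemset: Iterable[str], db: List[Transaction]) -> int:
    """Average-utility upper-bound: sum of the transaction-maximum utility
    (largest single-item utility) over the supporting transactions."""
    items = frozenset(itemset)
    return sum(max(t.values()) for t in db if items.issubset(t))


def can_prune(itemset: Iterable[str], db: List[Transaction], min_au: float) -> bool:
    """If the upper bound is below the threshold, neither the itemset nor
    any of its supersets can be a high average-utility itemset."""
    return auub(itemset, db) < min_au


if __name__ == "__main__":
    print(average_utility({"a", "c"}, DB))  # (5 + 1 + 10 + 6) / 2
    print(auub({"a", "c"}, DB))             # 5 + 10
    print(can_prune({"a", "c"}, DB, min_au=16))
```

Because every item utility in a transaction is at most that transaction's maximum, auub never falls below the true average utility, and it can only shrink as an itemset grows. That is what makes it safe to discard an itemset together with all its supersets once the bound drops below the threshold. A tighter bound, in the sense used in the abstract, keeps the same safety guarantee while sitting closer to the true value, so fewer candidates survive pruning.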
