☆ 4.6 Article

TUB-HAUPM: Tighter Upper Bound for Mining High Average-Utility Patterns

IEEE ACCESS (2018)

Journal

IEEE ACCESS

Volume 6, Issue -, Pages 18655-18669

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/ACCESS.2018.2820740

Keywords

High average-utility pattern; utility mining; tighter upper-bound; pruning strategy; data mining

Categories

Computer Science, Information Systems; Engineering, Electrical & Electronic; Telecommunications

Funding

National Natural Science Foundation of China [61503092]
Shenzhen Technical Project [KQJSCX20170726103424709, JCYJ20170307151733005]

Abstract

High-utility itemset mining (HUIM) has been gaining popularity in the field of data mining. Frequent itemset mining used to be the main tool for revealing high-frequency patterns, but it fails to consider profit. HUIM, on the other hand, takes profit into account when selecting itemsets and is therefore practical in commercial applications. One main challenge in HUIM is handling the exponential search space when both the number of distinct items and the size of the database are very large. The other challenge is that existing HUIM methods overlook the length of high-utility itemsets; hence, a long itemset receives an unreasonable estimated profit compared with its actual value. Therefore, several algorithms have been proposed to mine high average-utility itemsets. High average-utility itemset mining (HAUIM) is an extension of traditional HUIM that provides a different measure: it discovers utility patterns by considering both their utilities and their lengths. To reduce the search space in HAUIM, three upper bounds have been proposed for pruning: the average-utility upper-bound, the looser upper-bound utility, and a revised tighter upper-bound. These three upper bounds decrease the number of candidate patterns efficiently. However, they still overestimate the average utility of an itemset, so time is wasted assessing unnecessary patterns. This paper proposes two new tighter upper bounds, the maximum following utility upper-bound and the top-k transaction-maximum utility upper-bound, to further shrink the candidate pattern set. Experiments conducted on several benchmark data sets show that the proposed method outperforms previous HAUIM algorithms in terms of runtime, the number of join operations, and scalability.
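To make the pruning idea in the abstract concrete, here is a minimal sketch of the average-utility measure and the classic average-utility upper-bound (auub) as they are commonly defined in the HAUIM literature. It does not implement the two new bounds proposed in the paper, whose exact definitions are not given in the abstract. The toy database `db`, the threshold value, and the function names `average_utility`, `auub`, and `can_prune` are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List

# Hypothetical representation: each transaction maps an item to its utility
# in that transaction (for example, purchased quantity times unit profit).
Transaction = Dict[str, int]


def average_utility(itemset: FrozenSet[str], db: List[Transaction]) -> float:
    """Sum the utilities of the itemset's items over every transaction that
    contains the whole itemset, then divide by the itemset's length."""
    total = 0
    for t in db:
        if all(item in t for item in itemset):
            total += sum(t[item] for item in itemset)
    return total / len(itemset)


def auub(itemset: FrozenSet[str], db: List[Transaction]) -> int:
    """Average-utility upper-bound: sum of the transaction-maximum utility
    over every transaction that contains the whole itemset."""
    return sum(
        max(t.values(), default=0)
        for t in db
        if all(item in t for item in itemset)
    )


def can_prune(itemset: FrozenSet[str], db: List[Transaction], min_util: float) -> bool:
    """If the upper bound is below the threshold, neither the itemset nor
    any of its supersets can be a high average-utility itemset."""
    return auub(itemset, db) < min_util


# Toy data, invented for this example only.
db: List[Transaction] = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "c": 6},
    {"b": 4, "c": 3, "d": 8},
]

ac = frozenset({"a", "c"})
bc = frozenset({"b", "c"})
print(average_utility(ac, db), auub(ac, db))  # 11.0 15
print(can_prune(bc, db, min_util=14))         # True
```

In this toy run the bound for {a, c} is 15 while its true average utility is 11.0. That gap is the overestimation the abstract refers to: a tighter bound would sit closer to 11.0 and so allow more candidates to be discarded early.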
