4.6 Article

TUB-HAUPM: Tighter Upper Bound for Mining High Average-Utility Patterns

Journal

IEEE ACCESS
Volume 6, Issue -, Pages 18655-18669

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2018.2820740

Keywords

High average-utility pattern; utility mining; tighter upper-bound; pruning strategy; data mining

Funding

  1. National Natural Science Foundation of China [61503092]
  2. Shenzhen Technical Project [KQJSCX20170726103424709, JCYJ20170307151733005]

Ask authors/readers for more resources

High-utility itemset mining (HUIM) has been gaining popularity in the field of data mining. Frequent itemset mining used to be the main tool to reveal high-frequency patterns but failed to consider the concept of profit. HUIM, on the other hand, obtains the itemsets and is practical in commercial applications. A main challenge in HUIM is that HUIM should handle the exponential search space for HUIM when the number of distinct items and the size of the database are both too large. The other challenge is that existing HUIM methods overlook the length of high-utility itemsets; hence, a large itemset gets an unreasonable estimated profit as opposed to the actual value. Therefore, several algorithms were proposed to mine high average-utility itemsets. High average-utility itemset mining (HAUIM) is an extension for the traditional HUIM, which provides a different measure with HUIM. It discovers utility patterns by considering both their utilities and lengths. To reduce the searching space in HAUIM, average-utility upper-bound, looser upper-bound utility, and a revised tighter upper-bound model are proposed to prune the searching graph in HAUIM. These three upper-bounds for high average-utility itemsets decrease the number of candidate patterns efficiently. However, they still overestimate a high average-utility itemset and waste on assessing the unnecessary patterns. Two new tighter upper-bounds, maximum following utility upper-bound and top-k transaction-maximum utility upper-bound, are proposed in this paper to further contract the size of candidate pattern set. Experiments conducted on several benchmark data sets show that the proposed method outperforms the previous HAUIM algorithms in terms of runtime, the number of join operations, and scalability.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.6
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available