☆ 4.6 Article

TUB-HAUPM: Tighter Upper Bound for Mining High Average-Utility Patterns

IEEE ACCESS (2018)

Journal

IEEE ACCESS

Volume 6, Issue -, Pages 18655-18669

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/ACCESS.2018.2820740

Keywords

High average-utility pattern; utility mining; tighter upper-bound; pruning strategy; data mining

Funding

National Natural Science Foundation of China [61503092]
Shenzhen Technical Project [KQJSCX20170726103424709, JCYJ20170307151733005]

Ask authors/readers for more resources

Protocol

Community support

Reagent

Community support

Abstract

High-utility itemset mining (HUIM) has been gaining popularity in the field of data mining. Frequent itemset mining used to be the main tool to reveal high-frequency patterns but failed to consider the concept of profit. HUIM, on the other hand, obtains the itemsets and is practical in commercial applications. A main challenge in HUIM is that HUIM should handle the exponential search space for HUIM when the number of distinct items and the size of the database are both too large. The other challenge is that existing HUIM methods overlook the length of high-utility itemsets; hence, a large itemset gets an unreasonable estimated profit as opposed to the actual value. Therefore, several algorithms were proposed to mine high average-utility itemsets. High average-utility itemset mining (HAUIM) is an extension for the traditional HUIM, which provides a different measure with HUIM. It discovers utility patterns by considering both their utilities and lengths. To reduce the searching space in HAUIM, average-utility upper-bound, looser upper-bound utility, and a revised tighter upper-bound model are proposed to prune the searching graph in HAUIM. These three upper-bounds for high average-utility itemsets decrease the number of candidate patterns efficiently. However, they still overestimate a high average-utility itemset and waste on assessing the unnecessary patterns. Two new tighter upper-bounds, maximum following utility upper-bound and top-k transaction-maximum utility upper-bound, are proposed in this paper to further contract the size of candidate pattern set. Experiments conducted on several benchmark data sets show that the proposed method outperforms the previous HAUIM algorithms in terms of runtime, the number of join operations, and scalability.

TUB-HAUPM: Tighter Upper Bound for Mining High Average-Utility Patterns

Journal

IEEE ACCESS

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

Keywords

Categories

Funding

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

TUB-HAUPM: Tighter Upper Bound for Mining High Average-Utility Patterns

Journal

IEEE ACCESS

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

Keywords

Categories

Funding

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

Export Citation

Share Paper