Article

Efficient mining of cross-level high-utility itemsets in taxonomy quantitative databases

Journal

INFORMATION SCIENCES
Volume 587, Pages 41-62

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.ins.2021.12.017

Keywords

Cross-level itemsets; High-utility itemsets; Taxonomy; Hierarchical database; Data mining


In contrast to frequent itemset mining algorithms, high-utility itemset mining algorithms focus on identifying the most profitable sets of items in transaction databases. However, most existing algorithms overlook item categorizations, which provide useful information in real-world transaction databases. To address this limitation, this study introduces a novel algorithm called FEACP, which efficiently identifies high-utility itemsets at different abstraction levels by incorporating effective pruning strategies. Performance evaluation shows that FEACP is considerably faster and uses less memory than the state-of-the-art algorithm on both sparse and dense databases.
In contrast to frequent itemset mining (FIM) algorithms, which focus on identifying itemsets with high occurrence frequency, high-utility itemset mining algorithms can reveal the most profitable sets of items in transaction databases. Several algorithms have been proposed to perform this task efficiently. Nevertheless, most of them ignore item categorizations, even though this useful information is available in many real-world transaction databases. Previous algorithms, such as CLH-Miner and ML-HUI Miner, were proposed to address this limitation by discovering cross-level and multi-level high-utility itemsets (HUIs). However, CLH-Miner has a long runtime and high memory usage. To address these drawbacks, this study extends tight upper bounds to propose effective pruning strategies. A novel algorithm named FEACP (Fast and Efficient Algorithm for Cross-level high-utility Pattern mining) is introduced, which adopts the proposed strategies to efficiently identify cross-level HUIs in taxonomy-based databases. A thorough performance evaluation shows that FEACP can identify useful itemsets of different abstraction levels in transaction databases with high efficiency: it is up to 8 times faster than the state-of-the-art algorithm on the tested sparse databases and up to 177 times faster on the tested dense databases. FEACP also reduces memory usage by up to half compared with the CLH-Miner algorithm. (c) 2021 Elsevier Inc. All rights reserved.
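
To make the notion of a cross-level high-utility itemset concrete, the following is a minimal brute-force sketch in Python. It is not FEACP and uses none of the paper's pruning strategies. It assumes the commonly used definitions: the utility of a leaf item in a transaction is its purchase quantity times its unit profit, and the utility of a generalized (category) item is the sum of the utilities of its descendant leaves that occur in the transaction. The items, profits, taxonomy, transactions, and all function names are hypothetical.

```python
# Hypothetical toy data, not taken from the paper.
PROFIT = {"apple": 2, "banana": 1, "milk": 3}   # unit profit per leaf item
PARENT = {"apple": "fruit", "banana": "fruit",  # taxonomy: child -> parent
          "milk": "dairy"}
DB = [                                          # transactions: item -> quantity
    {"apple": 1, "milk": 2},
    {"banana": 3, "milk": 1},
    {"apple": 2, "banana": 1},
]


def is_descendant(leaf, ancestor):
    """Walk up the taxonomy from leaf and report whether ancestor is reached."""
    node = leaf
    while node in PARENT:
        node = PARENT[node]
        if node == ancestor:
            return True
    return False


def leaves_of(item):
    """Leaf items covered by item: itself if it is a leaf, else its descendants."""
    if item in PROFIT:
        return {item}
    return {leaf for leaf in PROFIT if is_descendant(leaf, item)}


def item_utility(item, transaction):
    """Assumed definition: sum of quantity * profit over covered leaves present."""
    return sum(transaction.get(leaf, 0) * PROFIT[leaf] for leaf in leaves_of(item))


def contains(transaction, item):
    """A transaction contains a (generalized) item if any covered leaf occurs in it."""
    return any(leaf in transaction for leaf in leaves_of(item))


def itemset_utility(itemset, db):
    """Sum the itemset's utility over every transaction containing all its items."""
    total = 0
    for transaction in db:
        if all(contains(transaction, item) for item in itemset):
            total += sum(item_utility(item, transaction) for item in itemset)
    return total
```

On this toy data, `itemset_utility({"fruit", "milk"}, DB)` is 14, whereas the specialized itemsets `{"apple", "milk"}` and `{"banana", "milk"}` have utilities 8 and 6. With a hypothetical minimum utility threshold of 10, only the itemset that mixes a category with a leaf item qualifies, which is the kind of pattern that algorithms ignoring the taxonomy would miss.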
