4.7 Article

An efficient method for mining high occupancy itemsets based on equivalence class and early pruning

Journal

KNOWLEDGE-BASED SYSTEMS
Volume 267, Issue -, Pages -

Publisher

ELSEVIER
DOI: 10.1016/j.knosys.2023.110441

Keywords

High occupancy itemset; Occupancy pattern; Data mining; Equivalence class; Pruning candidates

Many researchers are exploring and using a new trend in data mining called high occupancy itemset mining. This research applies an occupancy measure to a support-based mining framework, bringing benefits to decision support systems and enabling managers to visualize reports and analyze data more efficiently.
Many researchers have been investigating and applying a new trend in data mining, namely high occupancy itemset mining. Frequent itemset mining often returns a large set of itemsets, but businesses need a smaller set of inputs to investigate or to feed into a recommendation system so that decisions can be made quickly. Applying an occupancy measure to a support-based mining framework therefore brings many benefits for decision support systems, and gives managers a new method to visualize reports and analyze data more efficiently. Like frequent itemset mining, high occupancy itemset mining can be applied to any transaction database. In this research, we apply additional conditions to eliminate unqualified itemsets and use the equivalence class property to reduce the runtime of the k-itemset generation process. Moreover, a new theorem is stated and applied to a specific class of databases so that the upper-bound occupancy does not need to be calculated, which speeds up the generation of high occupancy itemsets and reduces its memory requirements. We develop two new algorithms to solve the problem: fast high occupancy itemset mining (FHOI) and depth-first search (DFS) for high occupancy itemset mining (DFHOI). The new algorithms are examined experimentally on different databases to evaluate their performance in terms of runtime and memory usage. (c) 2023 Elsevier B.V. All rights reserved.
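
To make the ideas in the abstract concrete, the following Python sketch enumerates high occupancy itemsets by depth-first search over prefix-based equivalence classes. It is not the FHOI or DFHOI algorithm from the paper. The abstract does not define occupancy, so the sketch assumes a definition commonly used in the high occupancy itemset literature: the occupancy of an itemset X in a supporting transaction T is |X| / |T|, and the occupancy of X is the sum of these ratios over all transactions that contain X. The function name `mine_high_occupancy`, the parameter `min_occ`, and the sample data are hypothetical, and the sketch prunes only extensions with an empty transaction set; it does not implement the paper's upper-bound occupancy or its early pruning conditions.

```python
from typing import Dict, List, Sequence, Set, Tuple

Itemset = Tuple[str, ...]


def mine_high_occupancy(
    transactions: Sequence[Sequence[str]], min_occ: float
) -> Dict[Itemset, float]:
    """Return every itemset whose (assumed) occupancy is at least min_occ."""
    # Vertical layout: item -> ids of the transactions that contain it.
    tidsets: Dict[str, Set[int]] = {}
    for tid, transaction in enumerate(transactions):
        for item in set(transaction):
            tidsets.setdefault(item, set()).add(tid)
    # Transaction lengths, counting distinct items only.
    sizes = [len(set(t)) for t in transactions]
    results: Dict[Itemset, float] = {}

    def occupancy(itemset: Itemset, tids: Set[int]) -> float:
        # Assumed definition: sum of |X| / |T| over supporting transactions.
        return sum(len(itemset) / sizes[tid] for tid in tids)

    def dfs(eq_class: List[Tuple[Itemset, Set[int]]]) -> None:
        # All members of eq_class share the same prefix (all but the last item).
        for i, (items_i, tids_i) in enumerate(eq_class):
            occ = occupancy(items_i, tids_i)
            if occ >= min_occ:
                results[items_i] = occ
            # Join with later members of the same class to form (k+1)-itemsets.
            next_class = []
            for items_j, tids_j in eq_class[i + 1:]:
                shared = tids_i & tids_j
                if shared:  # only pruning here: drop unsupported extensions
                    next_class.append((items_i + (items_j[-1],), shared))
            if next_class:
                dfs(next_class)

    dfs([((item,), tids) for item, tids in sorted(tidsets.items())])
    return results


if __name__ == "__main__":
    sample = [["a", "b"], ["a", "b", "c"], ["a"], ["b", "c"]]
    for itemset, occ in sorted(mine_high_occupancy(sample, 1.5).items()):
        print(itemset, round(occ, 4))
```

Because every candidate in a class shares a prefix, a (k+1)-itemset is formed by intersecting two transaction-id sets rather than rescanning the database, which is the general motivation for equivalence-class joins. An upper bound on occupancy, as mentioned in the abstract, would let the search skip whole classes; that step is deliberately left out here.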
