4.7 Article

FHN: An efficient algorithm for mining high-utility itemsets with negative unit profits

Journal

KNOWLEDGE-BASED SYSTEMS
Volume 111, Issue -, Pages 283-298

Publisher

ELSEVIER SCIENCE BV
DOI: 10.1016/j.knosys.2016.08.022

Keywords

High-utility itemset mining; Transaction database; Negative unit profit; PNU-list; Pruning strategies

Funding

  1. National Science and Engineering Research Council (NSERC) of Canada
  2. Tencent Project [CCF-TencentRAGR20140114]

Ask authors/readers for more resources

High utility itemset mining is an emerging data mining task, which consists of discovering highly profitable itemsets (called high utility itemsets) in very large transactional databases. Many algorithms have been proposed to efficiently discover high utility itemsets but most of them assume that items may only have positive unit profits. However, in real-world transactional databases, items (products) often have positive or negative unit profits. Mining high utility itemsets in a transactional database where items have positive or negative unit profits is a computationally expensive task, and it is thus desirable to design more efficient algorithms. To address this issue, we propose an efficient algorithm named FHN (Faster High-Utility itemset miner with Negative unit profits). It relies on a novel PNU-list structure (Positive and-Negative Utility-list) structure to efficiently mine high utility itemsets, while considering both positive and negative unit profits. Moreover, several pruning strategies are introduced in FHN to reduce the number of candidate itemsets, and thus enhance the performance of FHN. Extensive experimental results on both real-life and synthetic datasets show that the proposed FHN algorithm is in general two to three orders of magnitude faster and can use up to 200 times less memory than the state-of-the-art algorithm HUINIV-Mine. Moreover, it is shown that FHN performs especially well on dense datasets. (C) 2016 Elsevier B.V. All rights reserved.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.7
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available