On the Efficient Representation of Datasets as Graphs to Mine Maximal Frequent Itemsets

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING (2021)

Journal

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING

Volume 33, Issue 4, Pages 1674-1691

Publisher

IEEE COMPUTER SOC

DOI: 10.1109/TKDE.2019.2945573

Keywords

Itemsets; Data mining; Databases; Data structures; Task analysis; Benchmark testing; Machine intelligence; Efficient frequent itemsets extraction; efficient data structure; graph utility; maximal frequent itemsets

Categories

Computer Science, Artificial Intelligence; Computer Science, Information Systems; Engineering, Electrical & Electronic

Funding

GIK Institute graduate program research fund under PSS scheme

Summary

This research introduces a graph-based representation of transactional databases that stores all information relevant to mining frequent itemsets (FIs) in one pass, along with an algorithm for extracting FIs from this structure. Experimental results show that the proposed approach is faster than the methods it is compared against.

Abstract

Frequent itemset mining is an active research problem in data mining and knowledge discovery. With advances in database technology and an exponential increase in the volume of stored data, there is a need for efficient approaches that can quickly extract useful information from large datasets. Frequent itemsets (FIs) mining is the data mining task of finding itemsets in a transactional database that occur together above a certain frequency. Finding these FIs usually requires multiple passes over the database, which makes efficient algorithms crucial. This work presents a graph-based approach for representing a complete transactional database. The proposed representation stores all the information needed to extract FIs in a single pass over the database. An algorithm that extracts the FIs from the graph-based structure is then presented. The proposed approach is compared experimentally with 17 related FI mining methods on six benchmark datasets. The results show that it outperforms the others in terms of running time.
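
To make the terms in the abstract concrete, the following is a minimal brute-force sketch of what "frequent" and "maximal frequent" itemsets mean on a toy transactional database. It is not the graph-based method of the paper, whose details are not given here. The function names, the toy transactions, and the choice of an absolute support count as the frequency threshold are all assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every itemset whose count
    is at least min_support.

    Brute-force enumeration for illustration only (hypothetical helper,
    not the paper's algorithm); it is exponential in the number of items.
    """
    items = sorted({item for t in transactions for item in t})
    frequent = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            cand = frozenset(candidate)
            # Support = number of transactions that contain the candidate.
            count = sum(1 for t in transactions if cand <= t)
            if count >= min_support:
                frequent[cand] = count
                found_any = True
        if not found_any:
            # Every subset of a frequent itemset is frequent, so if no
            # itemset of this size is frequent, no larger one can be.
            break
    return frequent


def maximal_itemsets(frequent):
    """Keep only frequent itemsets with no frequent proper superset."""
    return {s for s in frequent if not any(s < other for other in frequent)}


# Toy database (assumed data): each transaction is a set of items.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"milk", "butter", "bread"}),
]

fis = frequent_itemsets(transactions, min_support=3)
mfis = maximal_itemsets(fis)
```

With a threshold of 3, the single items and the pairs {bread, milk} and {bread, butter} are frequent, and only those two pairs are maximal. Note that this sketch rescans the transaction list for every candidate, which is the kind of repeated pass over the data that the abstract says the one-pass representation is meant to avoid.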
