Article

Parallel frequent itemsets mining using distributed graphic processing units

Journal

MULTIMEDIA TOOLS AND APPLICATIONS
Volume 81, Issue 30, Pages 43873-43895

Publisher

SPRINGER
DOI: 10.1007/s11042-022-13225-z

Keywords

Frequent itemset mining; Apriori; GPGPU; Distributed architecture; CUDA

Data mining is an essential technique for pattern extraction and information classification. Association rule mining is an important data mining technique that extracts useful rules and knowledge by considering the relationships and associations in the data. Parallelizing the mining process across multiple GPUs can significantly reduce execution time and improve efficiency.
Data mining is an essential technique in knowledge discovery, widely used for pattern extraction and information classification. Association rule mining (ARM) is an important data mining technique that extracts useful rules and knowledge by considering the relationships and associations in the data. Extracting frequent patterns and association rules requires several scans of the dataset, which makes the process time-consuming. The discovery of frequent patterns is the major phase of the ARM process and is very expensive in terms of execution time. Powerful parallel systems with multiple graphics processing units (GPUs) and general-purpose graphics processing units (GPGPUs) are appropriate choices for reducing the execution time. Although GPU architectures can speed up the mining process, a single GPU is usually unable to handle a large amount of data when extracting frequent patterns. It is therefore necessary to use multiple GPUs in one system, or to distribute them across a network, to improve the efficiency of parallelization. This paper proposes a new framework, called GPApbmp, that parallelizes the Apriori algorithm, a well-known level-wise frequent pattern mining method, across multiple GPUs for faster extraction of association rules. The proposed framework distributes the dataset over multiple GPUs and uses a vertical approach to reduce the execution time and the number of database scans in the Apriori method. The experimental results on standard datasets show that the proposed method reduces the execution time and speeds up the mining process. The results were obtained from two and four parallelized NVidia GeForce 710 processors and evaluated in CUDA.
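
The abstract does not describe GPApbmp's internals, so the following is only a minimal CPU-side sketch of the general idea it names: level-wise Apriori over a vertical layout, with the dataset split into partitions. Each partition stands in for one GPU's share of the data, and the loop over partitions is sequential here, whereas a real system would process them concurrently. Every name (`build_vertical`, `local_support`, `apriori_partitioned`, `n_parts`) is hypothetical and not taken from the paper. Each item is mapped to a bitmask of the transactions that contain it, so the support of an itemset is the population count of the AND of its items' masks, and the horizontal dataset is read only once, when the masks are built.

```python
from itertools import combinations


def build_vertical(transactions):
    """Map each item to a bitmask whose set bits mark the transactions containing it."""
    vertical = {}
    for tid, items in enumerate(transactions):
        for item in items:
            vertical[item] = vertical.get(item, 0) | (1 << tid)
    return vertical


def local_support(vertical, itemset, n_transactions):
    """Count the transactions of one partition that contain every item of itemset."""
    mask = (1 << n_transactions) - 1  # start with all transactions selected
    for item in itemset:
        mask &= vertical.get(item, 0)  # keep only transactions that also hold this item
    return bin(mask).count("1")


def apriori_partitioned(transactions, min_support, n_parts=2):
    """Level-wise Apriori; support counts are summed over n_parts partitions.

    Assumption: each partition plays the role of one device's share of the data.
    """
    parts = [transactions[i::n_parts] for i in range(n_parts)]
    verticals = [(build_vertical(p), len(p)) for p in parts]

    def support(itemset):
        # Global support is the sum of the per-partition supports.
        return sum(local_support(v, itemset, n) for v, n in verticals)

    frequent = {}
    level = [(item,) for item in sorted({i for t in transactions for i in t})]
    while level:
        current = {}
        for candidate in level:
            count = support(candidate)
            if count >= min_support:
                current[candidate] = count
        frequent.update(current)

        # Join step: merge frequent k-itemsets that share their first k-1 items.
        level = []
        for a, b in combinations(sorted(current), 2):
            if a[:-1] == b[:-1]:
                candidate = a + (b[-1],)
                # Prune step: every k-subset of the candidate must itself be frequent.
                if all(s in current for s in combinations(candidate, len(a))):
                    level.append(candidate)
    return frequent


if __name__ == "__main__":
    data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, count in sorted(apriori_partitioned(data, min_support=3).items()):
        print(itemset, count)
```

Because per-partition supports are simply added, the result does not depend on how the transactions are split; in a multi-GPU setting only those per-partition counts would need to be exchanged at each level, not the transactions themselves.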
