Article

Parallel data mining techniques on Graphics Processing Unit with Compute Unified Device Architecture (CUDA)

Journal

JOURNAL OF SUPERCOMPUTING
Volume 64, Issue 3, Pages 942-967

Publisher

SPRINGER
DOI: 10.1007/s11227-011-0672-7

Keywords

Parallel computing; CUDA; Data mining; Classification; Clustering; Association rules mining

Funding

  1. Natural Science Foundation of China [70621001/70921061, 70531040]
  2. NVIDIA's Professor Partnership
  3. Graduate University of Chinese Academy of Sciences [085102 GNOO, 085102 HNOO]
  4. Chinese Academy of Sciences

Recent developments in Graphics Processing Units (GPUs) have enabled inexpensive high-performance computing for general-purpose applications. The Compute Unified Device Architecture (CUDA) programming model provides programmers with C-like APIs for exploiting the parallel power of the GPU. Data mining is widely used and has significant applications in various domains. However, current data mining toolkits are not fast enough to meet the requirements of applications with large-scale databases. In this paper, we propose three techniques to speed up fundamental problems in data mining algorithms on the CUDA platform: a scalable thread scheduling scheme for irregular patterns, a parallel distributed top-k scheme, and a parallel high-dimension reduction scheme. They play a key role in our CUDA-based implementations of three representative data mining algorithms: CU-Apriori, CU-KNN, and CU-K-means. These parallel implementations significantly outperform other state-of-the-art implementations on an HP xw8600 workstation with a Tesla C1060 GPU and a quad-core Intel Xeon CPU. Our results show that the GPU + CUDA parallel architecture is feasible and promising for data mining applications.
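The abstract names a parallel distributed top-k scheme without describing it. As a rough illustration of the general idea only, the sketch below splits a list of distances into chunks, selects the k smallest candidates within each chunk independently (the part that could run in parallel, for example one chunk per GPU thread block), and then merges the candidates. It is a CPU-side Python sketch under these assumptions, not the paper's CUDA implementation; the function names `local_top_k` and `distributed_top_k` and the chunking strategy are hypothetical.

```python
import heapq
from typing import List, Sequence, Tuple

Candidate = Tuple[float, int]  # (distance, global index)


def local_top_k(distances: Sequence[float], offset: int, k: int) -> List[Candidate]:
    """Return the k smallest (distance, global index) pairs within one chunk."""
    return heapq.nsmallest(k, ((d, offset + i) for i, d in enumerate(distances)))


def distributed_top_k(distances: Sequence[float], k: int, num_chunks: int) -> List[Candidate]:
    """Select the k smallest distances by per-chunk selection followed by a merge.

    Each chunk is processed independently, so the loop body is the part
    that a parallel implementation would distribute across workers.
    Assumes num_chunks >= 1.
    """
    n = len(distances)
    chunk_size = max(1, -(-n // num_chunks))  # ceiling division, at least 1
    candidates: List[Candidate] = []
    for start in range(0, n, chunk_size):
        chunk = distances[start:start + chunk_size]
        candidates.extend(local_top_k(chunk, start, k))
    # The global top-k must be among the per-chunk top-k candidates.
    return heapq.nsmallest(k, candidates)
```

In a k-nearest-neighbor setting, `distances` would hold the distances from one query point to every reference point, and the returned indices identify the k nearest neighbors. The merge step is correct because any element in the global top-k is necessarily in the top-k of its own chunk.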
