4.5 Article

Performance Optimization Using Partitioned SpMV on GPUs and Multicore CPUs

Journal

IEEE TRANSACTIONS ON COMPUTERS
Volume 64, Issue 9, Pages 2623-2636

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TC.2014.2366731

Keywords

GPU; matrix partition; multicore CPU; probability distribution; sparse matrix-vector multiplication

Funding

  1. Key Program of National Natural Science Foundation of China [61133005, 61432005]
  2. National Natural Science Foundation of China [61370095, 61472124]
  3. Hunan Science and Technology Commission [2013sk2014]
  4. Scientific Research Fund of Hunan Provincial Education Department [13A011]

Ask authors/readers for more resources

This paper presents a sparse matrix partitioning strategy to improve the performance of SpMV on GPUs and multicore CPUs. This method has wide adaptability for different types of sparse matrices, and is different from existing methods which only adapt to some particular sparse matrices. In addition, our partitioning method can obtain dense blocks by analyzing the probability distribution of non-zero elements in a sparse matrix, and result in very low proportion of zero padded. We make the following significant contributions. (1) We present a partitioning strategy of sparse matrices based on probabilistic modeling of non-zero elements in a row. (2) We prove that our method has the highest mean density compared with other strategies according to certain given ratios of partition obtained from the computing powers of heterogeneous processors. (3) We develop a CPU-GPU hybrid parallel computing model for SpMV on GPUs and multicore CPUs in a heterogeneous computing platform. Our partitioning strategy has balanced load distribution and the performance of SpMV is significantly improved when a sparse matrix is partitioned into dense blocks using our method. The average performance improvement of our solution for SpMV is about 15.75 percent on multicore CPUs, compared to that of the other solutions. By considering the rows of a matrix in a unique order based on the probability mass function of the number of non-zeros in a row, the average performance improvement of our solution for SpMV is about 33.52 percent on GPUs and multicore CPUs of a heterogeneous computing platform, compared to that of the partitioning methods based on the original row order of a matrix.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.5
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available