PaScaL_TDMA 2.0: A multi-GPU-based library for solving massive tridiagonal systems

Journal

COMPUTER PHYSICS COMMUNICATIONS
Volume 290

Publisher

ELSEVIER
DOI: 10.1016/j.cpc.2023.108785

Keywords

CUDA; GPU computing; Multi-GPU; Tridiagonal matrix systems

We introduce PaScaL_TDMA 2.0, an updated version of a library originally designed for the efficient computation of batched tridiagonal systems, which can now exploit multi-GPU environments. The library adds GPU support and minimizes CPU-GPU data transfer by using device-resident memory, while retaining the original CPU-based capabilities. It employs pipeline copying with shared memory for low-latency memory access and incorporates CUDA-aware MPI for efficient multi-GPU communication. Our GPU implementation demonstrated outstanding computational performance compared to the original CPU implementation while consuming much less energy. In summary, this updated version presents a time-efficient and energy-saving approach for solving batched tridiagonal systems on modern computing platforms, on both GPUs and CPUs.
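
For readers unfamiliar with the problem class, the sketch below shows what "solving batched tridiagonal systems" means in its simplest serial form, using the classic Thomas algorithm (TDMA). It is only an illustration: the function names `thomas_solve` and `solve_batch` are hypothetical, it is not the PaScaL_TDMA API, and it does not reproduce the library's multi-GPU or MPI-parallel method. It assumes each system can be solved without pivoting (for example, a diagonally dominant matrix).

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve one tridiagonal system with the Thomas algorithm (TDMA).

    lower[i] multiplies x[i-1] (lower[0] is unused),
    diag[i]  multiplies x[i],
    upper[i] multiplies x[i+1] (upper[-1] is unused).
    Assumption: no pivoting is required (e.g. diagonal dominance).
    """
    n = len(diag)
    c = [0.0] * n  # modified upper-diagonal coefficients
    d = [0.0] * n  # modified right-hand side

    # Forward elimination: remove the lower diagonal row by row.
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom

    # Back substitution: recover the unknowns from last to first.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def solve_batch(systems):
    """Solve many independent systems, each a (lower, diag, upper, rhs) tuple.

    The systems are independent of one another, which is what makes a
    batched problem a natural fit for parallel hardware.
    """
    return [thomas_solve(*system) for system in systems]
```

Each system costs O(n) operations, and the systems in a batch do not depend on one another, so a parallel implementation is free to distribute them across processing units.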
