Article

An improved framework of GPU computing for CFD applications on structured grids using OpenACC

Journal

JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING
Volume 156, Pages 64-85

Publisher

ACADEMIC PRESS INC ELSEVIER SCIENCE
DOI: 10.1016/j.jpdc.2021.05.010

Keywords

MPI; OpenACC; CFD; Performance optimization; Structured grid

This study optimizes the performance of a CFD code on structured grids running on multiple GPUs. With the optimizations applied, 16 P100 GPUs and 16 V100 GPUs are up to 30x and 90x faster, respectively, than 16 Xeon CPU E5-2680v4 cores.
This work focuses on improving the multi-GPU performance of a research CFD code on structured grids. MPI and PGI 18.1 OpenACC directives are used to scale the code up to 16 NVIDIA GPUs. Across three different test cases, 16 P100 GPUs and 16 V100 GPUs are up to 30x and 90x faster, respectively, than 16 Xeon CPU E5-2680v4 cores. A series of performance issues related to the scaling of the multi-block CFD code are addressed by applying various optimizations. These include the pack/unpack message method; removing temporary arrays as arguments to procedure calls; allocating global memory for limiters and connected boundary data; reordering nonblocking MPI Isend/Irecv and blocking Wait calls; reducing unnecessary implicit movement of derived type member data between the host and the device; and the use of GPUDirect. Together, they improve compute utilization, memory throughput, and asynchronous progression in the multi-block CFD code using modern programming features. (c) 2021 Elsevier Inc. All rights reserved.
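As a rough illustration of the pack/unpack message method named in the abstract, the sketch below copies one strided boundary face of a structured block into a contiguous buffer, so that a single message can carry it, and scatters a received buffer back into the block. It is not the authors' implementation: the function names `pack_i_face` and `unpack_i_face`, the i-fastest storage layout, and the data clauses are assumptions made for this example. The OpenACC pragmas are ignored by compilers without OpenACC support, in which case the loops simply run on the host.

```c
#include <assert.h>

/* Assumed layout (hypothetical): a block of ni*nj*nk values, i fastest,
 * so cell (i, j, k) lives at q[i + ni * (j + nj * k)]. */

/* Gather the plane i = i_face (nj*nk strided values) into a contiguous
 * buffer that could then be handed to a single send call. */
void pack_i_face(const double *restrict q, double *restrict buf,
                 int ni, int nj, int nk, int i_face)
{
    assert(i_face >= 0 && i_face < ni);
    /* copyin/copyout keep the sketch self-contained; a production code
     * would more likely keep q and buf resident on the device. */
    #pragma acc parallel loop collapse(2) copyin(q[0:ni*nj*nk]) copyout(buf[0:nj*nk])
    for (int k = 0; k < nk; ++k) {
        for (int j = 0; j < nj; ++j) {
            buf[j + nj * k] = q[i_face + ni * (j + nj * k)];
        }
    }
}

/* Scatter a received contiguous buffer back into the plane i = i_face. */
void unpack_i_face(double *restrict q, const double *restrict buf,
                   int ni, int nj, int nk, int i_face)
{
    assert(i_face >= 0 && i_face < ni);
    #pragma acc parallel loop collapse(2) copy(q[0:ni*nj*nk]) copyin(buf[0:nj*nk])
    for (int k = 0; k < nk; ++k) {
        for (int j = 0; j < nj; ++j) {
            q[i_face + ni * (j + nj * k)] = buf[j + nj * k];
        }
    }
}
```

In a multi-block solver, a pair of routines like these would typically bracket the nonblocking exchange: pack on the sender, post the send and receive, wait, then unpack on the receiver.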
