☆ 4.7 Article

Heterogeneous CPU plus GPU parallelization for high-accuracy scale-resolving simulations of compressible turbulent flows on hybrid supercomputers

COMPUTER PHYSICS COMMUNICATIONS (2022)

Journal

COMPUTER PHYSICS COMMUNICATIONS

Volume 271, Issue -, Pages -

Publisher

ELSEVIER

DOI: 10.1016/j.cpc.2021.108231

Keywords

Scale-resolving simulation; Unstructured mesh; Heterogeneous computing; CPU plus GPU; MPI plus OpenMP plus OpenCL; Hybrid supercomputer

Categories

Computer Science, Interdisciplinary Applications; Physics, Mathematical

Funding

Russian Science Foundation [19-11-00299]

Summary

This paper presents a heterogeneous parallel algorithm and its software implementation for simulating compressible turbulent flows. The algorithm is based on a family of higher-accuracy edge-based reconstruction schemes on unstructured mixed-element meshes. The parallel solution can use a large number of computing devices across various computing architectures, including manycore CPUs and GPUs. The paper describes the parallel algorithm and its efficient implementation in detail and demonstrates parallel performance on different supercomputers.

Abstract

A heterogeneous parallel algorithm for simulation of compressible turbulent flows and its portable software implementation are presented. The underlying numerical method is based on a family of higher-accuracy edge-based reconstruction schemes on unstructured mixed-element meshes. The proposed parallel solution can engage a large number of computing devices of most of the existing computing architectures used in modern supercomputers, including manycore CPUs and GPUs. It is capable of co-execution on both CPUs and accelerators simultaneously. The multilevel parallel algorithm combines MPI for distributing workload among hybrid cluster nodes and between devices inside nodes; OpenMP for manycore CPUs and other supporting devices, such as Intel Xeon Phi; and OpenCL for massively parallel accelerators, such as GPUs of various vendors, including NVIDIA, AMD, and Intel. The main focus is on the adaptation of the numerical method and its computational algorithm to the stream processing parallel paradigm. The very limited device memory inherent in GPU computing is also taken into account. A detailed description of the parallel algorithm is presented, as well as the techniques used for its efficient parallel implementation. Special attention is paid to implicit time integration with its linear solver and to the calculation of convective fluxes and viscous terms. The use of mixed floating-point precision and the overlapping of communications and computations are also discussed. Parallel performance is demonstrated in practical applications on different kinds of supercomputers using up to 10 thousand cores and multiple GPUs of comparable overall performance. © 2021 Elsevier B.V. All rights reserved.
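
To make the idea of distributing workload between the devices inside a node more concrete, the following is a minimal sketch of one plausible approach: splitting mesh cells among a CPU and GPUs in proportion to their relative throughput. The proportional rule, the function name `split_workload`, the device labels, and the weight values are all assumptions made for illustration; the abstract does not state how the actual partitioning is performed.

```python
from typing import Dict


def split_workload(n_cells: int, throughput: Dict[str, float]) -> Dict[str, int]:
    """Assign mesh cells to devices in proportion to their relative throughput.

    Hypothetical helper: the proportional rule and all names here are
    assumptions for illustration, not the paper's partitioning scheme.
    """
    if n_cells < 0:
        raise ValueError("n_cells must be non-negative")
    if not throughput or any(w <= 0 for w in throughput.values()):
        raise ValueError("every device needs a positive throughput weight")

    total = sum(throughput.values())
    # Ideal (fractional) share of cells for each device.
    ideal = {dev: n_cells * w / total for dev, w in throughput.items()}
    # Start from the integer part of each share.
    counts = {dev: int(share) for dev, share in ideal.items()}
    # Hand the leftover cells to the devices with the largest fractional parts.
    leftover = n_cells - sum(counts.values())
    by_remainder = sorted(ideal, key=lambda dev: ideal[dev] - counts[dev], reverse=True)
    for dev in by_remainder[:leftover]:
        counts[dev] += 1
    return counts


if __name__ == "__main__":
    # One node with a CPU and two GPUs; the weights are made-up relative speeds.
    print(split_workload(1_000_000, {"cpu": 1.0, "gpu0": 4.0, "gpu1": 4.0}))
```

In a co-execution setting, weights of this kind would presumably be measured on the target hardware rather than fixed in advance, since the balance between CPU and GPU performance varies from machine to machine.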
