☆ 4.7 Article

GPU implementation of the discrete unified gas kinetic scheme for low-speed isothermal flows

COMPUTER PHYSICS COMMUNICATIONS (2024)

Journal

COMPUTER PHYSICS COMMUNICATIONS

Volume 294, Issue -, Pages -

Publisher

ELSEVIER

DOI: 10.1016/j.cpc.2023.108908

Keywords

DUGKS; Multi-scale flows; GPU acceleration; CUDA

Categories

Computer Science, Interdisciplinary Applications; Physics, Mathematical


Summary

This paper proposes two GPU parallel algorithms for simulating low-speed isothermal flows with the discrete unified gas kinetic scheme (DUGKS). Their performance is evaluated on benchmark problems, where speedups of up to 250 (2D) and 338 (3D) are reported on a Tesla V100 GPU. Algorithm-I is faster on large physical meshes, while Algorithm-II is more efficient on coarser meshes with a medium number of discrete velocities.

Abstract

This paper proposes two GPU parallel algorithms for the discrete unified gas kinetic scheme (DUGKS) applied to low-speed isothermal flows. Algorithm-I uses a two-level fine-grain technique to parallelize the physical space, while Algorithm-II applies this technique to both the physical space and the particle velocity space. To evaluate the performance of the proposed algorithms, several typical benchmark problems are simulated, including two-dimensional (2D) and three-dimensional (3D) lid-driven cavity flows and micro channel and cavity flows. Numerical results show that both GPU algorithms achieve satisfactory computational efficiency. Algorithm-I reaches speedups of 250 and 338 on a Tesla V100 GPU card for the 2D and 3D continuum cavity flows, respectively, and a hundredfold acceleration for the rarefied cases. Algorithm-II attains a speedup of about 70 for the rarefied cases; it is not applied to continuum problems, which require only a small number of velocity points. The two GPU algorithms are also compared for rarefied flows with various grid meshes and velocity directions. The results show that Algorithm-I performs better when the physical mesh is large, while Algorithm-II is more efficient for a coarser mesh with a medium number of discrete velocities. Algorithm-I is also compared with an MPI parallelization on 128 CPU cores based on a physical space discretization approach, and Algorithm-I on the V100 GPU is found to have a clear advantage for sparse physical grids in both continuum and rarefied cases. (c) 2023 Elsevier B.V. All rights reserved.
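
The difference between the two parallelization strategies can be illustrated with a small sketch that only counts logical GPU threads and the work each one carries. This is not the authors' implementation: the function names, the one-thread-per-cell and one-thread-per-(cell, velocity) mappings, the reading of "two-level" as a CUDA-style grid of thread blocks, the block size of 256, and the mesh and velocity-set sizes are all assumptions made for illustration.

```python
import math


def launch_config(n_items, threads_per_block=256):
    """Number of thread blocks needed to cover n_items (assumed block/thread hierarchy)."""
    return math.ceil(n_items / threads_per_block)


def algorithm_one(n_cells, n_velocities, threads_per_block=256):
    # Assumed mapping: one thread per physical cell; each thread loops
    # serially over every discrete velocity of its cell.
    return {
        "blocks": launch_config(n_cells, threads_per_block),
        "threads": n_cells,
        "work_per_thread": n_velocities,
    }


def algorithm_two(n_cells, n_velocities, threads_per_block=256):
    # Assumed mapping: one thread per (cell, velocity) pair, so the
    # velocity space is parallelized as well as the physical space.
    n_pairs = n_cells * n_velocities
    return {
        "blocks": launch_config(n_pairs, threads_per_block),
        "threads": n_pairs,
        "work_per_thread": 1,
    }


# Hypothetical rarefied case: 64 x 64 cells, 28 x 28 discrete velocities.
coarse_rarefied_one = algorithm_one(64 * 64, 28 * 28)
coarse_rarefied_two = algorithm_two(64 * 64, 28 * 28)

# Hypothetical continuum case: 256 x 256 cells, 9 discrete velocities.
continuum_one = algorithm_one(256 * 256, 9)
```

Under these assumptions, the coarse rarefied case exposes only 4096 threads with the first mapping but over three million with the second, which is consistent with the reported advantage of Algorithm-II on coarser meshes. In the continuum case the velocity set is so small that parallelizing over it adds little.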
