4.7 Article

Large-scale flow simulations using lattice Boltzmann method with AMR following free-surface on multiple GPUs

Journal

COMPUTER PHYSICS COMMUNICATIONS
Volume 264

Publisher

ELSEVIER
DOI: 10.1016/j.cpc.2021.107871

Keywords

Lattice Boltzmann method; Free-surface flow; Adaptive mesh refinement; GPU; Large-scale simulation

Funding

  1. KAKENHI from Japan Society for the Promotion of Science (JSPS) [JP26220002, JP19H05613]
  2. Joint Usage/Research Center for Interdisciplinary Large-scale Information Infrastructures (JHPCN), Japan [jh180034, jh180035]
  3. Japan Society for the Promotion of Science (JSPS) in Japan [JP17J09945]
  4. High Performance Computing Infrastructure (HPCI), Japan [hp190130]

This paper presents a numerical method for large-scale free-surface flow simulations using the lattice Boltzmann method and multiple GPUs. Introducing the adaptive mesh refinement method greatly reduces the number of lattice points, and dynamic domain partitioning using a space-filling curve keeps the number of lattice points on each GPU equal.
Free-surface flow simulations require high-resolution grids to capture phenomena at the interface, as well as long computation times. In this paper, we propose a numerical method for realizing large-scale free-surface flow simulations using the lattice Boltzmann method and multiple GPUs. By introducing the adaptive mesh refinement (AMR) method, which adapts high-resolution grids to free surfaces, into the lattice Boltzmann method, the number of lattice points can be greatly reduced. In AMR calculations, the spatial distribution of the computational load changes with time; therefore, the number of lattice points assigned to each GPU is kept equal by dynamic domain partitioning using a space-filling curve. We measured the weak scalability of our AMR code on the TSUBAME3.0 supercomputer at the Tokyo Institute of Technology. By hiding GPU-GPU communication overheads with the overlapping method, the performance increased to 1.29 times that of the naive implementation, and we achieved a fairly high performance of 14,570 MLUPS (million lattice updates per second) using 256 GPUs. We demonstrate large-scale simulations of the dam-breaking problem and show a reduction in computational cost with the AMR method. (c) 2021 Elsevier B.V. All rights reserved.
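The load-balancing idea in the abstract, partitioning along a space-filling curve so that each GPU holds roughly the same number of lattice points, can be illustrated with a minimal sketch. It assumes a Morton (Z-order) curve, chosen here only for simplicity; the abstract does not say which curve the authors use. The names `morton_key`, `partition_blocks`, and `gpu_loads`, and the representation of the mesh as blocks with integer coordinates and a lattice-point count, are hypothetical and not taken from the paper.

```python
from typing import List, Sequence, Tuple

# Hypothetical block description: ((ix, iy, iz), number_of_lattice_points)
Block = Tuple[Tuple[int, int, int], int]


def morton_key(ix: int, iy: int, iz: int, bits: int = 10) -> int:
    """Interleave the bits of (ix, iy, iz) into a Z-order (Morton) key."""
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (3 * b)
        key |= ((iy >> b) & 1) << (3 * b + 1)
        key |= ((iz >> b) & 1) << (3 * b + 2)
    return key


def partition_blocks(blocks: Sequence[Block], num_gpus: int) -> List[int]:
    """Return the owning GPU rank of each block (in input order).

    Blocks are ordered along the curve, then the curve is cut into
    num_gpus contiguous segments of roughly equal lattice-point count.
    """
    order = sorted(range(len(blocks)), key=lambda i: morton_key(*blocks[i][0]))
    total = sum(n for _, n in blocks)
    owner = [0] * len(blocks)
    running = 0  # lattice points already placed along the curve
    for i in order:
        n = blocks[i][1]
        # Assign by the midpoint of this block's interval on the curve.
        rank = int((running + n / 2) * num_gpus / total)
        owner[i] = min(num_gpus - 1, rank)
        running += n
    return owner


def gpu_loads(blocks: Sequence[Block], owner: Sequence[int], num_gpus: int) -> List[int]:
    """Sum the lattice points assigned to each GPU."""
    loads = [0] * num_gpus
    for (_, n), r in zip(blocks, owner):
        loads[r] += n
    return loads


if __name__ == "__main__":
    # A 4 x 4 x 1 arrangement of equally sized blocks split across 4 GPUs.
    blocks = [((x, y, 0), 1) for x in range(4) for y in range(4)]
    owner = partition_blocks(blocks, 4)
    print(gpu_loads(blocks, owner, 4))
```

In a dynamic setting, the partition would be recomputed whenever refinement changes the per-block lattice-point counts. Because each GPU owns a contiguous segment of the curve, its blocks tend to stay spatially close, which helps limit communication between GPUs.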
