4.7 Article

Parallelized domain decomposition for multi-dimensional Lagrangian random walk mass-transfer particle tracking schemes

Journal

GEOSCIENTIFIC MODEL DEVELOPMENT
Volume 16, Issue 3, Pages 833-849

Publisher

COPERNICUS GESELLSCHAFT MBH
DOI: 10.5194/gmd-16-833-2023

Keywords

-

Ask authors/readers for more resources

This article presents a parallelized domain decomposition strategy for mass-transfer particle tracking methods. Different tiling techniques for two and three dimensions are investigated and an optimal tiling is determined based on problem parameters and the number of CPU cores. The efficiency of the parallelized algorithms is demonstrated with simulations of a pure diffusion problem.
Lagrangian particle tracking schemes allow a wide range of flow and transport processes to be simulated accurately, but a major challenge is numerically implementing the inter-particle interactions in an efficient manner. This article develops a multi-dimensional, parallelized domain decomposition (DDC) strategy for mass-transfer particle tracking (MTPT) methods in which particles exchange mass dynamically. We show that this can be efficiently parallelized by employing large numbers of CPU cores to accelerate run times. In order to validate the approach and our theoretical predictions we focus our efforts on a well-known benchmark problem with pure diffusion, where analytical solutions in any number of dimensions are well established. In this work, we investigate different procedures for tiling the domain in two and three dimensions (2-D and 3-D), as this type of formal DDC construction is currently limited to 1-D. An optimal tiling is prescribed based on physical problem parameters and the number of available CPU cores, as each tiling provides distinct results in both accuracy and run time. We further extend the most efficient technique to 3-D for comparison, leading to an analytical discussion of the effect of dimensionality on strategies for implementing DDC schemes. Increasing computational resources (cores) within the DDC method produces a trade-off between inter-node communication and on-node work.For an optimally subdivided diffusion problem, the 2-D parallelized algorithm achieves nearly perfect linear speedup in comparison with the serial run-up to around 2700 cores, reducing a 5 h simulation to 8 s, while the 3-D algorithm maintains appreciable speedup up to 1700 cores.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.7
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available