Article

Shared-memory block-based fast marching method for hierarchical meshes

Journal

Publisher

ELSEVIER
DOI: 10.1016/j.cam.2021.113488

Keywords

Fast marching method; Eikonal equation; Level-set method; Re-distancing; Hierarchical meshes; Shared-memory parallelization

Funding

  1. Austrian Federal Ministry for Digital and Economic Affairs
  2. National Foundation for Research, Technology and Development, Austria
  3. TU Wien Bibliothek, Austria


In this study, the multi-mesh fast marching method is extended by a block-based decomposition step to enhance serial and parallel performance on hierarchical meshes. The approach improves load balancing by using a high mesh partitioning degree, which makes it possible to balance mesh partitions of varying sizes. Benchmarks and parameter studies on representative geometries of varying complexity show increased serial performance and parallel speedups on a 24-core Intel Skylake computing platform.
The fast marching method is commonly used in expanding front simulations in various fields, such as fluid dynamics, computer graphics, and microelectronics, to restore the signed-distance field property of the level-set function, a step also known as re-distancing. To improve the performance of the re-distancing step, parallel algorithms for the fast marching method as well as support for hierarchical meshes have been developed; the latter locally supports higher resolutions of the simulation domain whilst limiting the impact on the overall computational demand. In this work, the previously developed multi-mesh fast marching method is extended by a so-called block-based decomposition step to improve serial and parallel performance on hierarchical meshes. OpenMP tasks are used for the underlying coarse-grained parallelization on a per-mesh basis. The developed approach offers improved load balancing because the algorithm employs a high mesh partitioning degree, which makes it possible to balance mesh partitions of varying sizes. Various benchmarks and parameter studies are performed on representative geometries of varying complexity. The serial performance is increased by up to 21%, whereas parallel speedups ranging from 7.4 to 19.1 have been achieved for various test cases on a 24-core Intel Skylake computing platform, effectively doubling the parallel efficiency of the previous approach. © 2021 The Author(s). Published by Elsevier B.V.
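For readers unfamiliar with the underlying technique, the following is a minimal sketch of the classic serial, first-order fast marching method on a single uniform 2D grid. It is not the block-based multi-mesh algorithm described in the abstract and contains no parallelization or hierarchical meshes; the function name `fast_march`, its parameters, and the grid layout are illustrative assumptions. It only shows the basic idea of re-distancing: starting from seed points with known distance, grid points are accepted in order of increasing distance using a priority queue, and each neighbor is updated by solving the discretized Eikonal equation |grad d| = 1.

```python
import heapq
import math


def fast_march(known, shape, h=1.0):
    """First-order fast marching on a uniform 2D grid (illustrative sketch).

    known: dict mapping (row, col) -> distance for the seed points
           (hypothetical input format, e.g. points next to the interface).
    shape: (rows, cols) of the grid.
    h:     grid spacing.
    Returns a nested list with the unsigned distance at every grid point.
    """
    ny, nx = shape
    dist = [[math.inf] * nx for _ in range(ny)]
    frozen = [[False] * nx for _ in range(ny)]
    heap = []
    for (i, j), d in known.items():
        dist[i][j] = d
        heapq.heappush(heap, (d, i, j))

    def accepted(i, j):
        # Value of an already accepted (frozen) point, otherwise infinity.
        if 0 <= i < ny and 0 <= j < nx and frozen[i][j]:
            return dist[i][j]
        return math.inf

    def solve(i, j):
        # Upwind update: smallest accepted neighbor along each axis.
        a = min(accepted(i - 1, j), accepted(i + 1, j))
        b = min(accepted(i, j - 1), accepted(i, j + 1))
        if a > b:
            a, b = b, a
        if b - a >= h:
            # Only one axis contributes to the update.
            return a + h
        # Both axes contribute: solve (d - a)^2 + (d - b)^2 = h^2 for d.
        return 0.5 * (a + b + math.sqrt(2.0 * h * h - (a - b) ** 2))

    while heap:
        d, i, j = heapq.heappop(heap)
        if frozen[i][j]:
            continue  # stale heap entry, point was already accepted
        frozen[i][j] = True
        for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
            if 0 <= ni < ny and 0 <= nj < nx and not frozen[ni][nj]:
                new = solve(ni, nj)
                if new < dist[ni][nj]:
                    dist[ni][nj] = new
                    heapq.heappush(heap, (new, ni, nj))
    return dist
```

Calling `fast_march({(2, 2): 0.0}, (5, 5))` computes distances from a single seed at the grid center. The strictly ordered heap is what makes the classic method hard to parallelize, which is the motivation for decomposition-based approaches such as the one summarized above.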

