Esoteric Pull and Esoteric Push: Two Simple In-Place Streaming Schemes for the Lattice Boltzmann Method on GPUs

COMPUTATION (2022)

Journal

COMPUTATION

Volume 10, Issue 6

Publisher

MDPI

DOI: 10.3390/computation10060092

Keywords

lattice Boltzmann method; GPU; in-place streaming; swap algorithm; Esoteric Twist; memory; memory bandwidth; Volume-of-Fluid; FluidX3D; OpenCL

Funding

Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) [391977956-SFB 1357]

Automated Summary

This study introduces two novel thread-safe in-place streaming schemes for the lattice Boltzmann method on GPUs. These schemes reduce memory demand by only requiring one copy of the density distribution functions. They improve performance through optimized memory coalescence and are compatible with different devices and automatic code generation.

Abstract

I present two novel thread-safe in-place streaming schemes for the lattice Boltzmann method (LBM) on graphics processing units (GPUs), termed Esoteric Pull and Esoteric Push, with which the LBM requires only one copy of the density distribution functions (DDFs) instead of two, greatly reducing memory demand. These build upon the idea of the existing Esoteric Twist scheme, to stream half of the DDFs at the end of one stream-collide kernel and the remaining half at the beginning of the next, and offer the same beneficial properties over the AA-Pattern scheme: reduced memory bandwidth due to implicit bounce-back boundaries and the possibility of swapping pointers between even and odd time steps. However, the streaming directions are chosen so that the algorithm can be implemented in about one tenth the amount of code, as two simple loops, is compatible with all velocity sets, and is suitable for automatic code generation. The performance of the new streaming schemes is slightly higher than that of Esoteric Twist due to better memory coalescence. Benchmarks across a large variety of GPUs and CPUs show that for most dedicated GPUs, performance differs only insignificantly from the One-Step Pull scheme; however, for integrated GPUs and CPUs, performance is significantly improved. The two proposed algorithms greatly facilitate modifying existing code to in-place streaming, even with extensions already in place, as demonstrated here for the Free Surface LBM implementation FluidX3D. Their simplicity, together with their ideal performance characteristics, may enable more widespread adoption of in-place streaming across LBM GPU codes.
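
The idea of streaming half of the DDFs at the end of one step and the other half at the start of the next can be illustrated with a small sketch. What follows is a minimal, serial Python toy on a periodic 1D lattice with a D1Q3 velocity set, not the paper's OpenCL implementation. The slot assignment in `load` and `store`, the toy collision operator, the value of `OMEGA`, and all function names are assumptions made for illustration. The sketch checks that a single DDF array updated in place gives the same result as a conventional two-buffer one-step pull.

```python
# D1Q3 velocity set: index 0 is the rest direction, then one opposite pair (1: +x, 2: -x).
C = [0, 1, -1]
Q = len(C)
W = [2.0 / 3.0, 1.0 / 6.0, 1.0 / 6.0]  # D1Q3 lattice weights
OMEGA = 1.2  # hypothetical relaxation rate for the toy collision


def collide(f):
    """Toy BGK-like relaxation towards a rest equilibrium (conserves density only)."""
    rho = sum(f)
    return [fi - OMEGA * (fi - W[i] * rho) for i, fi in enumerate(f)]


def load(F, n, t):
    """Read the DDFs arriving at node n; pulls one half of each pair from the neighbour."""
    fh = [0.0] * Q
    fh[0] = F[n][0]
    for i in range(1, Q, 2):
        j = (n + C[i]) % len(F)  # periodic neighbour in direction i
        fh[i] = F[n][i if t % 2 else i + 1]
        fh[i + 1] = F[j][i + 1 if t % 2 else i]
    return fh


def store(F, n, fh, t):
    """Write back to exactly the slots load() read, so no other node's data is touched."""
    F[n][0] = fh[0]
    for i in range(1, Q, 2):
        j = (n + C[i]) % len(F)
        F[j][i + 1 if t % 2 else i] = fh[i]      # pushed to the neighbour
        F[n][i if t % 2 else i + 1] = fh[i + 1]  # kept at the node, slot swapped


def in_place_step(F, t):
    """One stream-collide step on a single DDF array."""
    for n in range(len(F)):
        store(F, n, collide(load(F, n, t)), t)


def reference_step(A):
    """Two-buffer one-step pull: stream from A, collide, write to a fresh buffer."""
    N = len(A)
    return [collide([A[(n - C[i]) % N][i] for i in range(Q)]) for n in range(N)]


def run_demo(N=8, steps=5):
    """Run both schemes from the same initial DDFs and return the streamed DDFs of each."""
    f0 = [[W[i] * (1.0 + 0.1 * n) + 0.01 * i for i in range(Q)] for n in range(N)]
    A = [row[:] for row in f0]
    F = [[0.0] * Q for _ in range(N)]
    for n in range(N):
        store(F, n, f0[n], 1)  # initialise in the layout that the load at t=0 expects
    for t in range(steps):
        A = reference_step(A)
        in_place_step(F, t)
    streamed_ref = [[A[(n - C[i]) % N][i] for i in range(Q)] for n in range(N)]
    streamed_in_place = [load(F, n, steps) for n in range(N)]
    return streamed_ref, streamed_in_place
```

Calling `run_demo()` returns the streamed DDFs from the two-buffer reference and from the in-place array, which should agree node by node. Because each node reads and writes the same set of slots, the loop over `n` in `in_place_step` could run in any order, which is the property that makes this kind of scheme safe to parallelize.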
