Article

Accuracy and performance of the lattice Boltzmann method with 64-bit, 32-bit, and customized 16-bit number formats

Journal

PHYSICAL REVIEW E
Volume 106, Issue 1, Pages -

Publisher

AMER PHYSICAL SOC
DOI: 10.1103/PhysRevE.106.015308

Funding

  1. BZHPC
  2. LRZ
  3. CINECA
  4. JSC JURECA-DC Evaluation Platform
  5. Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) [SFB 1357-391977956]
  6. Deutsche Forschungsgemeinschaft [FOR 2688, B3 (417989940), B2 (417989464)]

This study evaluates whether the lattice Boltzmann method can store fluid populations in reduced-precision number formats (FP32 and customized 16-bit formats). It finds that the accuracy difference between FP64 and FP32 is negligible in almost all cases, and that 16-bit storage is sufficient in a large number of cases.
Fluid dynamics simulations with the lattice Boltzmann method (LBM) are very memory intensive. Alongside a reduction in memory footprint, significant performance benefits can be achieved by using FP32 (single) precision instead of FP64 (double) precision, especially on GPUs. Here we evaluate the possibility of using even 16-bit (half) precision, in the form of FP16 and posit16, for storing fluid populations, while still carrying out arithmetic operations in FP32. For this, we first show that the commonly occurring number range in the LBM is much smaller than the FP16 number range. Based on this observation, we develop customized 16-bit formats, based on a modified IEEE-754 and on a modified posit standard, that are specifically tailored to the needs of the LBM. We then carry out an in-depth characterization of LBM accuracy for six different test systems of increasing complexity: Poiseuille flow, Taylor-Green vortices, Kármán vortex streets, lid-driven cavity, a microcapsule in shear flow (utilizing the immersed-boundary method), and, finally, the impact of a raindrop (based on a volume-of-fluid approach). We find that the difference in accuracy between FP64 and FP32 is negligible in almost all cases, and that for a large number of cases even 16-bit is sufficient. Finally, we provide a detailed performance analysis of all precision levels on a large number of hardware microarchitectures and show that a significant speedup is achieved with mixed FP32/16-bit.
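The idea of storing populations in 16 bits while doing arithmetic at higher precision can be illustrated with a minimal Python sketch. It is not the paper's implementation: the helper names, the D2Q9 rest-state weights used as sample populations, and the power-of-two scale factor `SCALE` are all assumptions chosen for illustration, and Python floats (FP64) stand in for the FP32 arithmetic described in the abstract. The scaling step only hints at why a format tailored to a narrower number range might help; the customized formats themselves are not reproduced here.

```python
import struct

# Hypothetical scale factor: shifts small values out of the FP16 subnormal
# range before rounding. An illustrative choice, not taken from the abstract.
SCALE = 2.0 ** 15


def fp16_encode(x, scale=1.0):
    """Round scale*x to IEEE-754 binary16 and return its 16-bit pattern."""
    return struct.unpack("<H", struct.pack("<e", x * scale))[0]


def fp16_decode(bits, scale=1.0):
    """Expand a 16-bit pattern to a Python float and undo the scaling."""
    return struct.unpack("<e", struct.pack("<H", bits))[0] / scale


def relative_error(x, scale=1.0):
    """Relative error introduced by one store/load round trip of x."""
    return abs(fp16_decode(fp16_encode(x, scale), scale) - x) / abs(x)


# Illustrative populations: D2Q9 lattice weights (fluid at rest, density 1).
WEIGHTS = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4


def density_from_storage(stored):
    """Load 16-bit populations and sum them at higher precision."""
    return sum(fp16_decode(bits) for bits in stored)


if __name__ == "__main__":
    stored = [fp16_encode(w) for w in WEIGHTS]  # 2 bytes per population
    print("density from 16-bit storage:", density_from_storage(stored))
    print("round-trip error, plain FP16 :", relative_error(1e-6))
    print("round-trip error, scaled FP16:", relative_error(1e-6, SCALE))
```

In this sketch only the stored bit patterns are 16-bit; every sum and difference happens after decoding. Scaling by `SCALE` lowers the largest storable magnitude to roughly 2, so it only makes sense when the stored values are known to stay within a narrow range.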
