4.7 Article

Parallel computing efficiency of SWAN 40.91

Journal

GEOSCIENTIFIC MODEL DEVELOPMENT
Volume 14, Issue 7, Pages 4241-4247

Publisher

COPERNICUS GESELLSCHAFT MBH
DOI: 10.5194/gmd-14-4241-2021

Funding

  1. National Research Foundation of South Africa [116359]

Effective and accurate ocean and coastal wave predictions are crucial for engineering, safety, and recreational purposes. The study found that a computational node configuration of six threads/cores produced the most effective computational set-up for 1-week wave hindcasts. Further research is needed to understand the relationship between computational domain size and optimal parallel computational threads/cores for efficient simulations.
Effective and accurate ocean and coastal wave predictions are necessary for engineering, safety and recreational purposes. Refining predictive capabilities is increasingly critical to reduce the uncertainties associated with a changing global wave climatology. Simulating WAves in the Nearshore (SWAN) is a widely used spectral wave modelling tool employed by coastal engineers and scientists, including for operational wave forecasting. Forecasts and hindcasts can span hours to decades, and a detailed understanding of computational efficiency is required to design optimized operational protocols and hindcast scenarios. To date, there is limited knowledge of the relationship between the size of a SWAN computational domain and the optimal number of parallel computational threads/cores required to execute a simulation effectively. To test scalability, a hindcast cluster of 28 computational threads/cores (1 node) was used to determine the computational efficiency of a SWAN model configuration for southern Africa. The model extent and resolution emulate the current operational wave forecasting configuration developed by the South African Weather Service (SAWS). We implemented and compared both the OpenMP (shared-memory) and the Message Passing Interface (MPI, distributed-memory) architectures. Three sequential simulations (corresponding to typical grid cell numbers) were compared to various permutations of parallel computations using the speed-up ratio, time-saving ratio and efficiency tests. Generally, a computational node configuration of six threads/cores produced the most effective computational set-up based on wave hindcasts of 1-week duration. The use of more than 20 threads/cores resulted in a decrease in speed-up ratio for the smallest computational domain, owing to the increased sub-domain communication times for limited domain sizes.
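
The abstract names three scalability measures without defining them. The sketch below assumes the conventional definitions, which may differ in detail from those used in the paper: speed-up ratio S_p = T_1 / T_p, time-saving ratio (T_1 - T_p) / T_1, and efficiency E_p = S_p / p, where T_1 is the sequential wall-clock time and T_p the wall-clock time on p threads/cores. The function name, the class name and all timings are hypothetical and only illustrate the arithmetic; they are not results from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingMetrics:
    cores: int
    speed_up: float      # S_p = T_1 / T_p
    time_saving: float   # (T_1 - T_p) / T_1
    efficiency: float    # E_p = S_p / p


def scaling_metrics(t_serial: float, t_parallel: float, cores: int) -> ScalingMetrics:
    """Compute the three ratios from wall-clock times (assumed definitions)."""
    if t_serial <= 0 or t_parallel <= 0 or cores < 1:
        raise ValueError("times must be positive and cores must be >= 1")
    speed_up = t_serial / t_parallel
    return ScalingMetrics(
        cores=cores,
        speed_up=speed_up,
        time_saving=(t_serial - t_parallel) / t_serial,
        efficiency=speed_up / cores,
    )


# Hypothetical wall-clock times in seconds, keyed by thread/core count.
# These numbers are invented for illustration, not taken from the paper.
timings = {1: 1200.0, 6: 240.0, 12: 150.0, 24: 120.0}

t1 = timings[1]
results = [scaling_metrics(t1, t, p) for p, t in sorted(timings.items())]

for m in results:
    print(f"p={m.cores:2d}  speed-up={m.speed_up:5.2f}  "
          f"time saved={m.time_saving:6.1%}  efficiency={m.efficiency:5.2f}")
```

With timings of this shape, the speed-up keeps growing with p while the efficiency falls, which is the kind of trade-off the efficiency test is meant to expose when choosing a thread/core count.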