4.6 Article

Deep Parallel Optimizations on an LASG/IAP Climate System Ocean Model and Its Large-Scale Parallelization

Journal

APPLIED SCIENCES-BASEL
Volume 13, Issue 4, Pages -

Publisher

MDPI
DOI: 10.3390/app13042690

Keywords

LICOM; meteorological model; parallel optimization

Ask authors/readers for more resources

This paper proposes a series of parallel optimizations on a high-resolution ocean model, achieving significant improvement in performance. The optimized model showed excellent performance in large-scale parallelization, achieving a speedup of more than two times compared with the original code. Additionally, the problem of jumpy wall time during the time integration process was tackled successfully.
This paper proposes a series of parallel optimizations on a high-resolution ocean model, the LASG/IAP Climate System Ocean Model (LICOM), which was independently developed by the Institute of Atmospheric Physics of the Chinese Academy of Sciences. The version of LICOM that we used was LICOM 2.1. In order to improve the parallel performance of LICOM, a series of parallel optimization methods were applied. We optimized the parallelization scheme to tackle the problem of load imbalance. Some communication optimizations were implemented, including data packing, the application of the least communication algorithm, and the replacement of communications with calculations. Furthermore, for the calculation procedures, we implemented some mature optimizations and expanded functions in a loop. Additionally, a hybrid of MPI and OpenMP, as well as an asynchronous parallel IO, was used. In this work, the optimized version of LICOM 2.1 was able to achieve a speedup of more than two times compared with the original code. The parallelization scheme optimization and the communication optimization produced considerable improvement in performance in the large-scale parallelization. Meanwhile, the newly optimized LICOM could scale up to 245,760 processor cores. However, for the original version, there was no speedup when scaled up to over 10,000 processor cores. Additionally, the problem of jumpy wall time during the time integration process was also tackled with this optimization. Finally, we conducted a practical simulation from 1993 to 2007 by using the optimized version of LICOM 2.1. The results showed that the mesoscale vortex was well simulated by the model.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.6
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available