4.6 Article

Acceleration of Bi-Objective Optimization of Data-Parallel Applications for Performance and Energy on Heterogeneous Hybrid Platforms

Journal

IEEE ACCESS
Volume 11, Issue -, Pages 27226-27245

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2023.3258684

Keywords

Heuristic algorithms; Optimization; Clustering algorithms; High performance computing; Graphics processing units; Energy consumption; Multicore processing; Energy management; High-performance heterogeneous computing; energy-efficient computing; bi-objective optimization; performance optimization; energy optimization; data-parallel applications; workload distribution

Ask authors/readers for more resources

Accelerating the bi-objective optimization of applications for performance and energy is crucial. This work highlights the challenges of accelerating model-based methods for bi-objective optimization of data parallel applications. The proposed algorithms provide significant speedups over state-of-the-art solutions.
Accelerating the bi-objective optimization of applications for performance and energy is crucial to achieving energy efficiency objectives and meeting quality-of-service requirements in modern high-performance computing platforms and cloud computing infrastructures. In this work, we highlight the crucial challenges to accelerate model-based methods proposed for the bi-objective optimization of data parallel applications for performance and energy that employ workload distribution between the executing processors as the decision variable. The methods solve unconstrained bi-objective optimization problems and take input, the processors' performance and energy profiles in the form of discrete functions of workload size, and output Pareto-optimal solutions (workload distributions), minimizing the execution time and the total energy consumption of computations during the parallel execution of the application. One of the challenges is the fast computation of Pareto-optimal solutions. We then formulate the bi-objective optimization problem of data-parallel applications for performance and energy through workload distribution on a cluster of p identical hybrid nodes, each containing h heterogeneous processors. The state-of-the-art algorithm for solving the problem is sequential and takes exorbitant execution times to find Pareto-optimal solutions for even moderate numbers of processors. We propose two algorithms that address this shortcoming. The first algorithm is an exact sequential algorithm that is more efficient and amenable to parallelization and achieves a complexity reduction of O(m x h) over the state-of-the-art sequential algorithm where m is the cardinality of the input discrete execution time and dynamic energy functions. The second algorithm is a parallel algorithm executed by q identical parallel processes that reduces the complexity of our proposed sequential algorithm by O(q). It, therefore, achieves a complexity reduction of O(m x h x q) over the state-of-the-art sequential algorithm. Finally, we experimentally analyze the practical efficacy of our proposed algorithms for two data-parallel applications, matrix multiplication and fast Fourier transform, on a heterogeneous hybrid node containing an Intel Haswell multicore CPU, an Nvidia k40c GPU, and an Nvidia P100 GPU and simulations of clusters of such hybrid nodes. The experiments demonstrate that our proposed algorithms provide tremendous speedups over state-of-the-art solutions.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.6
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available