Article

Big optimization with genetic algorithms: Hadoop, Spark, and MPI

Journal

SOFT COMPUTING
Volume 27, Issue 16, Pages 11469-11484

Publisher

SPRINGER
DOI: 10.1007/s00500-023-08301-x

Keywords

Big optimization; Genetic algorithms; MapReduce; Hadoop; Spark; MPI


This article discusses the use of MapReduce as a computing paradigm to solve large-scale combinatorial optimization problems, focusing on the potential and advantages of developing genetic algorithms using Hadoop, Spark, and MPI as middleware platforms. The results show that MRGA performs better on the Hadoop framework compared to Spark and MPI when dealing with high-dimensional datasets.
Solving problems of high dimensionality (and complexity) usually requires intensive use of technologies such as parallelism, advanced computers, and new types of algorithms. MapReduce (MR) is a computing paradigm long established in computer science that has in recent years been proposed for big data applications, though it can also be used for many other tasks. In this article, we address big optimization: solving large instances of combinatorial optimization problems by using MR as the paradigm to design solvers that run transparently on a varying number of computers collaborating to find the solution. We study and analyze MR technology, focusing on Hadoop, Spark, and MPI as middleware platforms for developing genetic algorithms (GAs). From this, MRGA solvers arise that follow a programming paradigm different from the usual imperative transformational style. Our objective is to confirm the expected benefits of these systems, namely file, memory, and communication management, for the resulting algorithms. We analyze our MRGA solvers from relevant points of view such as scalability, speedup, and communication vs. computation time in big optimization. The results for high-dimensional datasets show that the MRGA over Hadoop outperforms the implementations in the Spark and MPI frameworks. For the smallest datasets, the execution of MRGA on MPI is always faster than that of the remaining MRGAs. Finally, the MRGA over Spark presents the lowest communication times. Numerical and timing insights are given to ease future comparisons of new algorithms across these three popular technologies.
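To make the MRGA idea concrete, the following is a minimal, sequential sketch of how a genetic algorithm can be cast into the MapReduce pattern the abstract describes: a map phase that evaluates fitness per individual (the step that Hadoop, Spark, or MPI would distribute across workers) and a reduce phase that performs selection and variation. This is not the authors' implementation; the OneMax toy fitness function and all parameter values are illustrative assumptions.

```python
# Sketch of a MapReduce-style genetic algorithm (MRGA). Assumption: the
# map phase scores individuals and the reduce phase produces the next
# generation, mirroring how the abstract splits GA work across MR phases.
import random

GENOME_LEN = 64   # bits per individual (illustrative)
POP_SIZE = 40     # population size (illustrative)

def map_phase(population):
    # Map: emit (individual, fitness) pairs. On Hadoop/Spark this loop
    # would be distributed; fitness here is OneMax (count of 1-bits).
    return [(ind, sum(ind)) for ind in population]

def reduce_phase(scored, pop_size):
    # Reduce: binary tournament selection, one-point crossover,
    # per-bit mutation with rate 1/GENOME_LEN.
    def select():
        a, b = random.sample(scored, 2)
        return a[0] if a[1] >= b[1] else b[0]
    children = []
    while len(children) < pop_size:
        p1, p2 = select(), select()
        cut = random.randrange(1, GENOME_LEN)
        child = p1[:cut] + p2[cut:]
        child = [bit ^ (random.random() < 1.0 / GENOME_LEN)
                 for bit in child]
        children.append(child)
    return children

def mrga(generations=50):
    population = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
                  for _ in range(POP_SIZE)]
    for _ in range(generations):
        scored = map_phase(population)          # distributed in real MRGA
        population = reduce_phase(scored, POP_SIZE)
    return max(sum(ind) for ind in population)  # best fitness found

if __name__ == "__main__":
    random.seed(0)
    print(mrga())
```

In a real deployment, `map_phase` is where the three middlewares differ most: Hadoop materializes intermediate pairs to HDFS, Spark keeps them in memory as RDD transformations, and MPI exchanges them via explicit messages, which is consistent with the communication/computation trade-offs the article measures.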

