Proceedings Paper

Improving performance of GMRES by reducing communication and pipelining global collectives

Publisher

IEEE
DOI: 10.1109/IPDPSW.2017.65

Keywords

-

Funding

  1. U.S. Department of Energy Office of Science [DE-FG0213ER26137, DE-SC0010042]
  2. U.S. National Science Foundation [1339822]
  3. U.S. Department of Energy's National Nuclear Security Administration [DE-AC04-94AL85000]
  4. U.S. Department of Energy (DOE) [DE-SC0010042]
  5. NSF Directorate for Computer & Information Science & Engineering
  6. NSF Office of Advanced Cyberinfrastructure (OAC) [1339822]

Abstract

We compare the performance of pipelined and s-step GMRES, referred to as l-GMRES and s-GMRES respectively, on distributed-multicore CPUs. Compared to standard GMRES, s-GMRES requires fewer all-reduces, while l-GMRES overlaps the all-reduces with computation. To combine the best features of the two algorithms, we propose another variant, (l, t)-GMRES, that not only performs fewer global all-reduces than standard GMRES but also overlaps those all-reduces with other work. We implemented the thread parallelism and communication overlap in two different ways: the first uses nonblocking MPI collectives with thread-parallel computational kernels, while the second relies on a shared-memory task scheduler. In our experiments, (l, t)-GMRES performed better than l-GMRES by factors of up to 1.67x. In addition, even though we used only 50 nodes, once the latency cost became significant our variant performed up to 1.22x better than s-GMRES by hiding the all-reduces.
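The reduction-counting argument behind s-GMRES can be sketched in a toy, single-process NumPy model. This is an illustrative sketch under simplifying assumptions (monomial Krylov basis, Cholesky-QR block orthogonalization), not the paper's implementation: in a distributed run, each marked inner-product line would correspond to one MPI all-reduce.

```python
import numpy as np

def standard_arnoldi(A, v0, m):
    """Classical Gram-Schmidt Arnoldi. Each inner-product line below
    would be one global all-reduce in a distributed-memory run."""
    n = len(v0)
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = v0 / np.linalg.norm(v0)
    reductions = 0
    for j in range(m):
        w = A @ V[:, j]
        H[:j + 1, j] = V[:, :j + 1].T @ w   # all-reduce 1: projections
        reductions += 1
        w = w - V[:, :j + 1] @ H[:j + 1, j]
        H[j + 1, j] = np.linalg.norm(w)     # all-reduce 2: norm
        reductions += 1
        V[:, j + 1] = w / H[j + 1, j]
    return V, reductions

def s_step_basis(A, v0, s):
    """s-step variant: s matrix-vector products (no communication),
    then ONE block reduction (the Gram matrix) followed by a local
    Cholesky-QR that orthogonalizes the whole block at once."""
    n = len(v0)
    W = np.zeros((n, s + 1))
    W[:, 0] = v0 / np.linalg.norm(v0)
    for j in range(s):
        W[:, j + 1] = A @ W[:, j]           # SpMV only, no reduction
    G = W.T @ W                             # the single all-reduce
    R = np.linalg.cholesky(G).T             # G = R^T R, so W = Q R
    Q = W @ np.linalg.inv(R)
    return Q, 1
```

For m = s = 4 the standard loop performs 8 global reductions versus 1 for the block version, which is the latency saving s-GMRES exploits. The monomial basis used here is the simplest choice; as in real s-step methods, its conditioning limits how large s can be taken.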
