4.7 Article

A new AXT format for an efficient SpMV product using AVX-512 instructions and CUDA

Journal

ADVANCES IN ENGINEERING SOFTWARE
Volume 156

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.advengsoft.2021.102997

Keywords

Sparse Matrix Vector product; AVX-512 instructions; MKL Library; CUDA; cuSPARSE Library; Segmented Scan algorithm

Funding

  1. FEDER funds
  2. Xunta de Galicia [ED431C 2018/19, ED431F 2020/008]
  3. Spanish Ministry of Science and Technology [PID2019-104834GB-I00, TIN2016-76373-P]
  4. Project HPC-EUROPA3 [INFRAIA2016-1-730897]
  5. EC Research Innovation Action under the H2020 Programme
  6. Irish Centre for High-End Computing (ICHEC)

The paper introduces a new sparse matrix storage format, AXT, which improves SpMV performance on platforms with vector capabilities. Several AXT subvariants are optimised and compared on Intel and NVIDIA platforms, showing that AXT outperforms both AXC and CSR.
The Sparse Matrix-Vector (SpMV) product is a key operation in many scientific applications. This work proposes a new sparse matrix storage scheme, the AXT format, that improves SpMV performance on platforms with vector capabilities. AXT can be adapted to different platforms, improving storage efficiency for matrices with different sparsity patterns. Intel AVX-512 instructions and CUDA are used to optimise the performance of the four AXT subvariants. Performance is compared with the Compressed Sparse Row (CSR) and AXC formats on an Intel Xeon Gold 6148 processor and an NVIDIA Tesla V100 Graphics Processing Unit using 26 matrices. On the Intel platform, the overall AXT performance is 18% and 44.3% higher than that of AXC and CSR respectively, reaching speed-up factors of up to x7.33. On the NVIDIA platform, the AXT performance is 44% and 8% higher than that of AXC and CSR respectively, reaching speed-up factors of up to x378.5.
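
For readers unfamiliar with the baseline, the following is a minimal scalar sketch of the SpMV product y = A·x with A stored in CSR, the reference format named in the abstract. It is not the AXT format, whose layout is not described in this listing, and it uses no AVX-512 or CUDA. The function name `spmv_csr` and the array names `row_ptr`, `col_idx` and `values` are illustrative assumptions, not identifiers from the paper.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    row_ptr[i]..row_ptr[i+1] delimits the non-zeros of row i;
    col_idx[k] and values[k] give the column and value of non-zero k.
    (Names are hypothetical, chosen for this illustration only.)
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Accumulate the dot product of row i with x over its non-zeros.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Small 3x3 example matrix:
# [[1, 0, 2],
#  [0, 3, 0],
#  [4, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [1.0, 2.0, 3.0]

y = spmv_csr(row_ptr, col_idx, values, x)
```

The inner loop reads `x` through `col_idx`, so its memory accesses are irregular and row lengths vary. These are the properties that generally make CSR hard to vectorise and that motivate alternative storage formats.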
