Article

Sparse matrix multiplication: The distributed block-compressed sparse row library

Journal

PARALLEL COMPUTING
Volume 40, Issues 5-6, Pages 47-58

Publisher

ELSEVIER SCIENCE BV
DOI: 10.1016/j.parco.2014.03.012

Keywords

Sparse matrix; Parallel sparse matrix multiplication; Quantum chemistry

Funding

  1. Swiss University Conference through the High Performance and High Productivity Computing (HP2C) Programme
  2. European Union FP7 in the form of an ERC Starting Grant [277910]
  3. European Research Council (ERC) [277910] Funding Source: European Research Council (ERC)

Efficient parallel multiplication of sparse matrices is key to enabling many large-scale calculations. This article presents the DBCSR (Distributed Block Compressed Sparse Row) library for scalable sparse matrix-matrix multiplication and its use in the CP2K program for linear-scaling quantum-chemical calculations. The library combines several approaches to implement sparse matrix multiplication in a way that performs well and is demonstrably scalable. Parallel communication has well-defined limits: data volume decreases as O(1/√P) with increasing process count P, and every process communicates with at most O(√P) others. Local sparse matrix multiplication is handled efficiently using a combination of techniques: blocking elements together in an application-relevant way, an autotuning library for small matrix multiplications, cache-oblivious recursive multiplication, and multithreading. Additionally, on-the-fly filtering not only increases sparsity but also avoids performing calculations that fall below the filtering threshold. We demonstrate and analyze the performance of the DBCSR library and its various scaling behaviors. (C) 2014 Elsevier B.V. All rights reserved.
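
Two of the ideas named in the abstract, block-compressed sparse row storage and on-the-fly filtering, can be illustrated with a minimal single-process Python sketch. It is not DBCSR's API or implementation: the class `BlockCSR`, the function `multiply_filtered`, the threshold `eps`, and the choice of a Frobenius-norm product as the skip criterion are all assumptions made for illustration, and the distributed communication, autotuned small-matrix kernels, recursion, and multithreading are left out entirely.

```python
from math import sqrt


def block_matmul(a, b):
    """Dense product of two small blocks stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def block_add(a, b):
    """Element-wise sum of two blocks of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def frobenius(a):
    """Frobenius norm of a dense block."""
    return sqrt(sum(x * x for row in a for x in row))


class BlockCSR:
    """Hypothetical block-CSR container: row_ptr and col_idx index dense blocks."""

    def __init__(self, n_block_rows, row_ptr, col_idx, blocks):
        self.n_block_rows = n_block_rows
        self.row_ptr = row_ptr    # blocks of block row i: row_ptr[i] .. row_ptr[i+1]-1
        self.col_idx = col_idx    # block column of each stored block
        self.blocks = blocks      # the dense blocks themselves

    @classmethod
    def from_dict(cls, n_block_rows, entries):
        """Build from a mapping (block_row, block_col) -> dense block."""
        row_ptr, col_idx, blocks = [0], [], []
        for i in range(n_block_rows):
            for r, c in sorted(key for key in entries if key[0] == i):
                col_idx.append(c)
                blocks.append(entries[(r, c)])
            row_ptr.append(len(col_idx))
        return cls(n_block_rows, row_ptr, col_idx, blocks)

    def to_dict(self):
        """Return the stored blocks as a (block_row, block_col) -> block mapping."""
        out = {}
        for i in range(self.n_block_rows):
            for k in range(self.row_ptr[i], self.row_ptr[i + 1]):
                out[(i, self.col_idx[k])] = self.blocks[k]
        return out


def multiply_filtered(a, b, eps):
    """C = A * B on blocks, with an assumed norm-based filter.

    A block product is skipped when ||A_ik|| * ||B_kj|| < eps (an upper bound
    on the norm of the product), and result blocks with norm < eps are dropped.
    """
    a_norms = [frobenius(blk) for blk in a.blocks]
    b_norms = [frobenius(blk) for blk in b.blocks]
    result = {}
    for i in range(a.n_block_rows):
        for ka in range(a.row_ptr[i], a.row_ptr[i + 1]):
            k = a.col_idx[ka]
            for kb in range(b.row_ptr[k], b.row_ptr[k + 1]):
                if a_norms[ka] * b_norms[kb] < eps:
                    continue  # skip work that cannot reach the threshold
                key = (i, b.col_idx[kb])
                prod = block_matmul(a.blocks[ka], b.blocks[kb])
                result[key] = block_add(result[key], prod) if key in result else prod
    kept = {key: blk for key, blk in result.items() if frobenius(blk) >= eps}
    return BlockCSR.from_dict(a.n_block_rows, kept)


# Example data (made up): 2 x 2 block matrices with 2 x 2 blocks.
A = BlockCSR.from_dict(2, {
    (0, 0): [[1, 0], [0, 1]],
    (0, 1): [[1e-6, 0], [0, 1e-6]],
    (1, 1): [[2, 0], [0, 2]],
})
B = BlockCSR.from_dict(2, {
    (0, 0): [[1, 2], [3, 4]],
    (1, 0): [[1e-6, 0], [0, 0]],
    (1, 1): [[1, 0], [0, 1]],
})

c_filtered = multiply_filtered(A, B, 1e-5)
c_unfiltered = multiply_filtered(A, B, 0.0)
```

In this made-up example the unfiltered product stores four blocks, while the filtered product stores only the two diagonal blocks, because every product involving a near-zero block is skipped before it is computed. Bounding the product norm by the product of the block norms is one plausible way to decide what to skip without doing the multiplication first.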
