4.5 Article

Parallel QR Factorization of Block Low-rank Matrices

期刊

出版社

ASSOC COMPUTING MACHINERY
DOI: 10.1145/3538647

关键词

Block low-rank matrix; QR factorization; householder reflections; task-based execution

资金

  1. JSPS KAKENHI [JP20K20624, JP21H03447]
  2. Joint Usage/Research Center for Interdisciplinary Large-scale Information Infrastructures
  3. High Performance Computing Infrastructure in Japan [jh210024-NAHI]

向作者/读者索取更多资源

The study introduces two new algorithms for Householder QR factorization of Block Low-Rank (BLR) matrices, achieving arithmetic complexities of O(mn) and O(mn(1.5)) respectively. The algorithms are compared in terms of parallel task-based execution and performance, showing significant speedups compared to dense QR methods. The study also demonstrates robustness to ill conditioning and better orthogonal factors compared to existing methods.
We present two new algorithms for Householder QR factorization of Block Low-Rank (BLR) matrices: one that performs block-column-wise QR and another that is based on tiled QR. We show how the block-column-wise algorithm exploits BLR structure to achieve arithmetic complexity of O(mn), while the tiled BLR-QR exhibits O(mn(1.5)) complexity. However, the tiled BLR-QR has finer task granularity that allows parallel task-based execution on shared memory systems. We compare the block-column-wise BLR-QR using fork-join parallelism with tiled BLR-QR using task-based parallelism. We also compare these two implementations of Householder BLR-QR with a block-column-wise Modified Gram-Schmidt (MGS) BLR-QR using fork-join parallelism and a state-of-the-art vendor-optimized dense Householder QR in Intel MKL. For a matrix of size 131k x 65k, all BLR methods are more than an order of magnitude faster than the dense QR in MKL. Our methods are also robust to ill conditioning and produce better orthogonal factors than the existing MGS-based method. On a CPU with 64 cores, our parallel tiled Householder and block-column-wise Householder algorithms show a speedup of 50 and 37 times, respectively.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.5
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据