Article

Low synchronization Gram-Schmidt and generalized minimal residual algorithms

Journal

Publisher

WILEY
DOI: 10.1002/nla.2343

Keywords

Graphics processing unit; Gram-Schmidt process; Krylov methods; massively parallel; scalable GMRES; WY factorization

Funding

  1. Exascale Computing Project [17-SC-20-SC]
  2. U.S. Department of Energy [DE-AC36-08GO28308, DE-AC52-07NA27344]
  3. NSF [1645514]
  4. Directorate for Computer & Information Science & Engineering
  5. Division of Computing and Communication Foundations [1645514] Funding Source: National Science Foundation


The Gram-Schmidt process uses orthogonal projection to construct the QR factorization of a matrix, and approximate projections of the form $P = I - Q T Q^T$ can improve orthogonality. New variants of modified Gram-Schmidt algorithms introduce a compact WY representation.
The Gram-Schmidt process uses orthogonal projection to construct the $A = QR$ factorization of a matrix. When $Q$ has linearly independent columns, the operator $P = I - Q (Q^T Q)^{-1} Q^T$ defines an orthogonal projection onto $Q^{\perp}$. In finite precision, $Q$ loses orthogonality as the factorization progresses. A family of approximate projections is derived with the form $P = I - Q T Q^T$, with correction matrix $T$. When $T = (Q^T Q)^{-1}$ and $T$ is triangular, it is postulated that the best achievable orthogonality is $O(\varepsilon)\kappa(A)$. We present new variants of modified (MGS) and classical Gram-Schmidt algorithms that require one global reduction step. An interesting form of the projector leads to a compact WY representation for MGS. In particular, the inverse compact WY MGS algorithm is equivalent to a lower triangular solve. Our main contribution is to introduce a backward normalization lag into the compact WY representation, resulting in an $O(\varepsilon)\kappa([r_0, AV_m])$ stable Generalized Minimal Residual Method (GMRES) algorithm that requires only one global reduce per iteration. Further improvements in performance are achieved by accelerating GMRES on GPUs.
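
The statement that inverse compact WY MGS is equivalent to a lower triangular solve can be illustrated with a small NumPy sketch. It assumes the correction matrix takes the form $T = (I + L)^{-1}$, where $L$ is the strictly lower triangular part of $Q^T Q$; this is the form obtained by expanding the product of the rank-one MGS projectors $(I - q_k q_k^T) \cdots (I - q_1 q_1^T)$. The function names `mgs_qr` and `inverse_compact_wy_mgs_qr` are hypothetical, and the sketch covers neither the normalization lag nor the fusing of the inner products into a single global reduction described in the abstract.

```python
import numpy as np


def mgs_qr(A):
    """Textbook modified Gram-Schmidt, used here only as a reference."""
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for k in range(n):
        w = A[:, k].copy()
        for j in range(k):
            # One inner product (one reduction) per previous column.
            R[j, k] = Q[:, j] @ w
            w -= R[j, k] * Q[:, j]
        R[k, k] = np.linalg.norm(w)
        Q[:, k] = w / R[k, k]
    return Q, R


def inverse_compact_wy_mgs_qr(A):
    """Sketch of MGS via the projector P = I - Q T Q^T with T = (I + L)^{-1}.

    Assumption: L is the strictly lower triangular part of Q^T Q, so the
    projection coefficients come from a lower triangular system.
    """
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    L = np.zeros((n, n))
    for k in range(n):
        a = A[:, k]
        if k == 0:
            w = a.copy()
        else:
            Qk = Q[:, :k]
            # New row of L: inner products of the latest column with earlier ones.
            L[k - 1, : k - 1] = Q[:, k - 1] @ Q[:, : k - 1]
            s = Qk.T @ a
            # Solve (I + L) r = Q^T a. The matrix is lower triangular; a
            # dedicated triangular solver would normally be used here.
            r = np.linalg.solve(np.eye(k) + L[:k, :k], s)
            R[:k, k] = r
            w = a - Qk @ r
        R[k, k] = np.linalg.norm(w)
        Q[:, k] = w / R[k, k]
    return Q, R


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((50, 8))
    Q_ref, R_ref = mgs_qr(A)
    Q_wy, R_wy = inverse_compact_wy_mgs_qr(A)
    print("max |Q_ref - Q_wy| =", np.abs(Q_ref - Q_wy).max())
    print("loss of orthogonality =", np.linalg.norm(np.eye(8) - Q_wy.T @ Q_wy))
```

Forward substitution on $(I + L) r = Q^T a$ gives $r_j = q_j^T (a - \sum_{i<j} r_i q_i)$, which is exactly the MGS recurrence, so in exact arithmetic both routines produce the same $Q$ and $R$. In this sketch the products `Qk.T @ a` and `Q[:, k - 1] @ Q[:, : k - 1]` are computed separately; it does not show how they would be merged into one reduction.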

