Article

A new diagonal storage for efficient implementation of sparse matrix-vector multiplication on graphics processing unit

Related references

Note: Only some of the references are listed.
Article Computer Science, Theory & Methods

A thread-adaptive sparse approximate inverse preconditioning algorithm on multi-GPUs

Jiaquan Gao et al.

Summary: This study introduces GSPAI-Adaptive, an efficient sparse approximate inverse preconditioning algorithm for multiple GPUs. It presents a thread-adaptive allocation strategy for constructing the preconditioner and computes each component of the preconditioner in parallel within a GPU thread group. In the experimental results, it shows advantages over popular preconditioning algorithms and a recent parallel sparse approximate inverse preconditioning algorithm.

PARALLEL COMPUTING (2021)

Article Computer Science, Software Engineering

An efficient sparse approximate inverse preconditioning algorithm on GPU

Guixia He et al.

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE (2020)

Article Computer Science, Software Engineering

Efficient dense matrix-vector multiplication on GPU

Guixia He et al.

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE (2018)

Article Computer Science, Theory & Methods

Incomplete Sparse Approximate Inverses for Parallel Preconditioning

Hartwig Anzt et al.

PARALLEL COMPUTING (2018)

Article Computer Science, Software Engineering

A novel multi-graphics processing unit parallel optimization framework for the sparse matrix-vector multiplication

Jiaquan Gao et al.

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE (2017)

Article Computer Science, Theory & Methods

Adaptive Optimization l1-Minimization Solvers on GPU

Jiaquan Gao et al.

INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING (2017)

Article Mathematics, Applied

GPU-accelerated preconditioned GMRES method for two-dimensional Maxwell's equations

Jiaquan Gao et al.

INTERNATIONAL JOURNAL OF COMPUTER MATHEMATICS (2017)

Article Computer Science, Theory & Methods

A multi-GPU parallel optimization model for the preconditioned conjugate gradient algorithm

Jiaquan Gao et al.

PARALLEL COMPUTING (2017)

Article Computer Science, Software Engineering

Sparse Matrix-Vector Multiplication on GPGPUs

Salvatore Filippone et al.

ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE (2017)

Article Computer Science, Theory & Methods

Implementation and Tuning of Batched Cholesky Factorization and Solve for NVIDIA GPUs

Jakub Kurzak et al.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS (2016)

Article Computer Science, Theory & Methods

Performance Analysis and Optimization for SpMV on GPU Using Probabilistic Modeling

Kenli Li et al.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS (2015)

Article Mathematics, Applied

AmgX: A Library for GPU Accelerated Algebraic Multigrid and Preconditioned Iterative Methods

M. Naumov et al.

SIAM JOURNAL ON SCIENTIFIC COMPUTING (2015)

Article Computer Science, Theory & Methods

A Performance Modeling and Optimization Analysis Tool for Sparse Matrix-Vector Multiplication on GPUs

Ping Guo et al.

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS (2014)

Article Computer Science, Theory & Methods

Research on the conjugate gradient algorithm with a modified incomplete Cholesky preconditioner on GPU

Jiaquan Gao et al.

JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING (2014)

Article Computer Science, Software Engineering

The University of Florida Sparse Matrix Collection

Timothy A. Davis et al.

ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE (2011)