Journal
JOURNAL OF CHEMICAL THEORY AND COMPUTATION
Volume -, Issue -, Pages -
Publisher
AMER CHEMICAL SOC
DOI: 10.1021/acs.jctc.2c00274
Funding
- LANL LDRD-ER program
- U.S. Department of Energy through the Los Alamos National Laboratory
- Swedish national strategic e-science research program (eSSENCE)
- Computational Systems and Software Environments (CSSE) subprogram of LANL's ASC program (NNSA/DOE)
In this paper, density matrix perturbation theory is mapped onto the computational structure of a deep neural network, and time-independent quantum response calculations are performed using Tensor cores. The computational cost of each deep layer is dominated by tensor contractions in mixed-precision arithmetic, which run at close to peak performance. Quantum response calculations are demonstrated and analyzed with self-consistent charge density-functional tight-binding theory and coupled-perturbed Hartree-Fock theory. A novel parameter-free convergence criterion is presented for linear response calculations, suitable for numerically noisy low-precision floating-point operations. A peak performance of almost 200 Tflops is demonstrated using the Tensor cores of two Nvidia A100 GPUs.
Time-independent quantum response calculations are performed using Tensor cores. This is achieved by mapping density matrix perturbation theory onto the computational structure of a deep neural network. The computational cost of each deep layer is dominated by tensor contractions, i.e., dense matrix-matrix multiplications, in mixed-precision arithmetic, which run at close to peak performance. Quantum response calculations are demonstrated and analyzed using self-consistent charge density-functional tight-binding theory as well as coupled-perturbed Hartree-Fock theory. For linear response calculations, a novel parameter-free convergence criterion is presented that is well suited to numerically noisy low-precision floating-point operations, and we demonstrate a peak performance of almost 200 Tflops using the Tensor cores of two Nvidia A100 GPUs.
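To make the layered structure concrete, the following is a minimal sketch of first-order density matrix perturbation theory written as a sequence of matrix-multiplication "layers". It rests on assumptions the abstract does not spell out: a recursive second-order spectral projection (SP2) scheme as the layer update, plain double-precision NumPy in place of mixed-precision Tensor core arithmetic, exact spectral bounds from a diagonalization, and a simple fixed tolerance `tol` as the stopping test rather than the parameter-free criterion of the paper. The function name `sp2_response` and all parameter names are hypothetical.

```python
import numpy as np


def sp2_response(H0, H1, n_occ, max_layers=100, tol=1e-12):
    """Sketch: density matrix D0 and its first-order response D1.

    H0: symmetric unperturbed Hamiltonian, H1: symmetric perturbation,
    n_occ: number of occupied states. Illustrative only (float64 NumPy,
    fixed tolerance); not the mixed-precision scheme of the paper.
    """
    n = H0.shape[0]
    # Spectral bounds. Taken from an exact diagonalization for simplicity;
    # a cheap estimate (e.g. Gershgorin circles) would be used in practice.
    eigs = np.linalg.eigvalsh(H0)
    e_min, e_max = eigs[0], eigs[-1]

    # Input layer: map the spectrum of H0 into [0, 1], reversed so that the
    # lowest (occupied) states end up near 1. X1 is the derivative of X0
    # with respect to the perturbation strength.
    X0 = (e_max * np.eye(n) - H0) / (e_max - e_min)
    X1 = -H1 / (e_max - e_min)

    for _ in range(max_layers):
        # The cost of each layer is these dense matrix products.
        X0_sq = X0 @ X0
        X1_sq = X0 @ X1 + X1 @ X0  # first-order part of (X0 + s*X1)^2

        tr, tr_sq = np.trace(X0), np.trace(X0_sq)
        # Idempotency error trace(X0 - X0^2); stop once X0 is a projector.
        if abs(tr - tr_sq) < tol:
            break

        # Choose X^2 or 2X - X^2, whichever moves the trace toward n_occ.
        if abs(tr_sq - n_occ) < abs(2.0 * tr - tr_sq - n_occ):
            X0, X1 = X0_sq, X1_sq
        else:
            X0, X1 = 2.0 * X0 - X0_sq, 2.0 * X1 - X1_sq

    return X0, X1
```

Calling `D0, D1 = sp2_response(H0, H1, n_occ)` on a small gapped symmetric matrix should reproduce the density matrix and the sum-over-states linear response from a direct diagonalization. Each pass through the loop plays the role of one layer, with the choice between the two polynomials acting as the layer's activation.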