Journal
INTERNATIONAL JOURNAL OF QUANTUM CHEMISTRY
Volume 114, Issue 9, Pages 543-552
Publisher
WILEY
DOI: 10.1002/qua.24607
Keywords
electron repulsion integrals; graphics processing unit; density functional theory; parallel processing
The computation of electron repulsion integrals (ERIs) is the most time-consuming step in density functional calculations that use Gaussian basis sets. Many intermediate ERIs are calculated, and most are stored in slower storage, such as cache or memory, because of the shortage of registers, which are the fastest storage in a central processing unit (CPU). Moreover, heavy register usage makes it difficult to launch many concurrent threads on a graphics processing unit (GPU) to hide latency. Hence, we propose to optimize the calculation order of one-center ERIs to minimize the number of registers used, and to calculate each ERI with three or six cooperating threads. The performance of this method is measured on a recent CPU and a GPU. The proposed approach is found to be efficient for basis functions of high angular momentum on a GPU. When combined with a recent GPU, it accelerates the computation almost 4-fold. © 2014 Wiley Periodicals, Inc.