Proceedings Paper

Compiler Assisted Hybrid Implicit and Explicit GPU Memory Management under Unified Address Space

Publisher

Association for Computing Machinery (ACM)
DOI: 10.1145/3295500.3356141

Keywords

Unified Memory Management; GPU; Compiler Analysis; Runtime; Implicit and Explicit Data Transfer; Reuse Distance; OpenMP

Funding

  1. Exascale Computing Project [17-SC-20-SC]
  2. Office of Science of the U.S. Department of Energy [DE-AC05-00OR22725]

Abstract

To improve programmability and productivity, recent GPUs adopt a virtual memory address space shared with CPUs (e.g., NVIDIA's unified memory). Unified memory shifts the data management burden from programmers to system software and hardware, and enables GPUs to address datasets that exceed their memory capacity. Our experiments show that while the implicit data transfer of unified memory can improve data movement efficiency, page fault overhead and data thrashing can erase its benefits. In this paper, we propose several user-transparent unified memory management schemes to 1) achieve adaptive implicit and explicit data transfer and 2) prevent data thrashing. Unlike previous approaches, which mostly rely on the runtime and thus suffer from large overhead, we demonstrate the benefits of exploiting key information from compiler analyses, including data locality, access density, and target reuse distance, to accomplish our goal. We implement the proposed schemes to improve OpenMP GPU offloading performance. Our evaluation shows that our schemes significantly improve GPU performance and memory efficiency.

