Proceedings Paper

Bandwidth-Aware Loop Tiling for DMA-Supported Scratchpad Memory

Publisher

ASSOC COMPUTING MACHINERY
DOI: 10.1145/3410463.3414637

Keywords

Loop Tiling; DMA; Scratchpad Memory; Compiler

Funding

  1. National Key Research and Development Program of China [2017YFB0202002]
  2. Strategic Priority Research Program of Chinese Academy of Sciences [XDC05030101]
  3. National Natural Science Foundation of China [61802368, 61521092, 61432016, 61432018, 61332009, 61702485, 61872043]
  4. CCF-Tencent Open Research Fund
  5. Australian Research Council [DP170103956, DP180104069]

Scratchpad memory (SPM) is widely used in emerging domain-specific architectures and accelerators to improve energy efficiency and time predictability. Typically, SPM-based architectures use DMA to fetch data from off-chip memory and global load instructions to load fine-grained data directly into registers. For such architectures, neither capacity-only nor bandwidth-only loop tiling uses both the bandwidth and the SPM efficiently. This paper introduces a bandwidth-aware loop tiling approach that trades off SPM space utilization against bandwidth utilization by leveraging a runtime tiling framework and a cross-host-kernel interprocedural analysis (IPA). Experimental results show that the approach achieves performance improvements of up to 4x, with a geometric mean of 26%.
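To make the capacity-versus-bandwidth tension concrete, the following is a minimal sketch of a tile-size search for a tiled matrix multiply, written under stated assumptions and not taken from the paper. It assumes a toy DMA cost model (a fixed startup cost per contiguous chunk plus bytes divided by peak bandwidth) and an SPM that must hold one tile each of A, B, and C. All function names and parameters (`dma_time`, `matmul_dma_cost`, `fits`, `best_tile`, `startup_s`, `peak`) are hypothetical and chosen for illustration only.

```python
from itertools import product


def dma_time(num_chunks, chunk_bytes, startup_s, peak):
    """Toy model: each contiguous chunk pays a startup cost plus transfer time."""
    return num_chunks * (startup_s + chunk_bytes / peak)


def matmul_dma_cost(M, N, K, tm, tn, tk, elem, startup_s, peak):
    """Estimated DMA time to fetch A and B tiles for C[M,N] += A[M,K] @ B[K,N].

    Assumes row-major matrices, so a (tm x tk) tile of A is tm chunks of
    tk elements, and a (tk x tn) tile of B is tk chunks of tn elements.
    """
    num_tiles = (M // tm) * (N // tn) * (K // tk)
    a_tile = dma_time(tm, tk * elem, startup_s, peak)
    b_tile = dma_time(tk, tn * elem, startup_s, peak)
    return num_tiles * (a_tile + b_tile)


def fits(tm, tn, tk, elem, spm_bytes):
    """Capacity constraint: one tile each of A, B, and C must fit in the SPM."""
    return (tm * tk + tk * tn + tm * tn) * elem <= spm_bytes


def best_tile(M, N, K, elem, spm_bytes, startup_s, peak):
    """Exhaustively search power-of-two tile shapes for the lowest modeled cost."""
    sizes = [2 ** i for i in range(3, 11)]  # candidate edge lengths 8..1024
    best = None
    for tm, tn, tk in product(sizes, repeat=3):
        if M % tm or N % tn or K % tk:
            continue  # keep the sketch simple: only evenly dividing tiles
        if not fits(tm, tn, tk, elem, spm_bytes):
            continue
        cost = matmul_dma_cost(M, N, K, tm, tn, tk, elem, startup_s, peak)
        if best is None or cost < best[0]:
            best = (cost, (tm, tn, tk))
    return best


if __name__ == "__main__":
    # Assumed parameters: 64 KiB SPM, 4-byte elements, 1 us DMA startup, 10 GB/s peak.
    cost, tile = best_tile(1024, 1024, 1024, 4, 64 * 1024, 1e-6, 1e10)
    print("chosen tile (tm, tn, tk):", tile, "modeled DMA time (s):", cost)
```

A capacity-only heuristic would simply pick the largest cube-shaped tile that fits, while a bandwidth-only heuristic would maximize the contiguous chunk length regardless of reuse. The search above scores every feasible shape with both effects in the same model, which is the kind of tradeoff the abstract describes; the paper's actual runtime framework and analysis are not reproduced here.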
