Proceedings Paper

Bandwidth-Aware Loop Tiling for DMA-Supported Scratchpad Memory

Publisher

Association for Computing Machinery
DOI: 10.1145/3410463.3414637

Keywords

Loop Tiling; DMA; Scratchpad Memory; Compiler

Funding

  1. National Key Research and Development Program of China [2017YFB0202002]
  2. Strategic Priority Research Program of Chinese Academy of Sciences [XDC05030101]
  3. National Natural Science Foundation of China [61802368, 61521092, 61432016, 61432018, 61332009, 61702485, 61872043]
  4. CCF-Tencent Open Research Fund
  5. Australian Research Council [DP170103956, DP180104069]

Abstract

Scratchpad memory (SPM) is widely used in emerging domain-specific architectures and accelerators to improve energy efficiency and time predictability. SPM-based architectures typically use direct memory access (DMA) to fetch data from off-chip memory, and global load instructions to load fine-grained data directly into registers. For such architectures, neither capacity-only nor bandwidth-only loop tiling uses both the bandwidth and the SPM efficiently. This paper introduces a bandwidth-aware loop tiling approach that trades off SPM space utilization against bandwidth utilization by leveraging a runtime tiling framework and a cross-host-kernel inter-procedural analysis (IPA). Experimental results show that our approach achieves a performance improvement of up to 4x, with a geometric mean of 26%.
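
To make the capacity-versus-bandwidth tradeoff concrete, the following is a minimal sketch of tile selection for a tiled matrix multiplication. It is not the paper's algorithm: the cost model (a fixed per-row DMA startup cost plus bytes divided by peak bandwidth), the candidate tile sizes, and all names such as `dma_time`, `footprint`, `transfer_cost`, and `pick_tile` are assumptions made for illustration only.

```python
from itertools import product

ELEM_BYTES = 4  # assumed element size (32-bit floats)


def dma_time(rows, cols, startup, bandwidth):
    """Estimated time to move a rows x cols block by DMA.

    Assumption: each row is one contiguous burst that pays a fixed
    startup cost, so short rows waste bandwidth.
    """
    return rows * (startup + cols * ELEM_BYTES / bandwidth)


def footprint(tm, tn, tk):
    """SPM bytes needed to hold one tile each of A (tm x tk), B (tk x tn), C (tm x tn)."""
    return ELEM_BYTES * (tm * tk + tk * tn + tm * tn)


def transfer_cost(n, tm, tn, tk, startup, bandwidth):
    """Estimated total DMA time for an n x n x n tiled matrix multiply.

    The C tile stays in SPM across the k loop, so it is loaded and
    stored once per (i, j) tile; A and B tiles are fetched per k step.
    """
    tiles_ij = (n // tm) * (n // tn)
    tiles_k = n // tk
    per_k = dma_time(tm, tk, startup, bandwidth) + dma_time(tk, tn, startup, bandwidth)
    c_traffic = 2 * dma_time(tm, tn, startup, bandwidth)
    return tiles_ij * (tiles_k * per_k + c_traffic)


def pick_tile(n, spm_bytes, startup, bandwidth):
    """Pick the candidate tile that fits in SPM and minimizes estimated DMA time."""
    sizes = [s for s in (8, 16, 32, 64, 128, 256) if n % s == 0]
    fits = [t for t in product(sizes, repeat=3) if footprint(*t) <= spm_bytes]
    if not fits:
        raise ValueError("no candidate tile fits in the scratchpad")
    return min(fits, key=lambda t: transfer_cost(n, *t, startup, bandwidth))


if __name__ == "__main__":
    # Hypothetical machine: 64 KiB SPM, startup of 100 time units per burst,
    # peak bandwidth of 16 bytes per time unit.
    best = pick_tile(512, 64 * 1024, startup=100, bandwidth=16)
    print("chosen (tm, tn, tk):", best)
    print("SPM bytes used:", footprint(*best))
```

Under this toy model, a tiler that considers only capacity would take the largest tile that fits, while one that considers only bandwidth would favor long contiguous rows regardless of reuse. Searching over both constraints at once is the kind of tradeoff the abstract describes, although the paper itself makes the decision at runtime with help from inter-procedural analysis rather than by exhaustive enumeration.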
