Proceedings Paper

StencilFlow: Mapping Large Stencil Programs to Distributed Spatial Computing Systems

Publisher

IEEE
DOI: 10.1109/CGO51591.2021.9370315

Funding

  1. European Research Council under the European Union's Horizon 2020 programme (grant agreement DAPP) [678880]
  2. Swiss National Science Foundation (Ambizione Project) [185778]
  3. Paderborn Center for Parallel Computing (PC2)

This study maps directed acyclic graphs of heterogeneous stencil computations to spatial computing systems, maximizing temporal locality and ensuring deadlock freedom. The generated architectures achieve the highest performance recorded for stencil programs on FPGAs to date, and the framework is then used to study a complex stencil program from a production weather simulation application.
Spatial computing devices have been shown to significantly accelerate stencil computations, but have so far relied on unrolling the iterative dimension of a single stencil operation to increase temporal locality. This work considers the general case of mapping directed acyclic graphs of heterogeneous stencil computations to spatial computing systems, assuming large input programs without an iterative component. StencilFlow maximizes temporal locality and ensures deadlock freedom in this setting, providing end-to-end analysis and mapping from a high-level program description to distributed hardware. We evaluate our generated architectures on a Stratix 10 FPGA testbed, yielding 1.31 TOp/s and 4.18 TOp/s in single-device and multi-device configurations, respectively, demonstrating the highest performance recorded for stencil programs on FPGAs to date. We then leverage the framework to study a complex stencil program from a production weather simulation application. Our work enables productively targeting distributed spatial computing systems with large stencil programs, and offers insight into architecture characteristics required for their efficient execution in practice.
