3.8 Proceedings Paper

Stream-Dataflow Acceleration

出版社

ASSOC COMPUTING MACHINERY
DOI: 10.1145/3079856.3080255

关键词

Streaming; Dataflow; Architecture; Accelerator; Reconfigurable; CGRA; Programmable; Domain-Specific

资金

  1. National Science Foundation [CCF-1618234]

向作者/读者索取更多资源

Demand for low-power data processing hardware continues to rise inexorably. Existing programmable and general purpose solutions (eg. SIMD, GPGPUs) are insufficient, as evidenced by the order-of-magnitude improvements and industry adoption of application and domain-specific accelerators in important areas like machine learning, computer vision and big data. The stark tradeoffs between efficiency and generality at these two extremes poses a difficult question: how could domain-specific hardware efficiency be achieved without domain-specific hardware solutions? In this work, we rely on the insight that acceleratable algorithms have broad common properties: high computational intensity with long phases, simple control patterns and dependences, and simple streaming memory access and reuse patterns. We define a general architecture (a hardware-software interface) which can more efficiently expresses programs with these properties called stream-dataflow. The dataflow component of this architecture enables high concurrency, and the stream component enables communication and coordination at very-low power and area overhead. This paper explores the hardware and software implications, describes its detailed microarchitecture, and evaluates an implementation. Compared to a state-of-the-art domain specific accelerator (DianNao), and fixed-function accelerators for MachSuite, Softbrain can match their performance with only 2x power overhead on average.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

3.8
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据