Proceedings Paper

TETRIS: Scalable and Efficient Neural Network Acceleration with 3D Memory

OPERATING SYSTEMS REVIEW (2017)

Journal

OPERATING SYSTEMS REVIEW

Volume 51, Issue 2, Pages 751-764

Publisher

ASSOC COMPUTING MACHINERY

DOI: 10.1145/3037697.3037702

Keywords

3D memory; neural networks; acceleration; dataflow scheduling; partitioning

Categories

Computer Science, Software Engineering

Funding

Stanford Pervasive Parallelism Lab
Stanford Platform Lab
NSF [SHF-1408911]
National Science Foundation, Directorate for Computer and Information Science and Engineering, Division of Computing and Communication Foundations [1408911]

Abstract

The high accuracy of deep neural networks (NNs) has led to the development of NN accelerators that improve performance by two orders of magnitude. However, scaling these accelerators for higher performance with increasingly larger NNs exacerbates the cost and energy overheads of their memory systems, including the on-chip SRAM buffers and the off-chip DRAM channels. This paper presents the hardware architecture and software scheduling and partitioning techniques for TETRIS, a scalable NN accelerator using 3D memory. First, we show that the high throughput and low energy characteristics of 3D memory allow us to rebalance the NN accelerator design, using more area for processing elements and less area for SRAM buffers. Second, we move portions of the NN computations close to the DRAM banks to decrease bandwidth pressure and increase performance and energy efficiency. Third, we show that despite the use of small SRAM buffers, the presence of 3D memory simplifies dataflow scheduling for NN computations. We present an analytical scheduling scheme that matches the efficiency of schedules derived through exhaustive search. Finally, we develop a hybrid partitioning scheme that parallelizes the NN computations over multiple accelerators. Overall, we show that TETRIS improves the performance by 4.1x and reduces the energy by 1.5x over NN accelerators with conventional, low-power DRAM memory systems.
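As a rough aid to reading the headline numbers, the sketch below turns the two ratios quoted in the abstract (4.1x performance, 1.5x energy) into quantities relative to a baseline normalized to 1.0. It assumes, which the abstract does not state, that both ratios refer to the same workload, so that runtime and energy can be combined into an average-power ratio and an energy-delay product. The function name `relative_metrics` and the dictionary keys are illustrative, not taken from the paper.

```python
# Headline ratios quoted in the abstract (TETRIS vs. accelerators with
# conventional, low-power DRAM memory systems).
SPEEDUP = 4.1           # performance improvement factor
ENERGY_REDUCTION = 1.5  # energy reduction factor


def relative_metrics(speedup: float, energy_reduction: float) -> dict:
    """Express the ratios relative to a baseline normalized to 1.0.

    Assumption (not stated in the abstract): both factors describe the
    same workload, so runtime and energy can be combined.
    """
    runtime = 1.0 / speedup           # time to finish the same work
    energy = 1.0 / energy_reduction   # energy spent on the same work
    return {
        "runtime": runtime,
        "energy": energy,
        "avg_power": energy / runtime,             # energy per unit time
        "energy_delay_product": energy * runtime,  # lower is better
    }


metrics = relative_metrics(SPEEDUP, ENERGY_REDUCTION)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}x of baseline")
```

Under that assumption, the numbers imply that the design finishes the same work in about a quarter of the time while drawing more average power than the baseline (4.1 / 1.5, roughly 2.7x), since the energy saving is smaller than the speedup.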
