Article; Proceedings Paper

McDRAM: Low Latency and Energy-Efficient Matrix Computations in DRAM

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TCAD.2018.2857044

Keywords

Dynamic memory; memory architecture; neural networks (NNs); processing in memory

We propose a novel memory architecture for in-memory computation called McDRAM, in which DRAM dies are equipped with a large number of multiply-accumulate (MAC) units to perform matrix computation for neural networks. By exploiting high internal memory bandwidth and reducing off-chip memory accesses, McDRAM achieves both low-latency and energy-efficient computation. In our experiments, we obtained a chip layout based on state-of-the-art LPDDR4 memory, where McDRAM is equipped with 2048 MACs in a single chip package at a small area overhead (4.7%). Compared with the state-of-the-art accelerator, TPU, and the power-efficient GPU, Nvidia P4, McDRAM offers 9.5x and 14.4x speedup, respectively, when large-scale MLPs and RNNs run with a batch size of 1. McDRAM also delivers 2.1x and 3.7x better computational efficiency in TOPS/W than TPU and P4, respectively, at large batch sizes.
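The core operation the abstract describes, a bank of MAC units computing a neural-network layer at batch size 1, reduces to a matrix-vector product built from multiply-accumulate steps. As a minimal illustrative sketch (not the paper's implementation; the function name and serial loop are our own, standing in for the 2048 parallel MAC units):

```python
def mac_matvec(weights, x):
    """Compute y = W @ x via explicit multiply-accumulate (MAC) steps.

    In McDRAM each in-DRAM MAC unit would handle a slice of these
    dot products in parallel; here we model the arithmetic serially.
    """
    y = [0.0] * len(weights)
    for i, row in enumerate(weights):   # one output neuron per weight row
        acc = 0.0
        for w, v in zip(row, x):
            acc += w * v                # the MAC primitive: acc += w * v
        y[i] = acc
    return y

# Example: a 3x4 weight matrix applied to a length-4 activation vector
# (an MLP layer at batch size 1).
W = [[1, 0, 2, 0],
     [0, 1, 0, 3],
     [1, 1, 1, 1]]
x = [1, 2, 3, 4]
print(mac_matvec(W, x))  # [7.0, 14.0, 10.0]
```

Keeping such products on-die avoids streaming the weight matrix across the off-chip bus, which is the source of the latency and energy savings the abstract reports.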
