Article

E-LSTM: An Efficient Hardware Architecture for Long Short-Term Memory

Publisher

IEEE (Institute of Electrical and Electronics Engineers, Inc.)
DOI: 10.1109/JETCAS.2019.2911739

Keywords

Hardware acceleration; long short-term memory (LSTM); model compression; recurrent neural network (RNN); deep learning; FPGA

Funding

  1. National Natural Science Foundation of China [61774082, 61604068]
  2. Fundamental Research Funds for the Central Universities [021014380065, 021014380087]

Abstract

Long Short-Term Memory (LSTM) and its variants have been widely adopted in many sequential learning tasks, such as speech recognition and machine translation. Significant accuracy improvements can be achieved with complex LSTM models, but at the cost of large memory requirements and high computational complexity, which makes them time-consuming and energy-demanding. The low-latency and energy-efficiency requirements of real-world applications make model compression and hardware acceleration for LSTM an urgent need. In this paper, several hardware-efficient network compression schemes are first introduced, including structured top-k pruning, clipped gating, and multiplication-free quantization, which reduce the model size and the number of matrix operations by 32x and 21.6x, respectively, with negligible accuracy loss. Furthermore, efficient hardware architectures for accelerating the compressed LSTM are proposed, supporting inference over multiple layers and multiple time steps. The computation process is judiciously reorganized and the memory access pattern is carefully optimized, alleviating the limited-memory-bandwidth bottleneck and enabling higher throughput. Moreover, the parallel processing strategy is designed to fully exploit the sparsity introduced by pruning and clipped gating while maintaining high hardware utilization efficiency. Implemented on an Intel Arria 10 SX660 FPGA running at 200 MHz, the proposed design achieves 1.4x-2.2x higher energy efficiency and requires significantly fewer hardware resources than state-of-the-art LSTM implementations.
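To make the compression pipeline in the abstract concrete, the following minimal Python sketch combines structured top-k pruning with power-of-two (multiplication-free) quantization. It is not the authors' implementation: the function names, group size, value of k, and exponent range are all illustrative assumptions chosen only to demonstrate the two ideas.

import numpy as np

def structured_topk_prune(W, group_size=16, k=4):
    """Keep the k largest-magnitude weights in every group of
    `group_size` consecutive weights along each row; zero the rest.
    The fixed k-per-group structure keeps the resulting sparsity
    regular, which is what makes it hardware-friendly."""
    rows, cols = W.shape
    assert cols % group_size == 0, "row length must divide into groups"
    Wp = W.reshape(rows, cols // group_size, group_size).copy()
    # Indices of the (group_size - k) smallest-magnitude entries per group.
    drop = np.argsort(np.abs(Wp), axis=-1)[..., : group_size - k]
    np.put_along_axis(Wp, drop, 0.0, axis=-1)
    return Wp.reshape(rows, cols)

def pow2_quantize(W, min_exp=-6, max_exp=0):
    """Round each nonzero weight to the nearest signed power of two,
    so w * x can be computed as a bit shift of x instead of a multiply."""
    sign = np.sign(W)
    mag = np.abs(W)
    # Avoid log2(0); zeros are masked back to zero at the end.
    exp = np.clip(np.round(np.log2(np.where(mag > 0, mag, 1.0))),
                  min_exp, max_exp)
    return np.where(mag > 0, sign * np.exp2(exp), 0.0)

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(8, 64))
Wc = pow2_quantize(structured_topk_prune(W))
print("kept fraction:", np.count_nonzero(Wc) / Wc.size)  # k/group_size = 0.25

Under these assumptions, the quantized weights let each multiply-accumulate be replaced by a shift-and-add, and the fixed number of survivors per group keeps sparse matrix-vector units load-balanced, which is one plausible way such a scheme maps efficiently onto an FPGA.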
