Article

Design of Fully Spectral CNNs for Efficient FPGA-Based Acceleration

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TNNLS.2022.3224779

Keywords

Spectral analysis; Frequency-domain analysis; Field programmable gate arrays; Convolution; Convolutional neural networks; Pipelines; Resource management; Convolutional neural networks (CNNs); field programmable gate arrays (FPGAs); fully spectral convolution networks; hardware accelerator; spectral representations

Funding

  1. National Natural Science Foundation of China [62001165]
  2. Hunan Provincial Natural Science Foundation of China [2021JJ40357]
  3. Changsha Municipal Natural Science Foundation [kq2014079]
  4. Hunan Province College Students Research Learning and Innovative Experiment Project [S202110542114]

Computing convolutional layers in the frequency domain using FFT can reduce computational complexity, but the frequent transformations between spatial and frequency domains hinder low-latency inference. To address this, a fully spectral CNN is proposed, which eliminates the transformations using a novel spectral-domain adaptive ReLU layer. Additionally, a customized hardware architecture is proposed to accelerate the fully spectral CNN inference on FPGA, achieving improved throughput compared to state-of-the-art implementations.
Computing convolutional layers in the frequency domain using the fast Fourier transform (FFT) has been shown to reduce the computational complexity of convolutional neural networks (CNNs). The main challenge of this approach is the frequent, repeated transformation between the spatial and frequency domains, which is needed because the spectral domain lacks nonlinear functions; this overhead makes the benefit less attractive for low-latency inference, especially on embedded platforms. To overcome this drawback of existing FFT-based convolution, we propose a fully spectral CNN using a novel spectral-domain adaptive rectified linear unit (ReLU) layer, which completely removes the compute-intensive transformations between the spatial and frequency domains within the network. The proposed fully spectral CNNs maintain the nonlinearity of spatial CNNs while taking hardware efficiency into account. We then propose a deeply customized and compute-efficient hardware architecture to accelerate fully spectral CNN inference on a field programmable gate array (FPGA). Hardware optimizations, such as spectral-domain intralayer and interlayer pipeline techniques, are introduced to further improve throughput. To achieve a load-balanced pipeline, a design space exploration (DSE) framework is proposed to optimize the resource allocation between hardware modules according to the resource constraints. On an Intel Arria 10 SX160 FPGA, our optimized accelerator achieves a throughput of 204 Gop/s with 80% compute efficiency. Compared with state-of-the-art spatial and FFT-based implementations on the same device, our accelerator is 4x to 6.6x and 3.0x to 4.4x faster, respectively, while maintaining a similar level of accuracy across different benchmark datasets.
