Article

C-DNN: An Energy-Efficient Complementary Deep-Neural-Network Processor With Heterogeneous CNN/SNN Core Architecture

Journal

IEEE Journal of Solid-State Circuits

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/JSSC.2023.3330483

Keywords

Application-specific integrated circuit (ASIC); complementary deep neural network (C-DNN); convolutional neural network (CNN); deep learning; deep neural network; spiking neural network (SNN)

Abstract

The article proposes a complementary deep-neural-network processor that combines a CNN and an SNN and supports both inference and training. The processor achieves high energy efficiency by dividing each workload between the two network domains, integrating dedicated modules (a workload allocator, an attention module, and delta-weight units) to operate at the energy-optimal point. It achieves state-of-the-art energy efficiency on image classification tasks.
In this article, we propose a complementary deep-neural-network (C-DNN) processor that combines a convolutional neural network (CNN) and a spiking neural network (SNN) to take advantage of both. The C-DNN processor supports both complementary inference and training with a heterogeneous CNN-SNN core architecture. In addition, it is the first DNN accelerator application-specific integrated circuit (ASIC) that supports CNN-SNN workload division by exploiting their magnitude-energy tradeoff. The C-DNN processor integrates a CNN-SNN workload allocator and an attention module to find the more energy-efficient network domain for each workload in the DNN, enabling the processor to operate at its energy-optimal point. Moreover, the SNN processing element (PE) array with a distributed L1 cache reduces redundant memory accesses for SNN processing by 42.2%-49.1%. For highly energy-efficient DNN training, the C-DNN processor integrates a global counter and a local delta-weight (LDW) unit to eliminate power-consuming counters for forward delta-weight generation. Furthermore, forward delta-weight-based sparsity generation (FDWSG) is proposed to reduce the number of operations for training by 31%-79%. The C-DNN processor achieves an energy efficiency of 85.8 and 79.9 TOPS/W for inference with CIFAR-10 and CIFAR-100, respectively (VGG-16). Moreover, it achieves ImageNet classification with a state-of-the-art energy efficiency of 24.5 TOPS/W (ResNet-50). For training, the C-DNN processor achieves state-of-the-art energy efficiencies of 84.5 and 17.2 TOPS/W for CIFAR-10 and ImageNet, respectively. Furthermore, it achieves 77.1% accuracy for ImageNet training with ResNet-50.
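To make the magnitude-energy tradeoff behind the workload division concrete, here is a minimal sketch, not the paper's actual allocator: it assumes a rate-coded SNN whose energy scales with the number of spikes (and hence with activation magnitude), while CNN energy is roughly fixed per MAC. All names and constants (E_MAC_CNN, E_SPIKE_SNN, TIMESTEPS) and the tile-level routing granularity are illustrative assumptions.

```python
import numpy as np

# Illustrative energy constants (arbitrary units) -- assumptions, not
# silicon measurements. CNN cost is magnitude-independent (one MAC per
# connection), while SNN cost scales with the spike count, i.e. with
# activation magnitude: this is the magnitude-energy tradeoff.
E_MAC_CNN = 1.0    # assumed energy per CNN MAC
E_SPIKE_SNN = 0.5  # assumed energy per spike-driven accumulation
TIMESTEPS = 4      # assumed SNN rate-coding window

def estimate_energies(act_tile, fan_out):
    """Rough CNN vs. SNN energy for one activation tile.

    CNN: every activation costs one MAC per output connection.
    SNN: an activation of magnitude m (in [0, 1]) emits roughly
    m * TIMESTEPS spikes, so sparse / low-magnitude tiles are cheap.
    """
    e_cnn = act_tile.size * fan_out * E_MAC_CNN
    spikes = np.clip(np.round(act_tile * TIMESTEPS), 0, TIMESTEPS).sum()
    e_snn = spikes * fan_out * E_SPIKE_SNN
    return e_cnn, e_snn

def allocate(act_tile, fan_out=9):
    """Toy stand-in for the CNN-SNN workload allocator: route the
    tile to whichever domain is estimated to be cheaper."""
    e_cnn, e_snn = estimate_energies(act_tile, fan_out)
    return "SNN" if e_snn < e_cnn else "CNN"

rng = np.random.default_rng(0)
sparse_tile = rng.random((8, 8)) * (rng.random((8, 8)) < 0.1)  # ~90% zeros
dense_tile = 0.5 + 0.5 * rng.random((8, 8))                    # high magnitude
print(allocate(sparse_tile))  # expected: SNN (few spikes to process)
print(allocate(dense_tile))   # expected: CNN (spiking would cost more)
```

In the actual design this decision is made in hardware by the workload allocator with help from the attention module; the sketch only illustrates why low-magnitude, spike-sparse workloads favor the SNN domain while dense, high-magnitude workloads favor the CNN domain.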
