3.8 Proceedings Paper

Combining Weight Pruning and Knowledge Distillation For CNN Compression

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/CVPRW53098.2021.00356

Keywords

-


Model compression is crucial for deep neural networks, yet popular methods such as weight pruning are not well suited to complex networks like ResNets.
Complex deep convolutional neural networks such as ResNet require expensive hardware, such as powerful GPUs, to achieve real-time performance. This is critical for applications that run on low-end embedded GPU or CPU systems with limited resources, so model compression for deep neural networks has become an important research topic. Popular compression methods such as weight pruning remove redundant neurons from a CNN without affecting its output accuracy. While these pruning methods work well on simple networks such as VGG or AlexNet, they are not suitable for compressing current state-of-the-art networks such as ResNets, whose architectures contain dimensionality dependencies between layers. Because of these dependencies, filter pruning breaks the structure of a ResNet and leaves the network untrainable. In this paper, we first apply weight pruning only to a selected subset of layers in the ResNet architecture to avoid breaking the network structure. Second, we introduce a knowledge distillation architecture and a loss function to compress the layers left untouched by pruning. We evaluate our method on both image-based regression and classification networks, for head-pose estimation and image classification respectively. Our compression method reduces model size significantly while keeping accuracy close to that of the baseline model.
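The abstract combines two standard ingredients: structured filter pruning restricted to layers whose output widths are not tied to skip connections, and teacher-student knowledge distillation for the remaining layers. Below is a minimal PyTorch sketch of these two ingredients, assuming torchvision's ResNet-18, L1-norm structured pruning from torch.nn.utils.prune, and a common Hinton-style distillation loss with temperature T; the paper's exact layer-selection rule, pruning criterion, and loss function are not reproduced here, and the helper names are hypothetical.

```python
# Minimal sketch (assumptions, not the authors' implementation): prune only
# 3x3 convolutions inside residual blocks, skipping 1x1 downsample/shortcut
# convolutions whose widths are tied to the skip connection, then train the
# pruned student against an uncompressed teacher with a KD loss.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.nn.utils.prune as prune
from torchvision.models import resnet18


def prune_safe_layers(model: nn.Module, amount: float = 0.3) -> None:
    """Apply L1-norm filter pruning only to 3x3 convolutions; 1x1 shortcut
    convolutions are skipped to avoid the dimensionality dependency that
    would otherwise break the ResNet structure."""
    for _, module in model.named_modules():
        if isinstance(module, nn.Conv2d) and module.kernel_size == (3, 3):
            # dim=0 removes whole output filters; this masks them to zero,
            # physically shrinking the layers would require rebuilding them.
            prune.ln_structured(module, name="weight", amount=amount, n=1, dim=0)


def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      targets: torch.Tensor,
                      T: float = 4.0,
                      alpha: float = 0.7) -> torch.Tensor:
    """Blend a soft-target KL term (teacher -> student) with cross-entropy on
    the ground-truth labels; a common KD formulation, assumed here."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, targets)
    return alpha * soft + (1.0 - alpha) * hard


if __name__ == "__main__":
    teacher = resnet18(weights=None).eval()   # uncompressed baseline model
    student = resnet18(weights=None)          # copy to be pruned and distilled
    prune_safe_layers(student, amount=0.3)

    x = torch.randn(8, 3, 224, 224)
    y = torch.randint(0, 1000, (8,))
    with torch.no_grad():
        t_logits = teacher(x)
    loss = distillation_loss(student(x), t_logits, y)
    loss.backward()
    print(f"distillation loss: {loss.item():.4f}")
```

In a real training loop the loss above would be minimized over the training set while the pruning masks stay fixed, so the distillation compensates for the capacity removed by pruning.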

