Article

Dynamical Channel Pruning by Conditional Accuracy Change for Deep Neural Networks

Journal

IEEE Transactions on Neural Networks and Learning Systems

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TNNLS.2020.2979517

Keywords

Training; Channel estimation; Logic gates; Computer architecture; Convolution; Biological neural networks; Automation; Conditional accuracy change (CAC); direct criterion; dynamical channel pruning; neural network compression; structure shaping

Funding

  1. National Natural Science Foundation of China [61976209, 61721004]
  2. CAS International Collaboration Key Project
  3. Strategic Priority Research Program of CAS [XDB32040200]


This article introduces a dynamical channel pruning method that prunes unimportant channels during training rather than from a pretrained model, reducing the parameters and computations of a network while maintaining accuracy competitive with or better than the unpruned baseline. By shaping a more desirable network structure, it can also yield notable accuracy improvements.
Channel pruning is an effective technique that has been widely applied to deep neural network compression. However, many existing methods prune a pretrained model, which results in repetitive pruning and fine-tuning cycles. In this article, we propose a dynamical channel pruning method that prunes unimportant channels at an early stage of training. Rather than relying on indirect criteria (e.g., weight norm, absolute weight sum, or reconstruction error) to guide connection or channel pruning, we design criteria directly related to the final accuracy of the network to evaluate the importance of each channel. Specifically, a channelwise gate is designed to randomly enable or disable each channel so that the conditional accuracy change (CAC) can be estimated under the condition that each channel is disabled. Practically, we construct two effective and efficient criteria to dynamically estimate CAC at each training iteration; thus, unimportant channels can be gradually pruned as training proceeds. Finally, extensive experiments on multiple data sets (i.e., ImageNet, CIFAR, and MNIST) with various networks (i.e., ResNet, VGG, and MLP) demonstrate that the proposed method effectively reduces the parameters and computations of the baseline network while yielding higher or competitive accuracy. Interestingly, if we Double the initial Channels and then Prune Half (DCPH) of them back to the baseline's width, the resulting network enjoys a remarkable performance improvement by shaping a more desirable structure.
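As a rough illustration of the channelwise-gate idea summarized above, the sketch below randomly disables channels during training, keeps a running loss-based score per channel as a stand-in for its conditional accuracy change (CAC), and permanently prunes the channels whose removal affects the loss the least. This is a minimal PyTorch-style sketch written for this page, not the authors' code; names such as ChannelGate, cac_scores, and keep_prob, and the specific score update, are illustrative assumptions rather than the two criteria actually proposed in the paper.

```python
import torch
import torch.nn as nn


class ChannelGate(nn.Module):
    """Randomly enables/disables channels and tracks a CAC-like score per channel."""

    def __init__(self, num_channels: int, keep_prob: float = 0.9):
        super().__init__()
        self.keep_prob = keep_prob
        # 1 = channel still alive, 0 = permanently pruned.
        self.register_buffer("alive", torch.ones(num_channels))
        # Running loss observed while each channel was disabled
        # (a crude proxy for its conditional accuracy change).
        self.register_buffer("cac_scores", torch.zeros(num_channels))
        # Gate sampled at the most recent forward pass.
        self.register_buffer("last_gate", torch.ones(num_channels))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, channels, height, width).
        if self.training:
            # Sample an independent on/off decision for every still-alive channel.
            random_gate = (torch.rand_like(self.alive) < self.keep_prob).float()
            self.last_gate = random_gate * self.alive
        else:
            self.last_gate = self.alive
        return x * self.last_gate.view(1, -1, 1, 1)

    @torch.no_grad()
    def update_scores(self, loss_value: float, momentum: float = 0.99) -> None:
        # Channels switched off this iteration "absorb" the observed loss:
        # a channel whose absence coincides with low loss is likely unimportant.
        disabled = (self.last_gate == 0) & (self.alive == 1)
        self.cac_scores[disabled] = (
            momentum * self.cac_scores[disabled] + (1.0 - momentum) * loss_value
        )

    @torch.no_grad()
    def prune(self, num_to_prune: int) -> None:
        # Permanently disable the channels whose absence hurt the loss the least.
        scores = self.cac_scores.clone()
        scores[self.alive == 0] = float("inf")  # never re-select pruned channels
        prune_idx = torch.argsort(scores)[:num_to_prune]
        self.alive[prune_idx] = 0.0
```

In a setup along these lines, such a gate would sit after each convolutional layer, update_scores would be called once per training iteration with the observed loss, and prune would be invoked on a schedule during the early epochs, mirroring the gradual pruning-during-training procedure described in the abstract.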
