Article

Accelerating deep neural network filter pruning with mask-aware convolutional computations on modern CPUs

Journal

NEUROCOMPUTING
Volume 505, Pages 375-387

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2022.07.006

Keywords

Deep learning systems; Neural network compression; Filter pruning

Funding

  1. National Key R&D Program of China [2021ZD0110101]
  2. National Natural Science Foundation of China [61872043]
  3. CCF-Huawei Populus Grove Fund
  4. Fundamental Research Funds for the Central Universities


Abstract

Filter pruning, a representative model compression technique, has been widely used to compress and accelerate sophisticated deep neural networks on resource-constrained platforms. Nevertheless, most studies focus on reducing the cost of model inference, whereas the heavy burden of the pruning optimization process is neglected. In this paper, we propose MaskACC, a mask-aware convolutional computation method, which accelerates the prevailing mask-based filter pruning process on modern CPU platforms. MaskACC dynamically reorganizes the tensors used in convolutions with the mask information to avoid unnecessary computations, thereby improving the computational efficiency of the pruning process. Evaluation with state-of-the-art neural network models on CPU cloud platforms demonstrates the effectiveness of our method, which achieves up to 1.61x speedup under commonly-used pruning rates, compared to conventional computations. (c) 2022 Elsevier B.V. All rights reserved.
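The abstract does not give implementation details, but the core idea of avoiding computation on masked-out filters can be sketched with NumPy for the special case of a 1x1 convolution, which reduces to a matrix multiply. This is an illustrative sketch, not the authors' MaskACC implementation; all function and variable names here are hypothetical. The baseline multiplies with the fully masked weight and wastes work on zeroed filters, while the mask-aware version gathers only surviving filters into a compact tensor, runs a smaller dense multiply, and scatters the results back:

```python
import numpy as np

def masked_conv1x1_full(W, mask, x):
    # Baseline: apply the mask to the weights, then compute the full product.
    # Rows of W belonging to pruned filters are zero but still multiplied.
    return (W * mask[:, None]) @ x

def masked_conv1x1_compact(W, mask, x):
    # Mask-aware sketch: gather unpruned filters, compute a smaller GEMM,
    # then scatter results back so pruned output channels stay exactly zero.
    active = np.flatnonzero(mask)               # indices of surviving filters
    y = np.zeros((W.shape[0], x.shape[1]))      # pruned rows remain zero
    y[active] = W[active] @ x                   # dense multiply on compacted weight
    return y

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16))    # 8 output filters, 16 input channels
x = rng.standard_normal((16, 32))   # 32 flattened spatial positions
mask = np.array([1, 0, 1, 1, 0, 0, 1, 1])  # 5 of 8 filters survive pruning

# Both paths produce identical outputs; the compact path does ~5/8 of the work.
assert np.allclose(masked_conv1x1_full(W, mask, x),
                   masked_conv1x1_compact(W, mask, x))
```

In a real k x k convolution the same gather/compute/scatter pattern applies along the filter dimension of the weight tensor, and the speedup grows with the pruning rate, consistent with the up-to-1.61x figure reported in the abstract.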

