Journal
NEUROCOMPUTING
Volume 505, Pages 375-387
Publisher
ELSEVIER
DOI: 10.1016/j.neucom.2022.07.006
Keywords
Deep learning systems; Neural network compression; Filter pruning
Funding
- National Key R&D Program of China [2021ZD0110101]
- National Natural Science Foundation of China [61872043]
- CCF-Huawei Populus Grove Fund
- Fundamental Research Funds for the Central Universities
This paper proposes a method called MaskACC, which accelerates the mask-based filter pruning process on modern CPU platforms and improves the computational efficiency of the pruning process.
Filter pruning, a representative model compression technique, has been widely used to compress and accelerate sophisticated deep neural networks on resource-constrained platforms. Nevertheless, most studies focus on reducing the cost of model inference, whereas the heavy burden of the pruning optimization process itself is neglected. In this paper, we propose MaskACC, a mask-aware convolutional computation method that accelerates the prevailing mask-based filter pruning process on modern CPU platforms. MaskACC dynamically reorganizes the tensors used in convolutions according to the mask information to avoid unnecessary computations, thereby improving the computational efficiency of the pruning process. Evaluation with state-of-the-art neural network models on CPU cloud platforms demonstrates the effectiveness of our method, which achieves up to 1.61x speedup under commonly used pruning rates compared to conventional computations. (c) 2022 Elsevier B.V. All rights reserved.
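To make the core idea concrete: in mask-based pruning, each output filter carries a binary mask, and a conventional implementation computes the full convolution and then zeroes out the pruned channels. The abstract's "tensor reorganization" can be sketched as gathering only the active filters before convolving and scattering the results back. The NumPy sketch below is an illustration of that idea under assumed tensor shapes, not the authors' actual implementation; the function names are hypothetical.

```python
import numpy as np

def conv2d_full(x, w):
    # Naive 2-D convolution with valid padding.
    # x: (C_in, H, W) input; w: (C_out, C_in, kH, kW) filters.
    c_out, c_in, kh, kw = w.shape
    h_out, w_out = x.shape[1] - kh + 1, x.shape[2] - kw + 1
    y = np.zeros((c_out, h_out, w_out))
    for o in range(c_out):
        for i in range(h_out):
            for j in range(w_out):
                y[o, i, j] = np.sum(x[:, i:i + kh, j:j + kw] * w[o])
    return y

def masked_conv_naive(x, w, mask):
    # Conventional mask-based pruning: compute every filter,
    # then multiply the pruned output channels by zero.
    return conv2d_full(x, w) * mask[:, None, None]

def masked_conv_reorganized(x, w, mask):
    # Mask-aware sketch: gather only the active filters, convolve the
    # smaller weight tensor, and scatter results into a zero output.
    active = np.nonzero(mask)[0]
    h_out = x.shape[1] - w.shape[2] + 1
    w_out = x.shape[2] - w.shape[3] + 1
    y = np.zeros((w.shape[0], h_out, w_out))
    if active.size:
        y[active] = conv2d_full(x, w[active])
    return y
```

Both functions produce identical outputs, but the reorganized variant performs work proportional to the number of surviving filters, which is the intuition behind the reported CPU speedup at typical pruning rates.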