Article

QLP: Deep Q-Learning for Pruning Deep Neural Networks

Publisher

IEEE (Institute of Electrical and Electronics Engineers, Inc.)
DOI: 10.1109/TCSVT.2022.3167951

Keywords

Training; Neural networks; Indexes; Computer architecture; Deep learning; Biological neural networks; Task analysis; Deep neural network compression; pruning; deep reinforcement learning

Funding

  1. Agency for Science, Technology and Research (A*STAR) under its AME Programmatic Funds [A1892b0026, A19E3b0099]

Abstract

We present a novel deep Q-learning based method, QLP, for pruning deep neural networks (DNNs). Given a DNN, our method intelligently determines favorable layer-wise sparsity ratios, which are then implemented via unstructured, magnitude-based weight pruning. In contrast to previous reinforcement learning (RL) based pruning methods, our method is not forced to prune a DNN within a single, sequential pass from the first layer to the last. It visits each layer multiple times and prunes it a little further at each visit, achieving superior granular pruning. Moreover, our method is not restricted to a subset of actions within the feasible action space: it has the flexibility to execute the whole range of sparsity ratios (0%–100%) for each layer, which enables aggressive pruning without compromising accuracy. Furthermore, our method does not require a complex state definition; it features a simple, generic definition composed of only the index and the density of the layers, which reduces the computational cost of observing the state at each interaction. Lastly, our method utilizes a carefully designed curriculum that enables learning targeted policies for each sparsity regime, which helps deliver better accuracy, especially at high sparsity levels. We conduct batched performance tests at compelling sparsity levels (up to 98%), present extensive ablation studies to justify our RL-related design choices, and compare our method with the state of the art, including RL-based and other pruning methods. Our method sets new state-of-the-art results in most of the experiments with ResNet-32 and ResNet-56 on the CIFAR-10 dataset, as well as ResNet-50 and MobileNet-v1 on the ILSVRC2012 (ImageNet) dataset.
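The abstract names two concrete mechanisms: unstructured, magnitude-based weight pruning of each layer to an agent-chosen sparsity ratio, and a simple state built from the layer index and per-layer densities. The PyTorch sketch below illustrates only those two pieces under stated assumptions; it is not the paper's implementation. The function names (prune_layer_by_magnitude, observe_state), the toy layers, and the placeholder sparsity schedule standing in for the learned action are hypothetical, and the actual DQN agent, reward, and curriculum are omitted.

```python
# Minimal sketch of magnitude-based pruning with a (layer index, densities)
# state, as described in the abstract. All names here are hypothetical;
# the QLP agent, reward, and curriculum are not reproduced.
import torch
import torch.nn as nn

def prune_layer_by_magnitude(layer: nn.Module, sparsity: float) -> None:
    """Zero out the smallest-magnitude fraction `sparsity` of the layer's weights."""
    with torch.no_grad():
        w = layer.weight
        k = int(sparsity * w.numel())
        if k == 0:
            return
        # The k-th smallest absolute value serves as the pruning threshold;
        # already-zeroed weights count as smallest, so cumulative targets work.
        threshold = torch.kthvalue(w.abs().flatten(), k).values
        w.mul_((w.abs() > threshold).float())

def observe_state(layers, layer_index: int) -> torch.Tensor:
    """Simple, generic state: normalized layer index plus every layer's density."""
    densities = [(l.weight != 0).float().mean().item() for l in layers]
    return torch.tensor([layer_index / len(layers)] + densities)

# Toy model: revisit each layer and prune a little more on each visit,
# mimicking the multi-pass, incremental pruning the abstract describes.
layers = [nn.Conv2d(3, 16, 3), nn.Conv2d(16, 32, 3)]
for visit in range(3):
    for i, layer in enumerate(layers):
        state = observe_state(layers, i)      # what a DQN would condition on
        target_sparsity = 0.2 * (visit + 1)   # placeholder for the learned action
        prune_layer_by_magnitude(layer, target_sparsity)
        print(f"visit {visit}, layer {i}: density "
              f"{(layer.weight != 0).float().mean():.2f}")
```

In a full RL setup, target_sparsity would come from the Q-network's action choice given the observed state rather than a fixed schedule; the sketch only shows why the cheap state definition and incremental per-visit pruning compose naturally.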
