Article

Deeper Weight Pruning Without Accuracy Loss in Deep Neural Networks: Signed-Digit Representation-Based Approach

Publisher

IEEE - Institute of Electrical and Electronics Engineers, Inc.
DOI: 10.1109/TCAD.2021.3064914

Keywords

Parallel processing; Hardware; Neural networks; Acceleration; Performance evaluation; Computational modeling; Shortest path problem; Bit-level parallelism; deep neural networks (DNNs); signed-digit representation; weight pruning

Funding

  1. IC Design Education Center (IDEC), South Korea

Abstract

In addition to word-level weight pruning, which excludes zero-valued weights from the neural network inference computation, it has recently been demonstrated that bit-level weight pruning, which excludes the 0-bits in the weight value representation regardless of whether the weight values themselves are zero, is very effective for further accelerating neural network computation without accuracy loss. This work overcomes an inherent limitation of bit-level weight pruning: the maximal computation speedup is bounded by the total number of nonzero bits of the weights, and this bound has invariably been regarded as uncontrollable (i.e., constant) for the neural network to be pruned. Specifically, based on signed-digit encoding, this work 1) proposes a transformation technique that converts the two's-complement representation of every weight into a set of signed-digit representations with the minimal number of essential (i.e., nonzero) bits; 2) formulates the problem of selecting the signed-digit representations of weights that maximize the parallelism of bit-level multiplication on the weights as a shortest path problem, so as to achieve maximal digit-index by digit-index (i.e., columnwise) compression of the weights, and solves it efficiently with an approximation algorithm; 3) proposes a novel supporting acceleration architecture (DWP) that requires no nontrivial additional hardware; and 4) proposes a variant of DWP that supports bit-level parallel multiplication while predicting a tight worst-case latency for the parallel processing. Experiments on several representative models with the ImageNet dataset show that the proposed approach reduces the number of essential bits by 69% on AlexNet, 74% on VGG-16, and 68% on ResNet-152, whereby the accelerator reduces the inference computation time by up to 3.57x over conventional bit-level weight pruning.
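The minimal-essential-bit conversion in step 1 is closely related to classical canonical signed-digit (CSD, also called non-adjacent form) recoding, which is known to minimize the number of nonzero digits among all signed-digit representations of a value. A minimal sketch of that standard recoding follows; it is an illustration of the general idea only, not the authors' transformation, which produces a whole set of candidate minimal representations rather than a single one.

```python
def to_csd(w):
    """Recode integer w into canonical signed-digit (CSD) form.

    Returns a list of digits in {-1, 0, +1}, least-significant first,
    with the minimal number of nonzero (essential) digits for w.
    """
    digits = []
    while w != 0:
        if w & 1:
            # Pick +1 or -1 so that (w - d) is divisible by 4; this
            # forces the next digit to be 0 (the non-adjacent property).
            d = 2 - (w & 3)  # w mod 4 == 1 -> +1, w mod 4 == 3 -> -1
            w -= d
        else:
            d = 0
        digits.append(d)
        w >>= 1
    return digits

# Example: 7 = 111 in binary (3 nonzero bits) becomes
# 100(-1) in signed digits, i.e. 8 - 1, with only 2 essential bits.
csd = to_csd(7)
value = sum(d << i for i, d in enumerate(csd))
```

Reducing essential bits this way directly loosens the speedup bound the paper targets, since fewer nonzero digits means fewer bit-level multiply-accumulate steps per weight.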

