4.6 Article

Gradient Estimation for Ultra Low Precision POT and Additive POT Quantization

Journal

IEEE ACCESS
Volume 11, Issue -, Pages 61264-61272

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2023.3286299

Keywords

Deep neural network; non-uniform quantization; gradient estimation

Deep learning networks achieve high accuracy on classification tasks but are usually too computation- and memory-intensive for power-constrained devices. Low-bit quantization is an effective technique to reduce this burden, but it introduces quantization error and lowers classification accuracy. We propose a quantization error-aware gradient estimation method for power-of-two and additive power-of-two quantization, which minimizes quantization error by aligning the weight update with the projection steps. We also apply per-channel quantization to limit the accuracy degradation caused by the rigid resolution property of power-of-two quantization. This approach achieves comparable accuracy even at ultra-low bit precision.
Deep learning networks achieve high accuracy for many classification tasks in computer vision and natural language processing. Because these models are usually over-parameterized, their computation and memory requirements are unsuitable for power-constrained devices. One effective technique to reduce this burden is low-bit quantization. However, the quantization error it introduces causes a drop in classification accuracy and calls for rethinking the design. To benefit from the hardware-friendly power-of-two (POT) and additive POT quantization, we explore various gradient estimation methods and propose a quantization error-aware gradient estimation that steers the weight update to be as close to the projection steps as possible. The clipping or scaling coefficients of the quantization scheme are learned jointly with the model parameters to minimize quantization error. We also apply per-channel quantization to POT and additive POT quantized models to minimize the accuracy degradation due to the rigid resolution property of POT quantization. We show that comparable accuracy can be achieved when using the proposed gradient estimation for POT quantization, even at ultra-low bit precision.
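
To make the terms in the abstract concrete, the following is a minimal sketch of POT projection and per-channel scaling in plain Python. It is not the authors' implementation. The level set (zero plus 2^-k for k = 0 .. 2^(bits-1) - 2), the choice of the scale as the per-row maximum magnitude, and the function names `pot_levels`, `pot_quantize`, `per_channel_quantize`, and `ste_grad` are all assumptions made for illustration; the abstract states that the clipping or scaling coefficients are learned, which this sketch does not do. The gradient shown is the ordinary clipped straight-through estimator, a common baseline, not the quantization error-aware estimator the paper proposes.

```python
import math

def pot_levels(bits):
    """Assumed level set: 0 plus 2**-k for k = 0 .. 2**(bits-1) - 2."""
    n_exp = 2 ** (bits - 1) - 1
    return [0.0] + [2.0 ** -k for k in range(n_exp)]

def pot_quantize(w, alpha, bits):
    """Project w onto alpha * {0, +-2**-k} by nearest-level rounding."""
    x = max(-1.0, min(1.0, w / alpha))  # clip the scaled weight to [-1, 1]
    mag = min(pot_levels(bits), key=lambda q: abs(abs(x) - q))
    return math.copysign(mag, x) * alpha if mag else 0.0

def per_channel_quantize(weights, bits):
    """Quantize each channel (row) with its own scale.

    The scale here is simply max |w| of the row (an assumption);
    an all-zero row falls back to a scale of 1.0.
    """
    out = []
    for row in weights:
        alpha = max(abs(v) for v in row) or 1.0
        out.append([pot_quantize(v, alpha, bits) for v in row])
    return out

def ste_grad(w, alpha, upstream):
    """Baseline clipped straight-through estimator (not the paper's method):
    pass the upstream gradient through inside the clipping range, else 0."""
    return upstream if abs(w) <= alpha else 0.0

weights = [[0.8, 0.2], [0.02, -0.01]]
print(pot_levels(3))                     # [0.0, 1.0, 0.5, 0.25]
print(pot_quantize(0.02, 0.8, 3))        # small weight under a shared scale
print(per_channel_quantize(weights, 3))  # each row uses its own scale
```

With a single scale of 0.8 shared across the tensor, the weight 0.02 falls below the smallest nonzero level and is projected to zero, whereas the per-row scale keeps it representable. This is one way to read the "rigid resolution" issue that the abstract says per-channel quantization is meant to reduce.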
