Article

MedQ: Lossless ultra-low-bit neural network quantization for medical image segmentation

Journal

MEDICAL IMAGE ANALYSIS
Volume 73, Issue -, Pages -

Publisher

ELSEVIER
DOI: 10.1016/j.media.2021.102200

Keywords

Neural network quantization; Image segmentation; Deep learning; Model acceleration; Model compression

Funding

  1. K. S. Lo Foundation
  2. Research Grants Council of Hong Kong General Research Fund [16245016]

This paper introduces a novel CNN quantization framework that compresses deep models to extremely low bitwidth while maintaining high performance. By combining an optimized quantizer, a radical residual connection scheme, a tanh-based derivative function, and a distributional loss, the framework outperforms state-of-the-art quantization methods and achieves lossless performance with ternary quantization on two 3D segmentation datasets.
Implementing deep convolutional neural networks (CNNs) with boolean arithmetic is ideal for eliminating the notoriously high computational expense of deep learning models. Although lossless model compression via weight-only quantization has been achieved in previous work, how to reduce the computation precision of CNNs without losing performance remains an open problem, especially for medical image segmentation tasks, where data dimensionality is high and annotations are scarce. This paper presents a novel CNN quantization framework that can squeeze a deep model (both parameters and activations) to an extremely low bitwidth, e.g., 1 to 2 bits, while maintaining its high performance. In the new method, we first design a strong baseline quantizer with an optimizable quantization range. Then, to relieve the back-propagation difficulty caused by the discontinuous quantization function, we design a radical residual connection scheme that allows gradients to flow freely through every quantized layer. Moreover, a tanh-based derivative function is used to further boost gradient flow, and a distributional loss is employed to regularize the model output. Extensive experiments and ablation studies are conducted on two well-established public 3D segmentation datasets, i.e., BRATS2020 and LiTS. Experimental results show that our framework not only significantly outperforms state-of-the-art quantization approaches, but also achieves lossless performance on both datasets with ternary (2-bit) quantization. © 2021 Elsevier B.V. All rights reserved.
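
To make the first two ingredients of the abstract concrete, the following is a minimal sketch of a ternary quantizer with an adjustable clipping range, paired with a tanh-based surrogate derivative for the backward pass. It is not the authors' implementation: the function names, the thresholds at half the range, the sharpness parameter k, and the exact tanh staircase are all assumptions chosen for illustration, and the abstract does not specify the paper's actual formulas.

```python
import math


def ternary_quantize(x, alpha):
    """Map x to one of {-alpha, 0, +alpha}.

    alpha stands in for the optimizable quantization range (assumed form).
    """
    u = max(-1.0, min(1.0, x / alpha))  # normalize by the range, clip to [-1, 1]
    # Thresholds at +/-0.5 split the normalized range into three bins.
    if u > 0.5:
        level = 1.0
    elif u < -0.5:
        level = -1.0
    else:
        level = 0.0
    return alpha * level


def tanh_surrogate_grad(x, alpha, k=4.0):
    """Derivative w.r.t. x of a smooth tanh staircase (hypothetical surrogate).

    smooth(x) = alpha * 0.5 * (tanh(k*(u - 0.5)) + tanh(k*(u + 0.5))), u = x / alpha
    d smooth / dx = 0.5 * k * (sech^2(k*(u - 0.5)) + sech^2(k*(u + 0.5)))
    """
    u = x / alpha

    def sech2(z):
        return 1.0 - math.tanh(z) ** 2

    return 0.5 * k * (sech2(k * (u - 0.5)) + sech2(k * (u + 0.5)))


weights = [-0.9, -0.2, 0.1, 0.7]
quantized = [ternary_quantize(w, alpha=1.0) for w in weights]
grads = [tanh_surrogate_grad(w, alpha=1.0) for w in weights]
```

Unlike the true derivative of the step function, which is zero almost everywhere, this surrogate is strictly positive and peaks near the quantization thresholds, so every input still receives some gradient signal during back-propagation.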
