Article

QTTNet: Quantized tensor train neural networks for 3D object and video recognition

Journal

NEURAL NETWORKS
Volume 141, Issue -, Pages 420-432

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.neunet.2021.05.034

Keywords

3DCNN; Tensor train decomposition; Neural network compression; Quantization; 8 bit inference

Funding

  1. National Key R&D Program of China [2018AAA0102604, 2018YFE0200200]
  2. National Science Foundation of China [61876215]
  3. Beijing Academy of Artificial Intelligence (BAAI), China
  4. Institute for Guo Qiang, Tsinghua University
  5. CAAI-Huawei MindSpore Open Fund, China
  6. Beijing Science and Technology Program, China [Z191100007519009]
  7. Open project of Zhejiang laboratory, China
  8. Science and Technology Major Project of Guangzhou, China [202007030006]


This article introduces a training framework for three-dimensional convolutional neural networks called QTTNet, which combines tensor train decomposition and data quantization to further shrink the model size and reduce memory and time costs. Experimental results demonstrate the effectiveness and competitiveness of this method in compressing 3D object and video recognition models.
Relying on the rapidly increasing capacity of computing clusters and hardware, convolutional neural networks (CNNs) have been successfully applied in various fields and have achieved state-of-the-art results. Despite these exciting developments, training and inference for a large-scale CNN model still incur a huge memory cost, making such models hard to deploy widely on resource-limited portable devices. To address this problem, we establish a training framework for three-dimensional convolutional neural networks (3DCNNs) named QTTNet that combines tensor train (TT) decomposition and data quantization to further shrink the model size and reduce memory and time costs. Through this framework, we can fully exploit the strength of TT in reducing the number of trainable parameters and the advantage of quantization in decreasing the bit-width of data, compressing 3DCNN models substantially with little accuracy degradation. In addition, because all parameters involved in the inference process are quantized to low bit-width, including TT-cores, activations, and batch normalizations, the proposed method naturally saves memory and time. Experimental results on compressing 3DCNNs for 3D object and video recognition on the ModelNet40, UCF11, and UCF50 datasets verify the effectiveness of the proposed method. The best compression ratio we obtained is up to nearly 180x with performance competitive with other state-of-the-art approaches. Moreover, the total bytes of our QTTNet models on the ModelNet40 and UCF11 datasets can be 1000x lower than those of some typical practices such as MVCNN. (C) 2021 Published by Elsevier Ltd.
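The two ingredients the abstract combines can be illustrated with a minimal sketch: the standard TT-SVD algorithm (successive truncated SVDs) to factor a weight tensor into TT-cores, plus generic uniform 8-bit affine quantization of each core. This is not the authors' QTTNet code; all function names and the rank/shape choices below are illustrative assumptions.

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Factor a full tensor into TT-cores via successive truncated SVDs (TT-SVD)."""
    shape = tensor.shape
    cores, rank = [], 1
    mat = tensor.reshape(shape[0], -1)
    for k in range(len(shape) - 1):
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        r_new = min(max_rank, len(s))
        cores.append(u[:, :r_new].reshape(rank, shape[k], r_new))
        # Fold the remaining factor and move the next mode into the row index.
        mat = (s[:r_new, None] * vt[:r_new]).reshape(r_new * shape[k + 1], -1)
        rank = r_new
    cores.append(mat.reshape(rank, shape[-1], 1))
    return cores

def tt_to_full(cores):
    """Contract TT-cores back into the full tensor."""
    full = cores[0]
    for core in cores[1:]:
        full = np.tensordot(full, core, axes=(-1, 0))
    return full.reshape([c.shape[1] for c in cores])

def quantize_uint8(x):
    """Uniform affine quantization to 8 bits (a generic scheme, not QTTNet's exact one)."""
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / 255.0 or 1.0
    q = np.round((x - lo) / scale).astype(np.uint8)
    return q, scale, lo

def dequantize(q, scale, lo):
    return q.astype(np.float32) * scale + lo

# Demo: a 6x6x6 tensor that is exactly TT-rank 2 is recovered from far fewer parameters.
rng = np.random.default_rng(0)
g1 = rng.normal(size=(1, 6, 2))
g2 = rng.normal(size=(2, 6, 2))
g3 = rng.normal(size=(2, 6, 1))
target = tt_to_full([g1, g2, g3])
cores = tt_svd(target, max_rank=2)
full_params = target.size                   # 216 entries in the dense tensor
tt_params = sum(c.size for c in cores)      # 48 entries across the three TT-cores
```

Storing the quantized cores (`quantize_uint8` applied per core) then multiplies the savings: the parameter-count reduction from TT comes on top of the 4x byte reduction from 32-bit floats to 8-bit integers, which is the compounding effect the abstract attributes to QTTNet.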

