Proceedings Paper

4-bit Quantization of LSTM-based Speech Recognition Models

Journal

INTERSPEECH 2021
Volume -, Issue -, Pages 2586-2590

Publisher

ISCA (International Speech Communication Association)
DOI: 10.21437/Interspeech.2021-1962

Keywords

LSTM; HMM; RNN-T; quantization; INT4

Abstract

The study investigates the impact of aggressive low-precision representations of weights and activations in two families of large LSTM-based architectures for Automatic Speech Recognition. It shows that, with an appropriate choice of quantizers and initializations adapted to the local properties of the network, minimal accuracy loss is achievable while limiting computational time. The quantization strategy demonstrated in the study limits the degradation of 4-bit inference to 1.3% on the more challenging RNN-T models.
We investigate the impact of aggressive low-precision representations of weights and activations in two families of large LSTM-based architectures for Automatic Speech Recognition (ASR): hybrid Deep Bidirectional LSTM - Hidden Markov Models (DBLSTM-HMMs) and Recurrent Neural Network - Transducers (RNN-Ts). Using a 4-bit integer representation, a naive quantization approach applied to the LSTM portion of these models results in significant Word Error Rate (WER) degradation. On the other hand, we show that minimal accuracy loss is achievable with an appropriate choice of quantizers and initializations. In particular, we customize quantization schemes depending on the local properties of the network, improving recognition performance while limiting computational time. We demonstrate our solution on the Switchboard (SWB) and CallHome (CH) test sets of the NIST Hub5-2000 evaluation. DBLSTM-HMMs trained with 300 or 2000 hours of SWB data achieve < 0.5% and < 1% average WER degradation, respectively. On the more challenging RNN-T models, our quantization strategy limits degradation in 4-bit inference to 1.3%.
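The abstract contrasts a naive 4-bit scheme with quantizers and initializations customized per layer. As a rough illustration of what a naive symmetric, per-tensor fake quantization of an LSTM weight matrix looks like, the sketch below rounds weights to a 16-level signed grid and dequantizes them for simulated inference. This is not the authors' method: the function name, max-abs scale rule, and tensor shapes are illustrative assumptions only.

```python
import numpy as np

def quantize_int4_symmetric(w, num_bits=4):
    """Fake-quantize a weight tensor to a symmetric signed integer grid.

    Illustrative sketch only: the paper chooses quantizers and
    initializations per layer, which is not reproduced here.
    """
    qmax = 2 ** (num_bits - 1) - 1                 # 7 for INT4
    scale = np.max(np.abs(w)) / qmax               # single per-tensor scale from max magnitude
    if scale == 0:
        return w.copy()
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)  # integers in [-8, 7]
    return q * scale                               # dequantize back to float for simulation

# Example: simulate 4-bit inference for one (hypothetical) LSTM weight matrix
rng = np.random.default_rng(0)
w_ih = rng.normal(scale=0.1, size=(4 * 512, 256))  # assumed input-to-hidden shape
w_q = quantize_int4_symmetric(w_ih)
print("mean abs quantization error:", np.mean(np.abs(w_ih - w_q)))
```

A single max-abs scale like this is the kind of naive per-tensor choice the abstract reports as causing significant WER degradation on LSTMs; the paper's contribution is selecting quantizers and initializations according to the local properties of the network instead.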
