Proceedings Paper

Knowledge Distillation from BERT Transformer to Speech Transformer for Intent Classification

Journal

INTERSPEECH 2021
Volume -, Issue -, Pages 4713-4717

Publisher

ISCA-INT SPEECH COMMUNICATION ASSOC
DOI: 10.21437/Interspeech.2021-402

Keywords

Spoken language understanding; speech to intent; knowledge distillation; transformer; BERT

Funding

  1. Agency for Science, Technology and Research (A*STAR) under its AME Programmatic Funding Scheme [A18A2b0046]
  2. Science and Engineering Research Council, Agency of Science, Technology and Research, Singapore, through the National Robotics Program [192 25 00054]

Abstract

End-to-end intent classification using speech has numerous advantages compared to the conventional pipeline approach using automatic speech recognition (ASR) followed by natural language processing modules. It attempts to predict intent from speech without using an intermediate ASR module. However, such an end-to-end framework suffers from the unavailability of large speech resources with high acoustic variation in spoken language understanding. In this work, we exploit the scope of the transformer distillation method, which is specifically designed for knowledge distillation from a transformer-based language model to a transformer-based speech model. In this regard, we leverage the reliable and widely used bidirectional encoder representations from transformers (BERT) model as a language model and transfer its knowledge to build an acoustic model for intent classification from speech. In particular, a multi-level transformer-based teacher-student model is designed, and knowledge distillation is performed across the attention and hidden sub-layers of different transformer layers of the student and teacher models. We achieve intent classification accuracies of 99.10% and 88.79% on the Fluent Speech corpus and the ATIS database, respectively. Further, the proposed method demonstrates better performance and robustness in acoustically degraded conditions compared to the baseline method.
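The multi-level distillation objective described in the abstract (matching attention maps and hidden states between mapped teacher and student transformer layers) can be sketched roughly as follows. This is an illustrative, TinyBERT-style formulation only: the layer mapping, loss weights `alpha`/`beta`, and the projection matrix `proj` bridging the student and teacher hidden dimensions are assumptions for the sketch, not the paper's exact configuration.

```python
import numpy as np

def mse(a, b):
    """Mean-squared error between two arrays of the same shape."""
    return float(np.mean((a - b) ** 2))

def transformer_distillation_loss(student_attn, teacher_attn,
                                  student_hidden, teacher_hidden,
                                  proj, alpha=1.0, beta=1.0):
    """Layer-wise distillation loss across attention and hidden sub-layers.

    student_attn / teacher_attn: lists of attention maps, one per mapped
        layer pair, each of shape (heads, seq_len, seq_len).
    student_hidden / teacher_hidden: lists of hidden states per mapped
        layer pair, shapes (seq_len, d_student) and (seq_len, d_teacher).
    proj: (d_student, d_teacher) matrix projecting student hidden states
        into the teacher's hidden space (a hypothetical learnable map).
    """
    # Attention transfer: match attention distributions layer by layer.
    attn_loss = sum(mse(sa, ta) for sa, ta in zip(student_attn, teacher_attn))
    # Hidden-state transfer: project student states, then match the teacher.
    hidden_loss = sum(mse(sh @ proj, th)
                      for sh, th in zip(student_hidden, teacher_hidden))
    return alpha * attn_loss + beta * hidden_loss

# Toy example: one mapped layer pair, 2 heads, sequence length 4.
rng = np.random.default_rng(0)
s_attn = [rng.random((2, 4, 4))]
t_attn = [rng.random((2, 4, 4))]
s_hid = [rng.random((4, 8))]    # student hidden size 8
t_hid = [rng.random((4, 16))]   # teacher (BERT-like) hidden size 16
proj = rng.random((8, 16))
loss = transformer_distillation_loss(s_attn, t_attn, s_hid, t_hid, proj)
```

In practice this loss would be minimized jointly with the intent-classification objective; here the random toy tensors merely show the shapes involved when a smaller speech transformer is aligned to a larger BERT teacher.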

