4.5 Article

Survey on bimodal speech emotion recognition from acoustic and linguistic information fusion

Journal

SPEECH COMMUNICATION
Volume 140, Issue -, Pages 11-28

Publisher

ELSEVIER
DOI: 10.1016/j.specom.2022.03.002

Keywords

Speech emotion recognition; Affective computing; Audiotextual information; Bimodal fusion; Information fusion

Funding

  1. New Energy and Industrial Technology Development Organization (NEDO), Japan [JPNP20006]
  2. Japan Advanced Institute of Science and Technology (JAIST)
  3. Institut Teknologi Sepuluh Nopember (ITS)

This paper presents a survey on bimodal emotion recognition that combines acoustic and linguistic information. It reviews five components of bimodal SER and presents major findings from commonly used datasets. The survey also proposes future research directions in this field.
Speech emotion recognition (SER) has traditionally relied on acoustic information alone. Acoustic features, commonly extracted per frame, are mapped to emotion labels using classifiers such as support vector machines in classical machine learning or multi-layer perceptrons in deep learning. Previous research has shown that acoustic-only SER suffers from many issues, chiefly low performance. Speech, however, carries linguistic information as well as acoustic information. Linguistic features can be extracted from text transcribed by an automatic speech recognition system, and fusing acoustic and linguistic information could improve SER performance. This paper presents a survey of work on bimodal emotion recognition that fuses acoustic and linguistic information. Five components of bimodal SER are reviewed: emotion models, datasets, features, classifiers, and fusion methods. Major findings, including state-of-the-art results and their methods on commonly used datasets, are also presented to give insight into current research and to help future work surpass these results. Finally, the survey identifies the remaining issues in bimodal SER research as directions for future work.
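
The abstract names fusion methods as one of the five reviewed components without giving details. As a rough illustration only, the sketch below shows two generic strategies for combining an acoustic and a linguistic stream: concatenating feature vectors before classification, and averaging the class probabilities of two separate classifiers. The label set, function names, and the weighting parameter are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

# Hypothetical label set, chosen only for illustration.
EMOTIONS = ["angry", "happy", "neutral", "sad"]


def early_fusion(acoustic: Sequence[float], linguistic: Sequence[float]) -> List[float]:
    """Feature-level fusion: concatenate the two feature vectors.

    The result would be fed to a single classifier (not shown here).
    """
    return list(acoustic) + list(linguistic)


def late_fusion(
    p_acoustic: Sequence[float],
    p_linguistic: Sequence[float],
    weight: float = 0.5,
) -> List[float]:
    """Decision-level fusion: weighted average of per-class probabilities.

    `weight` is the (assumed) share given to the acoustic classifier.
    """
    if len(p_acoustic) != len(p_linguistic):
        raise ValueError("both classifiers must score the same classes")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [weight * a + (1.0 - weight) * l for a, l in zip(p_acoustic, p_linguistic)]


def predict(probs: Sequence[float], labels: Sequence[str] = EMOTIONS) -> str:
    """Return the label with the highest fused probability."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return labels[best]


# Made-up classifier outputs for one utterance.
p_audio = [0.5, 0.2, 0.2, 0.1]  # acoustic model leans towards "angry"
p_text = [0.1, 0.6, 0.2, 0.1]   # linguistic model leans towards "happy"
fused = late_fusion(p_audio, p_text, weight=0.5)
label = predict(fused)
```

With equal weights, the linguistic stream's stronger vote decides the fused label in this toy example. In practice the weight, or the fusion model itself, would be tuned on held-out data.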
