Article

Multimodal emotion recognition from facial expression and speech based on feature fusion

Journal

MULTIMEDIA TOOLS AND APPLICATIONS
Volume 82, Issue 11, Pages 16359-16373

Publisher

SPRINGER
DOI: 10.1007/s11042-022-14185-0

Keywords

Multimodal emotion recognition; Attention mechanism; Deep learning; Feature fusion

This paper introduces a multimodal emotion recognition method that uses an attention mechanism to fuse audio and video features and to model their time series, effectively improving recognition accuracy.
Multimodal emotion recognition aims to identify an individual's emotional state from facial expression and speech information. Feature fusion can enrich the information carried by each modality and is therefore an important method for multimodal emotion recognition. However, fusion suffers from cross-modal synchronization problems and from overfitting caused by large feature dimensions. An attention mechanism is therefore introduced so that the network automatically attends to locally effective information; it performs both the audio-video feature fusion task and the temporal modeling task in the network. The main contributions are as follows: 1) a multi-head self-attention mechanism is used to fuse the audio and video features, avoiding the influence of prior information on the fusion results, and 2) a bidirectional gated recurrent unit (BiGRU) is used to model the time series of the fused features; furthermore, the autocorrelation coefficient along the time dimension is computed and used as an attention weight for fusion. Experimental results show that the adopted attention mechanism effectively improves the accuracy of multimodal emotion recognition.
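The abstract names two attention-based components: multi-head self-attention for audio-video feature fusion, and a BiGRU whose output is pooled with an autocorrelation-style temporal attention. The following is a minimal PyTorch sketch of such a pipeline; the feature dimensions, projection layers, and the mean-similarity form of the autocorrelation weighting are illustrative assumptions, not the authors' published architecture.

```python
# Minimal sketch of attention-based audio-video fusion followed by a BiGRU.
# Dimensions, layer sizes, and the pooling rule are illustrative assumptions.
import torch
import torch.nn as nn


class AttentionFusionGRU(nn.Module):
    def __init__(self, audio_dim=128, video_dim=512, d_model=256,
                 n_heads=4, n_classes=7):
        super().__init__()
        # Project both modalities into a shared embedding space.
        self.audio_proj = nn.Linear(audio_dim, d_model)
        self.video_proj = nn.Linear(video_dim, d_model)
        # Multi-head self-attention over the joint token sequence, so the
        # fusion weights are learned rather than fixed by prior information.
        self.self_attn = nn.MultiheadAttention(d_model, n_heads,
                                               batch_first=True)
        # Bidirectional GRU models the time series of the fused features.
        self.bigru = nn.GRU(d_model, d_model // 2, batch_first=True,
                            bidirectional=True)
        self.classifier = nn.Linear(d_model, n_classes)

    def forward(self, audio, video):
        # audio: (B, T, audio_dim), video: (B, T, video_dim)
        tokens = torch.cat([self.audio_proj(audio),
                            self.video_proj(video)], dim=1)   # (B, 2T, d)
        fused, _ = self.self_attn(tokens, tokens, tokens)     # (B, 2T, d)
        seq, _ = self.bigru(fused)                            # (B, 2T, d)

        # Autocorrelation-style temporal attention (an assumption about the
        # paper's "autocorrelation coefficient" weighting): score each time
        # step by its similarity to the sequence mean, then pool.
        mean = seq.mean(dim=1, keepdim=True)                  # (B, 1, d)
        scores = torch.softmax((seq * mean).sum(-1), dim=-1)  # (B, 2T)
        pooled = (seq * scores.unsqueeze(-1)).sum(dim=1)      # (B, d)
        return self.classifier(pooled)


if __name__ == "__main__":
    model = AttentionFusionGRU()
    audio = torch.randn(2, 50, 128)   # e.g. frame-level acoustic features
    video = torch.randn(2, 50, 512)   # e.g. per-frame facial features
    print(model(audio, video).shape)  # torch.Size([2, 7])
```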
