Article

Multi-modal emotion recognition using EEG and speech signals

Journal

COMPUTERS IN BIOLOGY AND MEDICINE
Volume 149

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.compbiomed.2022.105907

Keywords

Multi-modal emotion database; EEG emotion recognition; Speech emotion recognition; Physiological signal; Data fusion

Funding

  1. NSFC [61771403, N2018KF0157]


This paper presents a multi-modal emotion database (MED4) containing signals recorded from participants while they were influenced by video stimuli. The performance of baseline emotion recognition algorithms is evaluated, and fusion strategies are designed to improve accuracy and robustness. EEG signals outperform speech signals in emotion recognition, and fusion methods further enhance recognition performance in noisy environments.
Automatic Emotion Recognition (AER) is critical for naturalistic Human-Machine Interaction (HMI). Emotions can be detected through both external behaviors, e.g., tone of voice, and internal physiological signals, e.g., the electroencephalogram (EEG). In this paper, we first constructed a multi-modal emotion database, named Multi-modal Emotion Database with four modalities (MED4). MED4 consists of synchronously recorded EEG, photoplethysmography, speech and facial-image signals from participants influenced by video stimuli designed to induce happy, sad, angry and neutral emotions. The experiment was performed with 32 participants under two environmental conditions: a research lab with natural noise and an anechoic chamber. Four baseline algorithms were developed to verify the database and the performance of AER methods: Identification-vector + Probabilistic Linear Discriminant Analysis (I-vector + PLDA), Temporal Convolutional Network (TCN), Extreme Learning Machine (ELM) and Multi-Layer Perceptron (MLP). Furthermore, two fusion strategies, at the feature level and the decision level respectively, were designed to exploit both external and internal information about human status. The results showed that EEG signals yield higher emotion recognition accuracy than speech signals (88.92% in the anechoic room and 89.70% in the naturally noisy room vs. 64.67% and 58.92%, respectively). Fusion strategies that combine speech and EEG signals improve overall recognition accuracy by 25.92% over speech alone and 1.67% over EEG alone in the anechoic room, and by 31.74% and 0.96% in the naturally noisy room. Fusion methods also enhance the robustness of AER in noisy environments. The MED4 database will be made publicly available to encourage researchers worldwide to develop and validate advanced AER methods.
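The abstract names feature-level and decision-level fusion but does not spell out the combination rule here. The following is a minimal, hypothetical sketch of the two generic strategies, using scikit-learn MLP classifiers as stand-ins for the paper's baselines (I-vector + PLDA, TCN, ELM); the feature dimensions, synthetic data and equal decision weights are illustrative assumptions, not the authors' configuration.

import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n, d_eeg, d_speech = 200, 32, 20           # hypothetical sample/feature sizes
X_eeg = rng.normal(size=(n, d_eeg))        # stand-in EEG features
X_speech = rng.normal(size=(n, d_speech))  # stand-in speech features
y = rng.integers(0, 4, size=n)             # happy/sad/angry/neutral labels

# Feature-level fusion: concatenate modality features, train one classifier.
clf_feat = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
clf_feat.fit(np.hstack([X_eeg, X_speech]), y)

# Decision-level fusion: train one classifier per modality, then combine
# their per-class probability outputs with fixed weights.
clf_eeg = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0).fit(X_eeg, y)
clf_sp = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0).fit(X_speech, y)
w_eeg, w_sp = 0.5, 0.5                     # assumed weights, not the paper's values
proba = w_eeg * clf_eeg.predict_proba(X_eeg) + w_sp * clf_sp.predict_proba(X_speech)
pred = proba.argmax(axis=1)                # fused class decision

In practice the decision weights would be tuned per environment; shifting weight toward the EEG branch when the speech channel is noisy is one plausible way fusion could retain EEG-level accuracy while adding robustness, consistent with the trends reported in the abstract.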
