Article

Multimodal Emotion Recognition Using a Hierarchical Fusion Convolutional Neural Network

Journal

IEEE ACCESS
Volume 9, Issue -, Pages 7943-7951

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2021.3049516

Keywords

Feature extraction; Emotion recognition; Brain modeling; Electroencephalography; Physiology; Deep learning; Data models; Electroencephalogram; Hierarchical convolutional neural network; Multimodal emotion recognition; Multiscale features

Funding

  1. National Natural Science Foundation of China [61772252]
  2. Natural Science Foundation of Liaoning Province of China [2019-MS-216]
  3. Program for Liaoning Innovative Talents in University [LR2017044]

Abstract

In recent years, deep learning has been increasingly applied to multimodal emotion recognition in conjunction with electroencephalogram (EEG) signals. Given the complexity of recording EEG signals, some researchers have applied deep learning to discover new features for emotion recognition. In previous studies, convolutional neural network models were used to automatically extract features and perform emotion recognition, with promising results. However, the extraction of hierarchical features with convolutional neural networks for multimodal emotion recognition remains unexplored. Therefore, this paper proposes a hierarchical fusion convolutional neural network model that mines the potential information in the data by constructing different network hierarchical structures, extracting multiscale features, and using feature-level fusion to combine the weighted global features with manually extracted statistical features into the final feature vector. Binary classification experiments on the valence and arousal dimensions of the DEAP and MAHNOB-HCI data sets evaluate the performance of the proposed model. The results show that the proposed model achieves accuracies of 84.71% and 89.00% on the two data sets, respectively, indicating that it outperforms other deep learning emotion classification models in feature extraction and fusion.
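The core idea described in the abstract, extracting features at several scales and fusing them at the feature level with hand-crafted statistics, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the kernel sizes, the untrained random kernels, and the choice of statistics are all hypothetical stand-ins for the paper's learned hierarchical CNN branches.

```python
import numpy as np

def conv1d_relu(signal, kernel):
    """Valid 1-D convolution followed by ReLU, a stand-in for one conv layer."""
    out = np.convolve(signal, kernel, mode="valid")
    return np.maximum(out, 0.0)

def multiscale_features(signal, kernel_sizes=(3, 5, 7), seed=0):
    """One globally pooled feature per scale (hypothetical multiscale branch)."""
    rng = np.random.default_rng(seed)
    feats = []
    for k in kernel_sizes:
        kernel = rng.standard_normal(k)      # untrained weights, illustration only
        fmap = conv1d_relu(signal, kernel)
        feats.append(fmap.mean())            # global average pooling
    return np.array(feats)

def statistical_features(signal):
    """Manually extracted statistics, to be fused with the learned features."""
    return np.array([signal.mean(), signal.std(), signal.min(), signal.max()])

def fuse(signal):
    """Feature-level fusion: concatenate learned and statistical features."""
    return np.concatenate([multiscale_features(signal), statistical_features(signal)])

sig = np.sin(np.linspace(0, 8 * np.pi, 128))  # toy single-channel EEG-like signal
vec = fuse(sig)
print(vec.shape)  # (7,): 3 multiscale + 4 statistical features
```

The resulting vector is what a downstream classifier would consume; in the paper this fused vector feeds the binary valence/arousal classification.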

