Article

TWACapsNet: a capsule network with two-way attention mechanism for speech emotion recognition

Journal

SOFT COMPUTING
Volume -, Issue -, Pages -

Publisher

SPRINGER
DOI: 10.1007/s00500-023-08957-5

Keywords

Speech emotion recognition; Attention mechanism; Neural networks


This study proposes a Capsule Network with Two-Way Attention Mechanism (TWACapsNet) for the Speech Emotion Recognition (SER) problem. Experimental results demonstrate that the proposed method outperforms other neural network models on multiple SER datasets, and the combination of the two ways contributes to the higher and more stable performance of TWACapsNet.
Speech Emotion Recognition (SER) is a challenging task, and a typical convolutional neural network (CNN) does not handle speech data well directly, because a CNN tends to capture local information and ignore the overall characteristics of the signal. This paper proposes a Capsule Network with Two-Way Attention Mechanism (TWACapsNet for short) for the SER problem. TWACapsNet accepts spatial and spectral features as inputs, and convolutional and capsule layers process these two types of features separately, in two ways. Two attention mechanisms are then applied to enhance the information obtained from the spatial and spectral features. Finally, the results of the two ways are combined to form the final decision. Experiments show that the proposed method outperforms widely deployed neural network models on three typical SER datasets. Furthermore, the combination of the two ways contributes to the higher and more stable performance of TWACapsNet.
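
The two-way idea described above (process each feature type separately, weight it with attention, then combine the two results) can be illustrated with a toy sketch. This is not the authors' architecture: the capsule and convolutional layers are omitted entirely, and the dot-product attention, the linear classifier, the averaging fusion rule, and every name below (`attention_pool`, `branch_probs`, `fuse`) are assumptions made for illustration, since the abstract does not specify them.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames, query):
    """Assumed attention: weight each frame by softmax(frame . query), then sum."""
    scores = [sum(f_i * q_i for f_i, q_i in zip(f, query)) for f in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    return [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]


def branch_probs(frames, query, class_weights):
    """One 'way': attention pooling followed by a linear layer and softmax."""
    pooled = attention_pool(frames, query)
    logits = [sum(w_i * p_i for w_i, p_i in zip(w, pooled)) for w in class_weights]
    return softmax(logits)


def fuse(p_a, p_b):
    """Assumed fusion rule: average the class probabilities of the two ways."""
    return [(a + b) / 2 for a, b in zip(p_a, p_b)]


# Toy inputs: two frames of 2-dimensional features per way, three emotion classes.
spatial = [[1.0, 0.0], [0.0, 1.0]]
spectral = [[0.5, 0.5], [1.0, -1.0]]
class_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]

p_spatial = branch_probs(spatial, [0.3, -0.2], class_weights)
p_spectral = branch_probs(spectral, [0.1, 0.4], class_weights)
final = fuse(p_spatial, p_spectral)
predicted_class = max(range(len(final)), key=final.__getitem__)
```

Averaging probabilities is only one possible way to combine the two results; the paper may use a different rule, such as concatenating features before a shared classifier.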
