Article

Accent classification from an emotional speech in clean and noisy environments

Journal

MULTIMEDIA TOOLS AND APPLICATIONS
Volume 82, Issue 3, Pages 3485-3508

Publisher

SPRINGER
DOI: 10.1007/s11042-022-13236-w

Keywords

Accent classification; Spectral features; Machine learning classifier; Emotion classification

This study aims to build effective accent recognition systems based on emotional speech. Statistical aggregation functions are applied to several frame-level features, and experiments are conducted on clean and noisy speech signals. Some features perform well on noisy data regardless of the training condition, while the robustness of others depends on whether noisy training data is available.
The performance of speech emotion recognition (SER) systems suffers when emotional speech is spoken in different accents. One possible solution is to identify the accent beforehand and use this knowledge in the SER task. The present work is a novel attempt to build effective accent recognition systems based on emotional speech. To this end, statistical aggregation functions (such as mean, standard deviation, and kurtosis) are applied to frame-level feature representations, namely perceptual linear prediction (PLP), log filterbank energies (LFBE), Mel frequency cepstral coefficients (MFCC), spectral subband centroids (SSC), constant-Q cepstral coefficients (CQCC), chroma vectors, and Mel frequency discrete wavelet coefficients (MFDWC), to obtain utterance-level features from CREMA-D, an emotional speech dataset. The performance of these features with different standard classifiers is evaluated in experiments on clean and noisy speech signals. The results show that the SSC features perform well on noisy data only when the classifier is trained on noisy data. In contrast, the combined MFDWC features perform well on noisy data with both clean and noisy training data, which hints at the noise robustness of this feature set; SSC, by comparison, can only be called conditionally robust. We hope this work will initiate a new line of research in emotion recognition.
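
The aggregation step described in the abstract, which turns a variable-length sequence of frame-level features into a fixed-length utterance-level vector, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The function names `kurtosis` and `aggregate_utterance` are hypothetical, only three of the statistics named in the abstract are used, and the choice of population standard deviation and excess (Fisher) kurtosis is an assumption. The sketch uses only the Python standard library; a real pipeline would more likely use an array library.

```python
from statistics import fmean, pstdev


def kurtosis(values):
    """Excess (Fisher) kurtosis from population moments (an assumed convention)."""
    mu = fmean(values)
    m2 = fmean((v - mu) ** 2 for v in values)
    if m2 == 0.0:
        # A constant coefficient track has no spread; return 0.0 by convention
        # here to avoid dividing by zero.
        return 0.0
    m4 = fmean((v - mu) ** 4 for v in values)
    return m4 / (m2 * m2) - 3.0


def aggregate_utterance(frames):
    """Collapse frame-level features into one utterance-level vector.

    frames: list of frames, each a list of D coefficients (e.g. one MFCC
    vector per frame). Returns a flat list of length 3 * D laid out as
    [means..., standard deviations..., kurtoses...].
    """
    if not frames:
        raise ValueError("at least one frame is required")
    # Transpose so that each track holds one coefficient across all frames.
    tracks = list(zip(*frames))
    means = [fmean(t) for t in tracks]
    stds = [pstdev(t) for t in tracks]
    kurts = [kurtosis(t) for t in tracks]
    return means + stds + kurts


# Toy utterance: 4 frames with 2 coefficients each (made-up numbers).
toy_frames = [[1.0, 0.5], [3.0, 0.5], [1.0, 0.5], [3.0, 0.5]]
utterance_vector = aggregate_utterance(toy_frames)
print(len(utterance_vector))
```

Because every utterance maps to a vector of the same length regardless of its duration, the result can be passed directly to standard classifiers of the kind the abstract mentions.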
