4.2 Article

Efficiency of chosen speech descriptors in relation to emotion recognition

Publisher

SpringerOpen
DOI: 10.1186/s13636-017-0100-x

Keywords

Voice; Emotion recognition; Perceptual coefficients; Speech signal analysis

Funding

  1. Estonian Research Grant [PUT638]
  2. Estonia-Poland (Est-Pol) Research Collaboration Project (MAJoRA: Multimodal Anger and Joy Recognition by Audiovisual Information)
  3. Estonian Centre of Excellence in IT (EXCITE) - European Regional Development Fund


This paper presents a parametrization of emotional speech using a pool of features commonly used in emotion recognition, such as fundamental frequency, formants, energy, and MFCC, PLP, and LPC coefficients. The pool is extended with perceptual coefficients such as BFCC, HFCC, RPLP, and RASTA PLP, which are used in speech recognition but have not been applied to emotion detection. The main contribution of this work is a comparison of emotion detection accuracy for each feature type, based on results from both k-NN and SVM classifiers with 10-fold cross-validation. The analysis was performed on two Polish emotional speech databases: voice performances by professional actors, compared with the author's spontaneous speech.
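The evaluation scheme named in the abstract, per-feature-type accuracy under 10-fold cross-validation, can be illustrated with a minimal sketch. This is not the authors' code: it covers only the k-NN half of the comparison (SVM is omitted to keep the example dependency-free), and the feature sets, class labels, and every function name below are hypothetical stand-ins built from synthetic Gaussian data rather than real speech descriptors.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Label x by majority vote among its k nearest training vectors."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def cross_val_accuracy(X, y, n_folds=10, k=3, seed=0):
    """Mean k-NN accuracy over n_folds folds of a shuffled data set."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        correct = sum(
            knn_predict(train_X, train_y, X[i], k) == y[i] for i in fold
        )
        scores.append(correct / len(fold))
    return sum(scores) / len(scores)


def make_synthetic(n_per_class, dim, separation, seed):
    """Hypothetical stand-in for per-utterance feature vectors (assumption:
    two emotion classes, Gaussian features, class means `separation` apart)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in (("anger", 0.0), ("joy", separation)):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 1.0) for _ in range(dim)])
            y.append(label)
    return X, y


# Two made-up "feature types": one separates the classes, one does not.
feature_sets = {
    "well_separated": make_synthetic(50, 4, 6.0, seed=1),
    "overlapping": make_synthetic(50, 4, 0.0, seed=2),
}
results = {
    name: cross_val_accuracy(X, y) for name, (X, y) in feature_sets.items()
}
for name, acc in results.items():
    print(f"{name}: {acc:.2f}")
```

Running the same cross-validation routine over each feature set yields one accuracy figure per feature type, which is the kind of side-by-side comparison the abstract describes. A real replication would substitute extracted descriptors (MFCC, PLP, and so on) for the synthetic vectors and add an SVM classifier alongside k-NN.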
