Article

Speech emotion recognition using Ramanujan Fourier Transform

Journal

APPLIED ACOUSTICS
Volume 201

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.apacoust.2022.109133

Keywords

Ramanujan Fourier Transform; SVM; KNN; Discriminant analysis; Machine learning

A novel technique using Ramanujan Fourier Transform (RFT) for Speech Emotion Recognition (SER) analysis is proposed. The RFT is applied after numerically encoding the speech emotion data, and it projects the obtained numerical series onto a collection of fundamental functions consisting of Ramanujan sums (RS). Multiple SER databases are considered for accuracy testing. The research shows that the RFT feature-based speech emotion classification using multiclass SVM classifier outperforms other classifiers in terms of accuracy. The results pave the way for real-world applications of speech emotion analysis.
A novel technique is presented for Speech Emotion Recognition (SER) using the Ramanujan Fourier Transform (RFT). The method numerically encodes the speech emotion data before applying the RFT, which projects the resulting numerical series onto a collection of fundamental functions made up of Ramanujan sums (RS). Seven SER databases are used to test accuracy: Berlin, eNTERFACE'05, RAVDESS, SAVEE, EMOVO, EmoFilm, and Urdu. This work proposes speech emotion classification based on RFT features. The speech emotion samples were analyzed with the RFT, and statistical features were extracted and fed to machine learning classifiers. Multiclass SVM classification proved more proficient than the KNN and Linear Discriminant Analysis classifiers. The algorithms were evaluated on the seven databases, and the results show that multiclass SVM outperforms the other classifiers in terms of accuracy. Used as a stand-alone feature with the multiclass SVM classifier, the RFT recognizes speech emotion with an accuracy of 83.08% for Berlin, 82.67% for eNTERFACE'05, 81.79% for EmoFilm, 82.98% for RAVDESS, 82.99% for EMOVO, 84% for Urdu, and 83.75% for SAVEE. These results pave the way for researchers to apply speech emotion analysis in real-world applications. (c) 2022 Elsevier Ltd. All rights reserved.
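To make the projection onto Ramanujan sums concrete, here is a minimal Python sketch of the front end the abstract describes. It is not the authors' implementation. The abstract does not give the numerical encoding, the normalization, the range of q, or the exact statistical features, so the sketch assumes a commonly used finite-length form, X(q) = (1 / (phi(q) N)) * sum over n = 1..N of x(n) c_q(n), and the helper names (`ramanujan_sum`, `totient`, `rft`, `statistical_features`) are hypothetical. The classifier stage (multiclass SVM, KNN, LDA) is left out.

```python
import math
import statistics


def ramanujan_sum(q, n):
    """Ramanujan sum c_q(n): sum of cos(2*pi*k*n/q) over 1 <= k <= q, gcd(k, q) == 1."""
    total = sum(
        math.cos(2 * math.pi * k * n / q)
        for k in range(1, q + 1)
        if math.gcd(k, q) == 1
    )
    # Ramanujan sums are integer-valued; rounding removes floating-point noise.
    return round(total)


def totient(q):
    """Euler's totient phi(q), counted directly (fine for small q)."""
    return sum(1 for k in range(1, q + 1) if math.gcd(k, q) == 1)


def rft(x, max_q):
    """Assumed finite-length RFT: X(q) = sum_n x(n) c_q(n) / (phi(q) * N), q = 1..max_q."""
    n_samples = len(x)
    coeffs = []
    for q in range(1, max_q + 1):
        # c_q(n) is periodic in n with period q, so one period is enough.
        period = [ramanujan_sum(q, n) for n in range(q)]
        # Samples are indexed n = 1..N, hence x[n - 1].
        acc = sum(x[n - 1] * period[n % q] for n in range(1, n_samples + 1))
        coeffs.append(acc / (totient(q) * n_samples))
    return coeffs


def statistical_features(coeffs):
    """Illustrative summary statistics; the paper's feature set is not specified here."""
    return {
        "mean": statistics.fmean(coeffs),
        "std": statistics.pstdev(coeffs),
        "min": min(coeffs),
        "max": max(coeffs),
    }


if __name__ == "__main__":
    # Toy signal with period 2, standing in for a numerically encoded utterance.
    signal = [-1, 1] * 6
    spectrum = rft(signal, max_q=4)
    print(spectrum)
    print(statistical_features(spectrum))
```

In a full pipeline, one such feature vector per utterance would be passed to the classifiers being compared. The toy signal only illustrates that a period-2 sequence concentrates its energy in the q = 2 coefficient.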
