Article

Speech emotion recognition using Ramanujan Fourier Transform

APPLIED ACOUSTICS (2022)

Journal

APPLIED ACOUSTICS

Volume 201

Publisher

ELSEVIER SCI LTD

DOI: 10.1016/j.apacoust.2022.109133

Keywords

Ramanujan Fourier Transform; SVM; KNN; Discriminant analysis; Machine learning

Automated Summary

A novel technique using Ramanujan Fourier Transform (RFT) for Speech Emotion Recognition (SER) analysis is proposed. The RFT is applied after numerically encoding the speech emotion data, and it projects the obtained numerical series onto a collection of fundamental functions consisting of Ramanujan sums (RS). Multiple SER databases are considered for accuracy testing. The research shows that the RFT feature-based speech emotion classification using multiclass SVM classifier outperforms other classifiers in terms of accuracy. The results pave the way for real-world applications of speech emotion analysis.

Abstract

A novel technique is presented for Speech Emotion Recognition (SER) using the Ramanujan Fourier Transform (RFT). The method numerically encodes the speech emotion data before applying the RFT, which projects the resulting numerical series onto a collection of basis functions made up of Ramanujan sums (RS). Seven SER databases (Berlin, eNTERFACE'05, RAVDESS, SAVEE, EMOVO, EmoFilm, and Urdu) are used to test accuracy. This work proposes speech emotion classification based on RFT features: the speech emotion samples are analyzed with the RFT, statistical features are extracted from the result, and these features are fed to machine learning classifiers. Multiclass SVM-based classification was found to be more proficient than the KNN and Linear Discriminant Analysis classifiers. The algorithms are evaluated on the seven databases, and the results show that multiclass SVM outperforms the other classifiers in terms of accuracy. Using the multiclass SVM classifier, the RFT as a stand-alone feature recognizes speech emotion with an accuracy of 83.08% for Berlin, 82.67% for eNTERFACE'05, 81.79% for EmoFilm, 82.98% for RAVDESS, 82.99% for EMOVO, 84% for Urdu, and 83.75% for SAVEE. The outcome of this work paves the way for researchers to apply speech emotion analysis in real-world settings. (c) 2022 Elsevier Ltd. All rights reserved.
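The projection onto Ramanujan sums described in the abstract can be illustrated with a small sketch. The abstract gives neither the numerical encoding nor the exact transform definition or feature set, so everything below is an assumption: the code uses the standard Ramanujan sum c_q(n) (the sum of cos(2πkn/q) over 1 ≤ k ≤ q with gcd(k, q) = 1), a commonly used finite-length form of the RFT, X(q) = (1 / (φ(q) N)) Σ_{n=1..N} x(n) c_q(n), and a hypothetical choice of four summary statistics. The function names are illustrative and are not taken from the paper.

```python
import math
import statistics


def ramanujan_sum(q: int, n: int) -> int:
    """c_q(n): sum of cos(2*pi*k*n/q) over 1 <= k <= q with gcd(k, q) == 1."""
    total = sum(
        math.cos(2 * math.pi * k * n / q)
        for k in range(1, q + 1)
        if math.gcd(k, q) == 1
    )
    return round(total)  # Ramanujan sums are always integers


def euler_phi(q: int) -> int:
    """Euler's totient: how many k in 1..q are coprime to q."""
    return sum(1 for k in range(1, q + 1) if math.gcd(k, q) == 1)


def rft(x, max_q):
    """Finite-length RFT coefficients X(1)..X(max_q) of the sequence x.

    Assumed form: X(q) = (1 / (phi(q) * N)) * sum_{n=1..N} x(n) * c_q(n).
    """
    n_samples = len(x)
    coeffs = []
    for q in range(1, max_q + 1):
        # c_q(n) is periodic in n with period q, so one period is enough.
        period = [ramanujan_sum(q, n) for n in range(1, q + 1)]
        acc = sum(x[n - 1] * period[(n - 1) % q] for n in range(1, n_samples + 1))
        coeffs.append(acc / (euler_phi(q) * n_samples))
    return coeffs


def statistical_features(coeffs):
    """Hypothetical feature set; the abstract does not list the statistics used."""
    return {
        "mean": statistics.fmean(coeffs),
        "std": statistics.pstdev(coeffs),
        "min": min(coeffs),
        "max": max(coeffs),
    }


# Toy input: a sequence that is itself the Ramanujan sum c_3(n), n = 1..12.
signal = [float(ramanujan_sum(3, n)) for n in range(1, 13)]
coefficients = rft(signal, max_q=4)
features = statistical_features(coefficients)
```

For this toy input the energy concentrates in the q = 3 coefficient, which is the periodicity-detecting behaviour that makes Ramanujan sums usable as a basis. In a pipeline like the one the abstract outlines, a feature vector of this kind would be computed per utterance and passed to a multiclass classifier such as an SVM; that step is omitted here because it depends on details the abstract does not give.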
