Article

Speech emotion recognition using Ramanujan Fourier Transform

Journal

APPLIED ACOUSTICS
Volume 201, Issue -, Pages -

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.apacoust.2022.109133

Keywords

Ramanujan Fourier Transform; SVM; KNN; Discriminant analysis; Machine learning

A novel technique using Ramanujan Fourier Transform (RFT) for Speech Emotion Recognition (SER) analysis is proposed. The RFT is applied after numerically encoding the speech emotion data, and it projects the obtained numerical series onto a collection of fundamental functions consisting of Ramanujan sums (RS). Multiple SER databases are considered for accuracy testing. The research shows that the RFT feature-based speech emotion classification using multiclass SVM classifier outperforms other classifiers in terms of accuracy. The results pave the way for real-world applications of speech emotion analysis.
A novel technique is presented for Speech Emotion Recognition (SER) analysis using the Ramanujan Fourier Transform (RFT). The method numerically encodes the speech emotion data before applying the RFT. The RFT projects the resulting numerical series onto a collection of fundamental functions made up of Ramanujan sums (RS). Using the RS components, seven SER databases (Berlin, eNTERFACE'05, RAVDESS, SAVEE, EMOVO, EmoFilm, and Urdu) are considered for testing accuracy. This work proposes RFT feature-based speech emotion classification. The speech emotion samples were analyzed with the RFT, and statistical features were extracted and fed to machine learning classifiers. Multiclass SVM-based speech emotion classification was found to be more proficient than the KNN and Linear Discriminant Analysis classifiers. The algorithms are evaluated on the seven databases, and the results reveal that the multiclass SVM outperforms the other classifiers in terms of accuracy. As a stand-alone feature, the RFT recognizes speech emotion with an accuracy of 83.08% for Berlin, 82.67% for eNTERFACE'05, 81.79% for EmoFilm, 82.98% for RAVDESS, 82.99% for EMOVO, 84% for Urdu, and 83.75% for SAVEE using the multiclass SVM classifier. The outcome of this work paves the way for researchers in speech emotion analysis to pursue real-world applications. (c) 2022 Elsevier Ltd. All rights reserved.
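
The projection onto Ramanujan sums described in the abstract can be illustrated with a small sketch. It assumes one commonly used finite-length form of the transform, X(q) = (1 / (phi(q) N)) * sum over n = 1..N of x(n) c_q(n), where c_q(n) is the Ramanujan sum and phi is Euler's totient. The abstract does not state the paper's exact normalization, its numerical encoding of speech, or which statistics it extracts, so the function names, the toy input, and the four summary statistics below are all hypothetical choices made for illustration only.

```python
import math
from statistics import mean, pstdev


def ramanujan_sum(q: int, n: int) -> int:
    """Ramanujan sum c_q(n): sum of cos(2*pi*k*n/q) over 1 <= k <= q with gcd(k, q) == 1."""
    total = sum(
        math.cos(2 * math.pi * k * n / q)
        for k in range(1, q + 1)
        if math.gcd(k, q) == 1
    )
    return round(total)  # c_q(n) is always an integer; rounding removes float error


def totient(q: int) -> int:
    """Euler's totient phi(q): how many k in 1..q are coprime to q."""
    return sum(1 for k in range(1, q + 1) if math.gcd(k, q) == 1)


def rft_coefficients(x, max_q):
    """Project the sequence x(1..N) onto c_1 .. c_max_q (assumed finite-length form)."""
    n_samples = len(x)
    coeffs = []
    for q in range(1, max_q + 1):
        projection = sum(x[n - 1] * ramanujan_sum(q, n) for n in range(1, n_samples + 1))
        coeffs.append(projection / (totient(q) * n_samples))
    return coeffs


def summary_features(coeffs):
    """Hypothetical statistical features: mean, std, min, max of coefficient magnitudes."""
    magnitudes = [abs(c) for c in coeffs]
    return [mean(magnitudes), pstdev(magnitudes), min(magnitudes), max(magnitudes)]


# Toy input (not speech data): the sequence c_3(n) itself, for n = 1..12.
signal = [ramanujan_sum(3, n) for n in range(1, 13)]
coeffs = rft_coefficients(signal, max_q=3)
features = summary_features(coeffs)
```

Because Ramanujan sums with different q are orthogonal over a common period, the toy sequence gives a coefficient of 1 at q = 3 and 0 at q = 1 and q = 2. In a pipeline like the one the abstract outlines, a feature vector of this kind would be computed per utterance and passed to a classifier such as a multiclass SVM.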
