Article

Tri-integrated convolutional neural network for audio image classification using Mel-frequency spectrograms

Journal

MULTIMEDIA TOOLS AND APPLICATIONS
Volume 82, Issue 4, Pages 5521-5546

Publisher

SPRINGER
DOI: 10.1007/s11042-022-13358-1

Keywords

Transfer learning; VGG16; VGG19; TiCNN; Data augmentation

This paper proposes TiCNN, an integrated methodology for emotion classification based on Mel-frequency spectrograms. Trained on the RAVDESS dataset and validated on the EMO-DB dataset, the method reaches accuracies of 93.27% and 92.78%, respectively.
Emotion is a state that encompasses a variety of physiological phenomena. Emotion classification has applications in fields such as customer review analysis, product evaluation, and national security, which makes it a prominent area of research. State-of-the-art methodologies have classified emotions from either text or audio files; the proposed work instead uses Mel-frequency spectrograms. An integrated methodology, TiCNN (Tri-integrated Convolutional Neural Network), is proposed for classifying emotions into eight classes. Three models, namely VGG16, VGG19, and a proposed CNN architecture, are integrated and trained on the RAVDESS (Ryerson Audio-Visual Database of Emotional Speech and Song) dataset. The integrated TiCNN approach achieves an accuracy of 93.27% on the eight-class task, with precision, recall, and F1-score of 0.93, 0.92, and 0.92, respectively. For model validation, the methodology is further evaluated on the EMO-DB (Berlin Database of Emotional Speech) dataset, where the TiCNN model achieves an accuracy of 92.78%. In an empirical comparison with conventional transfer learning models and state-of-the-art methodologies, the proposed methodology outperforms them.
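
The abstract does not say how the three models are integrated or how the reported metrics are averaged. As a rough illustration only, the sketch below assumes soft voting (averaging the per-class probabilities of the three models) and macro-averaged precision, recall, and F1-score. The function names `soft_vote` and `macro_scores`, the three-class toy probabilities, and the labels are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

# Type alias: probs[i][c] is the probability assigned to class c for sample i.
Probs = Sequence[Sequence[float]]


def soft_vote(prob_sets: Sequence[Probs]) -> List[int]:
    """Average class probabilities across models and pick the argmax per sample.

    ASSUMPTION: soft voting stands in for the paper's integration step,
    which the abstract does not describe.
    """
    n_models = len(prob_sets)
    n_samples = len(prob_sets[0])
    predictions = []
    for i in range(n_samples):
        n_classes = len(prob_sets[0][i])
        mean = [sum(p[i][c] for p in prob_sets) / n_models for c in range(n_classes)]
        predictions.append(max(range(n_classes), key=mean.__getitem__))
    return predictions


def macro_scores(
    y_true: Sequence[int], y_pred: Sequence[int], n_classes: int
) -> Tuple[float, float, float, float]:
    """Return (accuracy, macro precision, macro recall, macro F1).

    ASSUMPTION: macro averaging; the abstract does not state the averaging used.
    """
    precisions, recalls, f1s = [], [], []
    pairs = list(zip(y_true, y_pred))
    for c in range(n_classes):
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return (
        accuracy,
        sum(precisions) / n_classes,
        sum(recalls) / n_classes,
        sum(f1s) / n_classes,
    )


# Hypothetical softmax outputs for four spectrogram images and three classes.
vgg16_probs = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.1, 0.2, 0.7], [0.5, 0.4, 0.1]]
vgg19_probs = [[0.5, 0.4, 0.1], [0.4, 0.4, 0.2], [0.2, 0.3, 0.5], [0.3, 0.6, 0.1]]
custom_cnn_probs = [[0.7, 0.2, 0.1], [0.1, 0.7, 0.2], [0.3, 0.3, 0.4], [0.2, 0.5, 0.3]]
true_labels = [0, 1, 2, 0]

ensemble_pred = soft_vote([vgg16_probs, vgg19_probs, custom_cnn_probs])
accuracy, precision, recall, f1 = macro_scores(true_labels, ensemble_pred, n_classes=3)
```

In a real pipeline the three probability tables would come from the trained VGG16, VGG19, and custom CNN models applied to the same spectrogram images, and the class count would be eight for RAVDESS.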
