Proceedings Paper

Time-Frequency Representations: Spectrogram, Cochleogram and Correlogram

Publisher

ELSEVIER SCIENCE BV
DOI: 10.1016/j.procs.2020.03.209

Keywords

Acoustic; Cochlea; Spectrogram; Time-Frequency (T-F); Convolutional neural network (CNN)

In recent years, advances in computer vision driven by deep-learning-based convolutional neural networks (CNNs) have attracted considerable research interest. One such line of research uses multi-resolution time-frequency (T-F) representations of acoustic signals (speech, music, or other sound) as input to a CNN, whose convolutional layers process an image of the speech or sound (a spectrogram or any other T-F representation). Besides the conventional spectrogram, many other multi-resolution T-F representations exist, of which the cochleogram and the correlogram are the prime representatives. The main issue to emerge from this wide body of research is how to select the proper representation for better processing and increased accuracy. Because a T-F representation captures many rich features of the speech or sound content, deciding which representation is more useful than another, and which is perceptually or psychophysically relevant to the auditory system, is a very challenging task. The recent research attention this task has attracted motivated the writing of this article. The aim of this article is to give a chronological, systematic, and critical review of the existing literature on the spectrogram, cochleogram, and correlogram and their consequences for speech/sound analysis with acoustic/auditory models, and to discuss the significance of our findings for their interrelation. (C) 2020 The Authors. Published by Elsevier B.V.
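
As a rough illustration of the simplest of the three representations named in the abstract, the sketch below computes a log-magnitude spectrogram with a short-time Fourier transform and returns it as a 2-D array that could be treated as a single-channel image for a CNN. It is not taken from the paper: the function name `spectrogram`, the frame length, the hop size, and the test tone are all assumptions chosen for illustration, and the cochleogram and correlogram (which rely on auditory filterbanks and autocorrelation) are not covered here.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Log-power spectrogram via a short-time Fourier transform.

    frame_len and hop are illustrative defaults, not values from the paper.
    Returns an array of shape (frequency bins, time frames).
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and apply the window to each.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One-sided FFT of each frame, then power in decibels.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return 10.0 * np.log10(power + 1e-10).T

# Hypothetical test input: one second of a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = spectrogram(tone)
peak_bin = int(np.argmax(spec[:, 0]))
peak_hz = peak_bin * sample_rate / 256  # bin spacing = sample_rate / frame_len
```

With these assumed settings each frequency bin spans 31.25 Hz, so the 1 kHz tone falls on a single bin; shorter frames would trade that frequency resolution for finer time resolution, which is the trade-off that multi-resolution T-F representations aim to ease.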
