Article

Analysis of 2D Feature Spaces for Deep Learning-Based Speech Recognition

JOURNAL OF THE AUDIO ENGINEERING SOCIETY (2018)

Journal

JOURNAL OF THE AUDIO ENGINEERING SOCIETY

Volume 66, Issue 12, Pages 1072-1081

Publisher

AUDIO ENGINEERING SOC

DOI: 10.17743/jaes.2018.0066

Funding

Polish National Science Centre [2015/17/B/ST6/01874]

Abstract

The aim of the presented study was to evaluate the suitability of 2D audio signal feature maps for speech recognition based on deep learning. The proposed methodology employs a convolutional neural network (CNN), which is a class of deep, feed-forward artificial neural network. We decided to analyze audio signal feature maps, namely spectrograms, linear and mel-scale cepstrograms, and chromagrams. This choice was motivated by the fact that CNNs perform well in 2D data-oriented processing contexts. The feature maps were employed in a Lithuanian word recognition task. Spectral analysis led to the highest word recognition rate. Spectral and mel-scale cepstral feature spaces outperform linear cepstra and chroma. The 111-word classification experiment yields F1 scores of 0.99 for the spectrum, 0.91 for the mel-scale cepstrum, 0.76 for the chromagram, and 0.64 for the cepstrum feature space on the test data set.
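To make the notion of a 2D feature map concrete, the following is a minimal sketch of how two of the representations named in the abstract, a log-magnitude spectrogram and a linear cepstrogram, might be computed with NumPy. It is not the authors' pipeline: the sample rate, frame length, hop size, number of cepstral coefficients, and the synthetic test tone are all assumptions chosen for illustration, and the mel-scale cepstrogram and chromagram are omitted.

```python
import numpy as np

def frame_signal(x, frame_len=512, hop=256):
    """Split a 1-D signal into overlapping, Hann-windowed frames."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx] * np.hanning(frame_len)

def spectrogram(x, frame_len=512, hop=256):
    """Log-magnitude spectrogram, shape (frequency bins, frames)."""
    frames = frame_signal(x, frame_len, hop)
    mag = np.abs(np.fft.rfft(frames, axis=1))
    return np.log(mag + 1e-10).T  # small offset avoids log(0)

def cepstrogram(x, frame_len=512, hop=256, n_coeffs=40):
    """Linear real cepstrogram: inverse FFT of the log-magnitude spectrum."""
    log_mag = spectrogram(x, frame_len, hop).T
    ceps = np.fft.irfft(log_mag, n=frame_len, axis=1)
    return ceps[:, :n_coeffs].T  # keep the low-quefrency coefficients

# Assumed input: a synthetic 1-second, 16 kHz tone standing in for a spoken word.
sr = 16000
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 440.0 * t)

spec = spectrogram(signal)   # 2D map: frequency x time
ceps = cepstrogram(signal)   # 2D map: quefrency x time
print(spec.shape, ceps.shape)
```

Each function returns a 2D array with time along one axis, which is the kind of image-like input a CNN can take directly.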
