Robustness of spectro-temporal features against intrinsic and extrinsic variations in automatic speech recognition

SPEECH COMMUNICATION (2011)

Journal

SPEECH COMMUNICATION

Volume 53, Issue 5, Pages 753-767

Publisher

ELSEVIER

DOI: 10.1016/j.specom.2010.07.002

Keywords

Spectro-temporal feature extraction; Automatic speech recognition; Robustness; Intrinsic variability

Funding

DFG [SFB/TRR 31]

Abstract

The effect of bio-inspired spectro-temporal processing for automatic speech recognition (ASR) is analyzed for two different tasks, with a focus on the robustness of spectro-temporal Gabor features in comparison to mel-frequency cepstral coefficients (MFCCs). Experiments targeting extrinsic factors such as additive noise and changes of the transmission channel were carried out on a digit classification task (AURORA 2), for which spectro-temporal features were found to be more robust than the MFCC baseline against a wide range of noise sources. Intrinsic variations, i.e., changes in speaking rate, speaking effort and pitch, were analyzed on a phoneme recognition task with matched training and test conditions. The sensitivity of Gabor and MFCC features to various speaking styles was found to differ systematically. An analysis based on phoneme confusions for both feature types suggests that spectro-temporal and purely spectral features carry complementary information. The usefulness of the combined information was demonstrated in a system that combines both types of features, which yields a decrease in word-error rate of 16% compared to the best single-stream recognizer and 47% compared to an MFCC baseline. © 2010 Elsevier B.V. All rights reserved.
