Subspace Gaussian mixture based language modeling for large vocabulary continuous speech recognition

SPEECH COMMUNICATION (2020)

Journal

SPEECH COMMUNICATION

Volume 117, Issue -, Pages 21-27

Publisher

ELSEVIER

DOI: 10.1016/j.specom.2020.01.001

Keywords

Language modeling; Speech recognition; Recurrent neural network; Subspace Gaussian mixture model

Abstract

This paper presents an adaptable continuous-space language modeling approach that combines the longer-context information of a recurrent neural network (RNN) with the adaptation ability of the subspace Gaussian mixture model (SGMM), which has been widely used in acoustic modeling for automatic speech recognition (ASR). In large vocabulary continuous speech recognition (LVCSR), it is challenging to construct language models that capture the longer context information of words while also ensuring generalization and adaptation ability. Recently, language models based on RNNs and their variants have been broadly studied in this field. The goal of our approach is to obtain history feature vectors of a word that carry longer context information and to model every word with a subspace Gaussian mixture model, as in the Tandem systems used in acoustic modeling for ASR. A further goal is to apply the fMLLR adaptation method, which is widely used in SGMM-based acoustic modeling, to adapt the subspace Gaussian mixture based language model (SGMLM). After fMLLR adaptation, the Top-Down and Bottom-Up SGMLMs obtain WERs of 5.70% and 6.01%, which are better than the 4.15% and 4.61% of those without adaptation, respectively. Also, with fMLLR adaptation, the Top-Down and Bottom-Up SGMLMs yield absolute word error rate reductions of 1.48% and 1.02% and relative perplexity reductions of 10.02% and 6.46%, respectively, compared to an RNNLM without adaptation.
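The abstract reports WER gains as absolute reductions and perplexity gains as relative reductions, and the two are computed differently. The following is a minimal sketch of both calculations. The function names and the input numbers are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def absolute_reduction(baseline: float, system: float) -> float:
    """Absolute reduction: baseline minus system, in the metric's own units
    (percentage points when the metric is a WER given in percent)."""
    return baseline - system


def relative_reduction(baseline: float, system: float) -> float:
    """Relative reduction: the drop expressed as a percentage of the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - system) / baseline


# Hypothetical values for illustration only; these are NOT results from the paper.
baseline_wer, system_wer = 10.0, 8.5      # word error rates in percent
baseline_ppl, system_ppl = 200.0, 180.0   # perplexities

wer_gain = absolute_reduction(baseline_wer, system_wer)   # percentage points
ppl_gain = relative_reduction(baseline_ppl, system_ppl)   # percent of baseline
print(f"absolute WER reduction: {wer_gain:.2f} points")
print(f"relative perplexity reduction: {ppl_gain:.2f}%")
```

An absolute reduction depends only on the difference between the two scores, whereas a relative reduction also depends on the size of the baseline, so the two kinds of figures are not directly comparable.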
