Article

Improving Mandarin Tone Recognition Based on DNN by Combining Acoustic and Articulatory Features Using Extended Recognition Networks

Publisher

SPRINGER
DOI: 10.1007/s11265-018-1334-2

Keywords

Articulatory features; MFCC; Posterior probabilities; Deep neural network; Mandarin tone recognition; Tone-based extended recognition network

Funding

  1. Beijing Wutong Innovation Platform of Beijing Language and Culture University [16PT05]
  2. BLCU support project for young researchers program [16YCX163]
  3. China Scholarship Council

Abstract

In this paper, we investigate the effectiveness of articulatory information for Mandarin tone modeling and recognition in a deep neural network-hidden Markov model (DNN-HMM) framework. In conventional approaches, prosodic evidence (e.g., F0, duration, and energy) is used to build tone classifiers. Here we propose performance enhancement techniques in three areas: (i) adding articulatory features (AFs) and acoustic features, such as Mel-frequency cepstral coefficients (MFCCs), for tone modeling; (ii) adopting phone-dependent tone modeling; and (iii) using a tone-based extended recognition network (ERN) to reduce the tone search space. The first technique is feature-related: it explicitly employs the AFs as a form of tonal feature and is implemented through a multi-stage procedure. The second technique is model-related and directly extends to phone-dependent tone modeling, so that each modeling unit (e.g., a tonal phone) not only contains tone information but also integrates phone/articulatory information. The third technique is search-related and uses a phone-dependent, tone-based extended search network. A series of comprehensive experiments is conducted using different input feature sets. The results demonstrate that (i) tone recognition accuracy is boosted by incorporating articulatory information, and (ii) the ERN attains the lowest tone error rate of 7.17%, a 56% relative error reduction from the prosody-only baseline system's error rate of 16.36%.
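The relative error reduction quoted in the abstract can be checked from the two reported tone error rates. The snippet below is a minimal sketch of that arithmetic only; the function name `relative_error_reduction` and the variable names are illustrative assumptions and do not come from the paper.

```python
def relative_error_reduction(baseline_error: float, new_error: float) -> float:
    """Return the fraction of the baseline error removed by the new system."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error


# Tone error rates (in percent) as reported in the abstract.
baseline_ter = 16.36  # prosody-only baseline system
ern_ter = 7.17        # best system using the extended recognition network

rer = relative_error_reduction(baseline_ter, ern_ter)
print(f"Relative error reduction: {rer:.1%}")
```

The result rounds to the 56% figure stated in the abstract: (16.36 - 7.17) / 16.36 is approximately 0.56.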
