Article

Identification of Indian languages using multi-level spectral and prosodic features

Journal

INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY
Volume 16, Issue 4, Pages 489-511

Publisher

SPRINGER
DOI: 10.1007/s10772-013-9198-0

Keywords

Language recognition; Language identification; Intonation; Rhythm; Stress; Prosodic features; Gaussian mixture models

This paper explores spectral and prosodic features extracted at different levels for analyzing the language-specific information present in speech. Spectral features are extracted from 20 ms frames (block processing), individual pitch cycles (pitch-synchronous analysis), and glottal closure regions, and are used to discriminate between languages. In addition to spectral features, prosodic features extracted at the syllable, tri-syllable, and multi-word (phrase) levels are proposed for capturing language-specific information. Language-specific prosody is represented by intonation, rhythm, and stress features at the syllable and tri-syllable (word) levels, whereas temporal variations in fundamental frequency (F0 contour), syllable durations, and temporal variations in intensity (energy contour) represent prosody at the multi-word (phrase) level. The proposed features are analyzed on an Indian language speech database (IITKGP-MLILSC). Gaussian mixture models are used to capture the language-specific information from the proposed features. The evaluation results indicate that combining the features improves language identification performance. The performance of the proposed features is also analyzed on the standard Oregon Graduate Institute Multi-Language Telephone-based Speech (OGI-MLTS) database.
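
As a rough illustration of the classification step described in the abstract, the sketch below scores an utterance against one Gaussian mixture model per language and per feature stream, then picks the language with the highest combined score. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say how the feature streams are combined, so the weighted sum of average log-likelihoods (weight `alpha`) is an assumption, and the function names, the toy 2-D "spectral" and 1-D "prosodic" vectors, and the hand-set model parameters in `MODELS` are all hypothetical. In practice the mixtures would be trained on features from the speech database rather than written by hand.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_avg_loglik(vectors, gmm):
    """Average log-likelihood per feature vector under a GMM.

    gmm is a list of (weight, mean, variance) components.
    """
    total = 0.0
    for x in vectors:
        comps = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        peak = max(comps)  # log-sum-exp for numerical stability
        total += peak + math.log(sum(math.exp(c - peak) for c in comps))
    return total / len(vectors)


def identify(spectral, prosodic, models, alpha=0.5):
    """Return (best_language, scores) using an assumed weighted score fusion."""
    scores = {}
    for lang, (spec_gmm, pros_gmm) in models.items():
        scores[lang] = (
            alpha * gmm_avg_loglik(spectral, spec_gmm)
            + (1.0 - alpha) * gmm_avg_loglik(prosodic, pros_gmm)
        )
    return max(scores, key=scores.get), scores


# Hypothetical hand-set models: language -> (spectral GMM, prosodic GMM).
MODELS = {
    "lang_a": (
        [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
        [(1.0, [0.0], [1.0])],
    ),
    "lang_b": (
        [(0.5, [4.0, 4.0], [1.0, 1.0]), (0.5, [5.0, 5.0], [1.0, 1.0])],
        [(1.0, [3.0], [1.0])],
    ),
}

if __name__ == "__main__":
    best, scores = identify([[0.1, 0.2], [0.9, 1.1]], [[0.2]], MODELS)
    print(best, scores)
```

Setting `alpha` to 1.0 or 0.0 scores with the spectral or prosodic stream alone, which gives a simple way to compare single-stream decisions against the combined one on held-out data.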
