4.5 Article

Speech emotion recognition using hidden Markov models

期刊

SPEECH COMMUNICATION
卷 41, 期 4, 页码 603-623

出版社

ELSEVIER SCIENCE BV
DOI: 10.1016/S0167-6393(03)00099-2

关键词

recognition of emotion; emotional speech; log frequency power coefficients; hidden Markov model; human communication

向作者/读者索取更多资源

In emotion classification of speech signals, the popular features employed are statistics of fundamental frequency, energy contour, duration of silence and voice quality. However, the performance of systems employing these features degrades substantially when more than two categories of emotion are to be classified. In this paper, a text independent method of emotion classification of speech is proposed. The proposed method makes use of short time log frequency power coefficients (LFPC) to represent the speech signals and a discrete hidden Markov model (HMM) as the classifier. The emotions are classified into six categories. The category labels used are, the archetypal emotions of Anger, Disgust, Fear, Joy, Sadness and Surprise. A database consisting of 60 emotional utterances, each from twelve speakers is constructed and used to train and test the proposed system. Performance of the LFPC feature parameters is compared with that of the linear prediction Cepstral coefficients (LPCC) and mel-frequency Cepstral coefficients (MFCC) feature parameters commonly used in speech recognition systems. Results show that the proposed system yields an average accuracy of 78% and the best accuracy of 96% in the classification of six emotions. This is beyond the 17% chances by a random hit for a sample set of 6 categories. Results also reveal that LFPC is a better choice as feature parameters for emotion classification than the traditional feature parameters. (C) 2003 Elsevier B.V. All rights reserved.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.5
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据