期刊
COMPUTER SPEECH AND LANGUAGE
卷 27, 期 3, 页码 874-894出版社
ACADEMIC PRESS LTD- ELSEVIER SCIENCE LTD
DOI: 10.1016/j.csl.2012.07.002
关键词
Noisy data; Training; Uncertainty; Classification; Acoustic model; Gaussian mixture model; Hidden Markov model; Expectation-maximization
资金
- OSEO
- French State agency for innovation under Quaero program
We consider the problem of acoustic modeling of noisy speech data, where the uncertainty over the data is given by a Gaussian distribution. While this uncertainty has been exploited at the decoding stage via uncertainty decoding, its usage at the training stage remains limited to static model adaptation. We introduce a new expectation maximization (EM) based technique, which we call uncertainty training, that allows us to train Gaussian mixture models (GMMs) or hidden Markov models (HMMs) directly from noisy data with dynamic uncertainty. We evaluate the potential of this technique for a GMM-based speaker recognition task on speech data corrupted by real-world domestic background noise, using a state-of-the-art signal enhancement technique and various uncertainty estimation techniques as a front-end. Compared to conventional training, the proposed training algorithm results in 3-4% absolute improvement in speaker recognition accuracy by training from either matched, unmatched or multi-condition noisy data. This algorithm is also applicable with minor modifications to maximum a posteriori (MAP) or maximum likelihood linear regression (MLLR) acoustic model adaptation from noisy data and to other data than audio. (C) 2012 Elsevier Ltd. All rights reserved.
作者
我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。
推荐
暂无数据