Journal
IEEE SIGNAL PROCESSING LETTERS
Volume 25, Issue 11, Pages 1680-1684
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/LSP.2018.2871419
Keywords
Deep learning; loss function; speech enhancement; PESQ; DNN
Funding
- Spanish MINECO/FEDER Project [TEC2016-80141-P]
- Spanish Ministry of Education through the National Program FPU [FPU15/04161]
- NVIDIA Corporation
This letter proposes a perceptual metric for speech quality evaluation that is suitable as a loss function for training deep learning methods. The metric, derived from the perceptual evaluation of speech quality (PESQ) algorithm, is computed on a per-frame basis from the power spectra of the reference and processed speech signals. Two disturbance terms, which account for distortion once auditory masking and threshold effects are factored in, amend the mean square error (MSE) loss function by introducing perceptual criteria based on human psychoacoustics. The proposed loss function is evaluated for noisy speech enhancement with deep neural networks. Experimental results show that our metric achieves significant gains in speech quality (evaluated using an objective metric and a listening test) compared with MSE and with other perceptual loss functions from the literature.
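The overall shape of such a loss, an MSE term amended per frame by two weighted disturbance terms, can be sketched as follows. This is only an illustrative toy, not the letter's actual formulation: the function name `toy_perceptual_loss`, the weights `alpha` and `beta`, the `deadzone` threshold, and both disturbance formulas are assumptions made here for illustration. The real metric relies on PESQ-style auditory processing that the abstract does not detail.

```python
import math


def toy_perceptual_loss(ref_power, proc_power, alpha=0.1, beta=0.1, deadzone=0.25):
    """Toy per-frame loss: MSE plus two weighted disturbance terms.

    ref_power, proc_power: lists of frames, each frame a list of power-spectrum bins.
    alpha, beta, deadzone: hypothetical parameters, not values from the letter.
    """
    eps = 1e-10  # avoids log10(0)
    frame_losses = []
    for ref_frame, proc_frame in zip(ref_power, proc_power):
        # Log-power difference per frequency bin (processed minus reference).
        diffs = [
            math.log10(p + eps) - math.log10(r + eps)
            for r, p in zip(ref_frame, proc_frame)
        ]
        n = len(diffs)
        mse = sum(d * d for d in diffs) / n

        # Crude stand-in for a threshold effect: differences smaller than
        # `deadzone` are treated as inaudible and contribute nothing.
        audible = [math.copysign(max(abs(d) - deadzone, 0.0), d) for d in diffs]

        # Symmetric term: penalizes audible differences of either sign.
        symmetric = sum(abs(a) for a in audible) / n
        # Asymmetric term: penalizes only added energy (positive differences).
        asymmetric = sum(max(a, 0.0) for a in audible) / n

        frame_losses.append(mse + alpha * symmetric + beta * asymmetric)
    # Average the per-frame losses over the utterance.
    return sum(frame_losses) / len(frame_losses)


if __name__ == "__main__":
    reference = [[1.0, 1.0, 1.0, 1.0] for _ in range(3)]
    louder = [[10.0 * x for x in frame] for frame in reference]
    quieter = [[0.1 * x for x in frame] for frame in reference]
    print(toy_perceptual_loss(reference, louder))
    print(toy_perceptual_loss(reference, quieter))
```

In this toy version, added energy is penalized more than removed energy of the same log magnitude, because only positive differences feed the asymmetric term. In a training setup the same structure would need to be written with differentiable tensor operations.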