Article

A Deep Learning Loss Function Based on the Perceptual Evaluation of the Speech Quality

Journal

IEEE SIGNAL PROCESSING LETTERS
Volume 25, Issue 11, Pages 1680-1684

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/LSP.2018.2871419

Keywords

Deep learning; loss function; speech enhancement; PESQ; DNN

Funding

  1. Spanish MINECO/FEDER Project [TEC2016-80141-P]
  2. Spanish Ministry of Education through the National Program FPU [FPU15/04161]
  3. NVIDIA Corporation

This letter proposes a perceptual metric for speech quality evaluation that is suitable as a loss function for training deep learning methods. The metric, derived from the perceptual evaluation of speech quality (PESQ) algorithm, is computed on a per-frame basis from the power spectra of the reference and processed speech signals. Two disturbance terms, which account for distortion once auditory masking and threshold effects are factored in, amend the mean square error (MSE) loss function by introducing perceptual criteria based on human psychoacoustics. The proposed loss function is evaluated for noisy speech enhancement with deep neural networks. Experimental results show that our metric achieves significant gains in speech quality (evaluated using an objective metric and a listening test) compared with MSE and with other perceptual loss functions from the literature.
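
The overall shape of such a loss (a per-frame MSE amended by two weighted disturbance terms computed from power spectra) can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything specific here is an assumption made for illustration: the function names, the weights `alpha` and `beta`, the `floor_ratio` dead zone standing in for threshold effects, and the simple symmetric and asymmetric terms standing in for the PESQ-derived disturbances. It is not the letter's actual metric.

```python
import numpy as np


def center_clip(diff, floor):
    # Zero out differences smaller than `floor` and shrink the rest toward zero.
    # Hypothetical stand-in for a perceptual threshold effect.
    return np.sign(diff) * np.maximum(np.abs(diff) - floor, 0.0)


def per_frame_loss(ref_power, proc_power, alpha=0.1, beta=0.1, floor_ratio=0.25):
    """Illustrative per-frame loss: MSE plus two weighted disturbance terms.

    ref_power, proc_power: arrays of shape (frames, bins) holding the power
    spectra of the reference and processed signals.
    alpha, beta, floor_ratio: hypothetical values chosen for this sketch.
    Returns one loss value per frame, shape (frames,).
    """
    ref = np.asarray(ref_power, dtype=float)
    proc = np.asarray(proc_power, dtype=float)

    # Baseline term: mean square error over frequency bins, per frame.
    mse = np.mean((proc - ref) ** 2, axis=1)

    # Assumed dead zone proportional to the smaller of the two spectra.
    floor = floor_ratio * np.minimum(ref, proc)
    clipped = center_clip(proc - ref, floor)

    # Assumed symmetric term: magnitude of the audible difference.
    d_sym = np.mean(np.abs(clipped), axis=1)
    # Assumed asymmetric term: only components added by the processing.
    d_asym = np.mean(np.maximum(clipped, 0.0), axis=1)

    return mse + alpha * d_sym + beta * d_asym


if __name__ == "__main__":
    ref = np.ones((3, 4))
    print(per_frame_loss(ref, ref))        # identical spectra
    print(per_frame_loss(ref, ref + 1.0))  # processing adds energy
    print(per_frame_loss(ref, ref * 0.0))  # processing removes energy
```

In this sketch, added energy is penalized by both disturbance terms while removed energy is penalized only by the symmetric one, which is one simple way to encode an asymmetry between additive and subtractive distortion. A training setup would typically average the per-frame values over a batch and implement the same arithmetic in a framework with automatic differentiation.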
