☆ 4.6 Article

Learning Complex Spectral Mapping With Gated Convolutional Recurrent Networks for Monaural Speech Enhancement

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING (2020)

期刊

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING

卷 28, 期 -, 页码 380-390

出版社

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TASLP.2019.2955276

关键词

Speech enhancement; Spectrogram; Noise measurement; Convolution; Time-frequency analysis; Training; Complex spectral mapping; gated convolutional recurrent network; phase estimation; monaural speech enhancement

类别

Acoustics Engineering, Electrical & Electronic

资金

National Institute on Deafness and Other Communication Disorders [R01 DC012048, R01 DC015521]
Ohio Supercomputer Center

向作者/读者索取更多资源

Protocol

社区支持

Reagent

社区支持

摘要

Phase is important for perceptual quality of speech. However, it seems intractable to directly estimate phase spectra through supervised learning due to their lack of spectrotemporal structure in it. Complex spectral mapping aims to estimate the real and imaginary spectrograms of clean speech from those of noisy speech, which simultaneously enhances magnitude and phase responses of speech. Inspired by multi-task learning, we propose a gated convolutional recurrent network (GCRN) for complex spectral mapping, which amounts to a causal system for monaural speech enhancement. Our experimental results suggest that the proposed GCRN substantially outperforms an existing convolutional neural network (CNN) for complex spectral mapping in terms of both objective speech intelligibility and quality. Moreover, the proposed approach yields significantly higher STOI and PESQ than magnitude spectral mapping and complex ratio masking. We also find that complex spectral mapping with the proposed GCRN provides an effective phase estimate.

Learning Complex Spectral Mapping With Gated Convolutional Recurrent Networks for Monaural Speech Enhancement

期刊

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING

出版社

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

关键词

类别

资金

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

Learning Complex Spectral Mapping With Gated Convolutional Recurrent Networks for Monaural Speech Enhancement

期刊

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING

出版社

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

关键词

类别

资金

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

导出引文

分享论文