☆ 3.8 Proceedings Paper

A UNIFIED DEEP MODELING APPROACH TO SIMULTANEOUS SPEECH DEREVERBERATION AND RECOGNITION FOR THE REVERB CHALLENGE

2017 HANDS-FREE SPEECH COMMUNICATIONS AND MICROPHONE ARRAYS (HSCMA 2017) (2017)

期刊

2017 HANDS-FREE SPEECH COMMUNICATIONS AND MICROPHONE ARRAYS (HSCMA 2017)

卷 -, 期 -, 页码 36-40

出版社

IEEE

关键词

Signal space robustness; deep modeling; reverberant speech enhancement; robust speech recognition

类别

Engineering, Electrical & Electronic Telecommunications

向作者/读者索取更多资源

Protocol

社区支持

Reagent

社区支持

摘要

We propose a unified deep neural network (DNN) approach to achieve both high-quality enhanced speech and high-accuracy automatic speech recognition (ASR) simultaneously on the recent REverberant Voice Enhancement and Recognition Benchmark (RE-VERB) Challenge. These two goals are accomplished by two proposed techniques, namely DNN-based regression to enhance reverberant and noisy speech, followed by DNN-based multi-condition training that takes clean-condition, multi-condition and enhanced speech all into consideration. We first report on superior objective measures in enhanced speech to those listed in the 2014 REVERB Challenge Workshop. We then show that in clean-condition training, we obtain the best word error rate (WER) of 13.28% on the 1-channel REVERB simulated evaluation data with the proposed DNN-based pre-processing scheme. Similarly we attain a competitive single-system WER of 8.75% with the proposed multi-condition training strategy and the same less-discriminative log power spectrum features used in the enhancement stage. Finally by leveraging upon joint training with more discriminative ASR features and improved neural network based language models a state-of-the-art WER of 4.46% is attained with a single ASR system, and single-channel information. Another state-of-the-art WER of 4.10% is achieved through system combination.

A UNIFIED DEEP MODELING APPROACH TO SIMULTANEOUS SPEECH DEREVERBERATION AND RECOGNITION FOR THE REVERB CHALLENGE

期刊

2017 HANDS-FREE SPEECH COMMUNICATIONS AND MICROPHONE ARRAYS (HSCMA 2017)

出版社

IEEE

关键词

类别

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

A UNIFIED DEEP MODELING APPROACH TO SIMULTANEOUS SPEECH DEREVERBERATION AND RECOGNITION FOR THE REVERB CHALLENGE

期刊

2017 HANDS-FREE SPEECH COMMUNICATIONS AND MICROPHONE ARRAYS (HSCMA 2017)

出版社

IEEE

关键词

类别

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

导出引文

分享论文