Article

Blind Speech Separation and Dereverberation using neural beamforming

SPEECH COMMUNICATION (2022)

Journal

SPEECH COMMUNICATION

Volume 140, Pages 29-41

Publisher

ELSEVIER

DOI: 10.1016/j.specom.2022.03.004

Keywords

Multi-channel speaker separation; Beamforming; Dereverberation; Speaker identification; Triplet mining

Categories

Acoustics; Computer Science, Interdisciplinary Applications

Funding

Austrian Science Fund (FWF) [P27803-N15]

Summary

The paper introduces the BSSD network, which performs speaker separation, dereverberation, and speaker identification simultaneously in a single neural network. It uses predefined spatial cues to guide separation, neural beamforming for dereverberation, and embedding vectors with triplet mining for speaker identification. The system is evaluated using the SI-SDR, WER, and EER metrics.

Abstract

In this paper, we present the Blind Speech Separation and Dereverberation (BSSD) network, which performs simultaneous speaker separation, dereverberation and speaker identification in a single neural network. Speaker separation is guided by a set of predefined spatial cues. Dereverberation is performed by using neural beamforming, and speaker identification is aided by embedding vectors and triplet mining. We introduce a frequency-domain model which uses complex-valued neural networks, and a time-domain variant which performs beamforming in latent space. Further, we propose a block-online mode to process longer audio recordings, as they occur in meeting scenarios. We evaluate our system in terms of Scale-Independent Signal-to-Distortion Ratio (SI-SDR), Word Error Rate (WER) and Equal Error Rate (EER).
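The abstract names SI-SDR as the separation metric without defining it. As a minimal sketch, assuming the commonly used definition (usually expanded as scale-invariant SDR): the estimate is projected onto the reference, and the energy of that projection is compared with the energy of the residual. The function name `si_sdr`, the `eps` stabilizer, and the toy signals are illustrative assumptions and are not taken from the paper; plain Python is used only to keep the example self-contained.

```python
import math
import random


def si_sdr(estimate, reference, eps=1e-12):
    """Scale-invariant SDR in dB for two equal-length 1-D signals (illustrative)."""
    if len(estimate) != len(reference):
        raise ValueError("signals must have the same length")
    n = len(reference)
    # Remove the mean so that a DC offset does not influence the score.
    est_mean = sum(estimate) / n
    ref_mean = sum(reference) / n
    est = [x - est_mean for x in estimate]
    ref = [x - ref_mean for x in reference]
    # Project the estimate onto the reference to obtain the scaled target.
    alpha = sum(e * r for e, r in zip(est, ref)) / (sum(r * r for r in ref) + eps)
    target = [alpha * r for r in ref]
    residual = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    residual_energy = sum(d * d for d in residual)
    return 10.0 * math.log10((target_energy + eps) / (residual_energy + eps))


# Toy data (assumption): a sinusoid as the reference, Gaussian noise as distortion.
rng = random.Random(0)
reference = [math.sin(0.01 * t) for t in range(4000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(4000)]
mild = [r + 0.1 * v for r, v in zip(reference, noise)]
heavy = [r + 0.5 * v for r, v in zip(reference, noise)]

score_mild = si_sdr(mild, reference)
score_heavy = si_sdr(heavy, reference)
# Rescaling the estimate should leave the score unchanged.
score_scaled = si_sdr([0.5 * x for x in mild], reference)
```

Because the target is obtained by projection, multiplying the estimate by any non-zero gain scales target and residual alike, so the ratio is unaffected. This is what makes the metric insensitive to the output level of a separation system.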
