Article

Multi-Target Extractor and Detector for Unknown-Number Speaker Diarization

IEEE SIGNAL PROCESSING LETTERS (2023)

Journal

IEEE SIGNAL PROCESSING LETTERS

Volume 30, Pages 638-642

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/LSP.2023.3279781

Keywords

Detectors; Training; Feature extraction; Mixers; Hidden Markov models; Oral communication; Data mining; Speaker diarization; speaker representations

Category

Engineering, Electrical & Electronic

AI Summary

This study proposes a neural architecture that extracts speaker representations and detects the presence of each speaker frame by frame, regardless of the number of speakers in a conversation. On the CALLHOME corpus, the model outperforms most previously proposed methods, and in a more challenging setting with 2 to 7 simultaneous speakers it achieves relative diarization error rate reductions of 6.4% to 30.9% over several typical baselines.

Abstract

Strong representations of target speakers can help extract important information about speakers and detect corresponding temporal regions in multi-speaker conversations. In this study, we propose a neural architecture that simultaneously extracts speaker representations consistent with the speaker diarization objective and detects the presence of each speaker on a frame-by-frame basis regardless of the number of speakers in a conversation. A speaker representation (called z-vector) extractor and a time-speaker contextualizer, implemented by a residual network and processing data in both temporal and speaker dimensions, are integrated into a unified framework. Tests on the CALLHOME corpus show that our model outperforms most of the methods proposed so far. Evaluations in a more challenging case with simultaneous speakers ranging from 2 to 7 show that our model achieves 6.4% to 30.9% relative diarization error rate reductions over several typical baselines.
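Two ideas in the abstract can be made concrete with a small sketch: the frame-by-frame presence decision for each speaker, and what a "relative" diarization error rate (DER) reduction means. The code below is an illustration only and is not the authors' implementation. The function names, the 0/1 activity matrix, the frame shift, and all numbers in the usage note are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple


def relative_der_reduction(baseline_der: float, model_der: float) -> float:
    """Relative DER reduction in percent: 100 * (baseline - model) / baseline."""
    if baseline_der <= 0:
        raise ValueError("baseline_der must be positive")
    return 100.0 * (baseline_der - model_der) / baseline_der


def frames_to_segments(
    activity: List[List[int]], frame_shift: float
) -> Dict[int, List[Tuple[float, float]]]:
    """Convert a frames x speakers 0/1 matrix into (start, end) times per speaker.

    Each speaker column is scanned independently, so overlapping speech and
    any number of speakers are handled the same way.
    """
    segments: Dict[int, List[Tuple[float, float]]] = {}
    if not activity:
        return segments
    num_speakers = len(activity[0])
    for spk in range(num_speakers):
        spans: List[Tuple[float, float]] = []
        start = None  # index of the first frame of the currently open segment
        for t, frame in enumerate(activity):
            if frame[spk] and start is None:
                start = t
            elif not frame[spk] and start is not None:
                spans.append((start * frame_shift, t * frame_shift))
                start = None
        if start is not None:  # close a segment that runs to the last frame
            spans.append((start * frame_shift, len(activity) * frame_shift))
        segments[spk] = spans
    return segments
```

For example, with a hypothetical baseline DER of 20.0 and a model DER of 15.0, `relative_der_reduction(20.0, 15.0)` gives a 25% relative reduction, even though the absolute difference is only 5 points. `frames_to_segments` could be applied to thresholded per-frame detector outputs, with `frame_shift` set to whatever frame step (in seconds) the front end uses.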
