☆ 4.2 Article

Model-Based Expectation-Maximization Source Separation and Localization

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING (2010)

Journal

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING

Volume 18, Issue 2, Pages 382-394

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TASL.2009.2029711

Keywords

Maximum-likelihood estimation; speech enhancement; time-frequency masking; underdetermined source separation

Categories

Acoustics; Engineering, Electrical & Electronic

Funding

Fu Foundation School of Engineering and Applied Science
National Science Foundation (NSF) [IIS-0535168]

Abstract

This paper describes a system, referred to as model-based expectation-maximization source separation and localization (MESSL), for separating and localizing multiple sound sources from an underdetermined reverberant two-channel recording. By clustering individual spectrogram points based on their interaural phase and level differences, MESSL generates masks that can be used to isolate individual sound sources. We first describe a probabilistic model of interaural parameters that can be evaluated at individual spectrogram points. By creating a mixture of these models over sources and delays, the multi-source localization problem is reduced to a collection of single source problems. We derive an expectation-maximization algorithm for computing the maximum-likelihood parameters of this mixture model, and show that these parameters correspond well with interaural parameters measured in isolation. As a byproduct of fitting this mixture model, the algorithm creates probabilistic spectrogram masks that can be used for source separation. In simulated anechoic and reverberant environments, separations using MESSL produced on average a signal-to-distortion ratio 1.6 dB greater and Perceptual Evaluation of Speech Quality (PESQ) results 0.27 mean opinion score units greater than four comparable algorithms.
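To make the clustering idea in the abstract concrete, here is a minimal sketch of an EM loop over a mixture indexed by source and candidate delay. It is not the authors' implementation. It uses interaural phase differences only (level differences are ignored), models the wrapped phase residual as a zero-mean Gaussian with one variance per source, and initializes each source on its own slice of the delay grid. All of these are simplifying assumptions, and the names `ipd_em_masks`, `omega`, `taus`, `psi`, and `sigma2` are hypothetical.

```python
import numpy as np

def ipd_em_masks(left, right, omega, taus, n_src=2, n_iter=50):
    """Soft-assign spectrogram points to sources using interaural phase only.

    left, right : complex STFTs, shape (F, T)
    omega       : angular frequency of each STFT row in rad/s, shape (F,)
    taus        : candidate interaural delays in seconds, shape (D,)
    Returns (masks, psi): masks has shape (n_src, F, T) and sums to 1 over
    sources; psi has shape (n_src, D) and holds the mixture weights.
    """
    phi = np.angle(left * np.conj(right))                # observed phase difference
    # Phase residual for every candidate delay, wrapped to (-pi, pi].
    pred = omega[None, :, None] * taus[:, None, None]    # (D, F, 1)
    res = np.angle(np.exp(1j * (phi[None] - pred)))      # (D, F, T)
    res2 = res[None] ** 2                                # (1, D, F, T)

    # Assumed initialization: each source starts on its own slice of the delay grid.
    psi = np.full((n_src, len(taus)), 1e-3)
    for i, idx in enumerate(np.array_split(np.arange(len(taus)), n_src)):
        psi[i, idx] = 1.0
    psi /= psi.sum()
    sigma2 = np.ones(n_src)                              # residual variance per source

    for _ in range(n_iter):
        # E-step: posterior over (source, delay) at every spectrogram point.
        s2 = sigma2[:, None, None, None]
        log_post = (np.log(psi)[:, :, None, None]
                    - 0.5 * res2 / s2 - 0.5 * np.log(2 * np.pi * s2))
        log_post -= log_post.max(axis=(0, 1), keepdims=True)
        nu = np.exp(log_post)
        nu /= nu.sum(axis=(0, 1), keepdims=True)

        # M-step: re-estimate mixture weights and per-source variances.
        psi = np.maximum(nu.mean(axis=(2, 3)), 1e-12)
        weight = nu.sum(axis=(1, 2, 3)) + 1e-12
        sigma2 = np.maximum((nu * res2).sum(axis=(1, 2, 3)) / weight, 1e-4)

    # Marginalizing the posterior over delays gives one soft mask per source.
    return nu.sum(axis=1), psi
```

Multiplying a returned mask by one channel's STFT and inverting the transform would give a rough estimate of that source. The delay grid should span the physically plausible interaural delays for the microphone spacing in use.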
