4.2 Review

Review of Time-Frequency Masking Approach for Improving Speech Intelligibility in Noise

Journal

IETE TECHNICAL REVIEW
Volume 39, Issue 3, Pages 623-634

Publisher

TAYLOR & FRANCIS LTD
DOI: 10.1080/02564602.2021.1886610

Keywords

Binary mask; Classification; Noise suppression; Speech intelligibility; Speech signal processing; Time-frequency mask

Over the past decade, time-frequency masking techniques have been explored to improve speech intelligibility in noise, with binary and soft masking as the two main approaches. Binary masking retains the time-frequency units where the target speech is stronger than the noise, while a soft mask can take values mostly between 0 and 1 and is closely related to the frequency-domain Wiener filter gain.
Over the last decade, time-frequency masking techniques have been explored to achieve substantial improvements in speech intelligibility in noise. A binary or soft mask can be applied to noisy speech for speech separation. The binary masking approach retains the time-frequency (T-F) units of the noise-corrupted signal where the target speech is stronger than the interfering noise, and removes the T-F units where the interfering noise is dominant. Whereas a binary mask takes only the values 0 or 1, a soft mask can take any value, mostly in the range from 0 to 1, and is closely related to the frequency-domain Wiener filter gain. Motivated by intelligibility studies of speech synthesized using the ideal binary (or soft) mask, a number of subsequent studies have addressed the estimation of T-F masks for practical use. This paper reviews T-F masking strategies, covering their definition, preliminary studies with ideal masks, and mask estimation in practice.
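The two mask types described in the abstract can be illustrated with a minimal NumPy sketch. It is not taken from the paper: the function names, the 0 dB local-SNR threshold, the small `eps` guard, and the toy power values are all assumptions chosen for illustration. The soft mask is written here as the Wiener-type gain S / (S + N) on power spectra, which is one common form of a mask related to the Wiener filter gain. Both functions are "ideal" in the sense that they assume the speech and noise powers are known separately, which is not the case in practice.

```python
import numpy as np

EPS = 1e-12  # assumed guard against division by zero and log(0)


def ideal_binary_mask(speech_power, noise_power, threshold_db=0.0):
    """Return 1 for T-F units where the local SNR exceeds threshold_db, else 0.

    With the assumed default of 0 dB, a unit is kept exactly when the
    target speech power is greater than the noise power.
    """
    local_snr_db = 10.0 * np.log10((speech_power + EPS) / (noise_power + EPS))
    return (local_snr_db > threshold_db).astype(float)


def ideal_soft_mask(speech_power, noise_power):
    """Return a Wiener-type gain S / (S + N), which lies between 0 and 1."""
    return speech_power / (speech_power + noise_power + EPS)


def apply_mask(noisy_tf, mask):
    """Weight each T-F unit of the noisy representation by the mask."""
    return mask * noisy_tf


# Toy 2x2 grid of T-F power values (rows: frequency bands, columns: frames).
speech_power = np.array([[4.0, 1.0], [0.5, 9.0]])
noise_power = np.array([[1.0, 4.0], [2.0, 1.0]])

binary_mask = ideal_binary_mask(speech_power, noise_power)
soft_mask = ideal_soft_mask(speech_power, noise_power)

# Hypothetical noisy magnitudes, only to show how a mask is applied.
noisy_tf = np.sqrt(speech_power + noise_power)
separated = apply_mask(noisy_tf, binary_mask)
```

In this toy grid the binary mask keeps the two speech-dominant units and zeroes the two noise-dominant ones, while the soft mask attenuates every unit in proportion to its estimated speech share. A practical system would have to estimate either mask from the noisy signal alone, which is the estimation problem the review covers.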
