Journal
IETE TECHNICAL REVIEW
Volume 39, Issue 3, Pages 623-634
Publisher
TAYLOR & FRANCIS LTD
DOI: 10.1080/02564602.2021.1886610
Keywords
Binary mask; Classification; Noise suppression; Speech intelligibility; Speech signal processing; Time-frequency mask
In the past decade, time-frequency masking techniques have been explored to improve speech intelligibility in noise, with binary and soft masking being the main approaches. Binary masking retains the time-frequency units where the target speech is stronger than the noise, while a soft mask can take any value, mostly between 0 and 1, and is closely related to the Wiener filter gain in the frequency domain.
Over the last decade, time-frequency masking techniques have been explored to achieve substantial improvements in speech intelligibility in noise. A binary or soft mask can be applied to noisy speech for speech separation. The binary masking approach retains the time-frequency (T-F) units of the noise-corrupted signal where the target speech is stronger than the interfering noise, and removes the T-F units where the interfering noise is dominant. While a binary mask takes the value 0 or 1, a soft mask can take any value, mostly in the range from 0 to 1, and is closely related to the frequency-domain Wiener filter gain. Motivated by intelligibility studies of speech synthesized using the ideal binary (or soft) mask, a number of subsequent studies on estimating the T-F mask have been conducted for practical use. This paper reviews T-F masking strategies, covering the definition, preliminary studies with the ideal mask, and the estimation of the mask in practice.
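The two mask definitions in the abstract can be illustrated with a minimal sketch. It assumes the clean-speech and noise power in each T-F unit are known separately (the "ideal" case), uses a 0 dB local threshold for the binary mask, and uses the ratio S / (S + N) as one common Wiener-like form of the soft mask. The function names, the threshold parameter, and the toy values are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def ideal_binary_mask(speech_power, noise_power, lc_db=0.0):
    """Return 1 for T-F units whose local SNR exceeds lc_db, else 0.

    lc_db is a hypothetical threshold parameter; 0 dB corresponds to
    "target speech stronger than the interfering noise".
    """
    eps = np.finfo(float).eps  # avoids log(0) and division by zero
    snr_db = 10.0 * np.log10((speech_power + eps) / (noise_power + eps))
    return (snr_db > lc_db).astype(float)


def wiener_like_soft_mask(speech_power, noise_power):
    """Return a soft mask in [0, 1] of the form S / (S + N)."""
    eps = np.finfo(float).eps
    return speech_power / (speech_power + noise_power + eps)


# Toy power values (rows: frequency bins, columns: time frames).
speech_power = np.array([[4.0, 1.0, 0.0],
                         [9.0, 1.0, 2.0]])
noise_power = np.array([[1.0, 1.0, 3.0],
                        [1.0, 4.0, 2.0]])

binary_mask = ideal_binary_mask(speech_power, noise_power)
soft_mask = wiener_like_soft_mask(speech_power, noise_power)

# Either mask would be applied by element-wise multiplication with the
# noisy T-F representation before resynthesis (not shown here).
print(binary_mask)
print(soft_mask)
```

In this sketch, units where speech and noise power are equal fall below the strict 0 dB criterion and are removed by the binary mask, while the soft mask assigns them a gain of about 0.5. In practice the separate speech and noise powers are not available, which is why the mask has to be estimated from the noisy signal.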