Article

Underdetermined blind source separation of speech mixtures unifying dictionary learning and sparse representation

Journal

Publisher

SPRINGER HEIDELBERG
DOI: 10.1007/s13042-021-01406-5

Keywords

Blind source separation (BSS); Underdetermined speech mixtures; Signal recovery; Dictionary learning; Sparse representation

Funding

  1. Key-Area Research and Development Program of Guangdong Province [2019B010121001, 2019B010118001]
  2. National Key Research and Development Project [2018YFB1802400]
  3. National Natural Science Foundation of China [62003095, 61673124, 61973087, 61773128, 62003094, 61803094, U1911401]
  4. Natural Science Foundation of Guangdong Province [2018A0303130080, 2019A1515011377]

In this paper, a novel framework is proposed to solve the problem of underdetermined blind source separation of speech mixtures using a compressed sensing model. The method includes noise-reduction preprocessing, blind identification for accurate mixing matrix estimation, and simultaneous updating of codewords and coefficients for dictionary learning. Block-wise processing in the recovery stage reduces computational complexity, and experimental results demonstrate the superiority of the proposed algorithm.
Underdetermined blind source separation of speech mixtures is a challenging issue in the classical cocktail-party problem. Recently, dictionary learning has attracted attention as a way to solve this problem. In this paper, we build a novel framework that treats the underdetermined blind separation of speech mixtures as a sparse signal recovery problem using a compressed sensing model. First, to eliminate the influence of additive white Gaussian noise, a wavelet transform with a tunable Q-factor is used as noise-reduction preprocessing. Second, to obtain an accurate estimate of the mixing matrix, a blind identification method is designed that identifies single-source data. Third, to find the best dictionary for representing the training signals, an arbitrary subset of codewords and the corresponding coefficients are updated simultaneously. In the source signal recovery stage, block processing is applied to the mixed signals so that the source components are separated from each block by sparse representation. The whole source signals are then reconstructed by concatenating the separated source components from all the blocks. The advantage of this block processing is reduced computational complexity. Finally, experimental results on separating underdetermined speech mixtures demonstrate the superiority of the proposed algorithm.
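
The block-wise recovery stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the mixing matrix `A` is already known, uses a fixed orthonormal DCT dictionary as a stand-in for the learned dictionary, and swaps in plain orthogonal matching pursuit (OMP) as the sparse solver. The function names `dct_dictionary`, `omp`, and `separate_blocks` are hypothetical. For each block of length T, the mixtures are stacked into one vector x, and the model x = (A ⊗ D) c is solved for a sparse coefficient vector c.

```python
import numpy as np


def dct_dictionary(T):
    """Orthonormal DCT-II synthesis dictionary (T x T).

    Assumption: used here only as a stand-in for a learned dictionary.
    """
    t = np.arange(T)[:, None]
    k = np.arange(T)[None, :]
    D = np.sqrt(2.0 / T) * np.cos(np.pi * (2 * t + 1) * k / (2 * T))
    D[:, 0] /= np.sqrt(2.0)  # rescale the constant atom to unit norm
    return D


def omp(Phi, y, max_atoms, tol=1e-8):
    """Greedy orthogonal matching pursuit for y ~ Phi @ c with sparse c."""
    norms = np.linalg.norm(Phi, axis=0)
    residual = y.copy()
    support = []
    c = np.zeros(Phi.shape[1])
    for _ in range(max_atoms):
        if np.linalg.norm(residual) <= tol:
            break
        # Pick the column most correlated with the current residual.
        corr = np.abs(Phi.T @ residual) / norms
        if support:
            corr[support] = 0.0
        support.append(int(np.argmax(corr)))
        # Re-fit all selected coefficients by least squares.
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef
        c = np.zeros(Phi.shape[1])
        c[support] = coef
    return c


def separate_blocks(X, A, D, max_atoms):
    """Recover n sources from m < n mixtures, one block at a time.

    X: (m, n_samples) mixtures, A: (m, n) mixing matrix, D: (T, K) dictionary.
    Trailing samples that do not fill a whole block are left as zeros.
    """
    m, n_samples = X.shape
    n = A.shape[1]
    T, K = D.shape
    Phi = np.kron(A, D)  # (m*T) x (n*K) sensing matrix for one block
    S_hat = np.zeros((n, n_samples))
    for start in range(0, n_samples - T + 1, T):
        x = X[:, start:start + T].reshape(-1)  # stack mixture blocks
        c = omp(Phi, x, max_atoms)
        # Row i of c.reshape(n, K) holds the coefficients of source i.
        S_hat[:, start:start + T] = (D @ c.reshape(n, K).T).T
    return S_hat
```

Because each block is solved independently, the sensing matrix has size (m·T) × (n·K) rather than growing with the full signal length, which is one way to read the abstract's remark about reduced computational complexity. Exact recovery is not guaranteed in general; it depends on how sparse the sources are in the chosen dictionary and on how distinct the columns of `A` are.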
