Underdetermined blind source separation of speech mixtures unifying dictionary learning and sparse representation

INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS (2021)

Journal

INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS

Volume 12, Issue 12, Pages 3573-3583

Publisher

SPRINGER HEIDELBERG

DOI: 10.1007/s13042-021-01406-5

Keywords

Blind source separation (BSS); Underdetermined speech mixtures; Signal recovery; Dictionary learning; Sparse representation

Funding

Key-Area Research and Development Program of Guangdong Province [2019B010121001, 2019B010118001]
National Key Research and Development Project [2018YFB1802400]
National Natural Science Foundation of China [62003095, 61673124, 61973087, 61773128, 62003094, 61803094, U1911401]
Natural Science Foundation of Guangdong Province [2018A0303130080, 2019A1515011377]

Automated Summary

This paper proposes a novel framework that solves underdetermined blind source separation of speech mixtures using a compressed sensing model. The method includes a noise reduction pretreatment, blind identification for accurate mixing matrix estimation, and simultaneous updating of codewords and coefficients for dictionary selection. Block-wise recovery reduces computational complexity, and experimental results demonstrate the superiority of the approach.

Abstract

Underdetermined blind source separation of speech mixtures is a challenging instance of the classical cocktail-party problem. Recently, dictionary learning has attracted attention as a way to solve it. In this paper, we build a novel framework that treats the underdetermined blind separation of speech mixtures as a sparse signal recovery problem under a compressed sensing model. First, to eliminate the influence of additive white Gaussian noise, a wavelet transform with tunable Q-factor is used as a noise reduction pretreatment. Second, to obtain an accurate estimate of the mixing matrix, a blind identification method is designed that identifies single-source data. Third, to find the best dictionary for representing the training signals, an arbitrary subset of codewords and the corresponding coefficients are updated simultaneously. In the source signal recovery stage, block processing is applied to the mixed signals, so that the source components are separated from each block by sparse representation. The whole source signals are then reconstructed by concatenating the separated source components from all the blocks, which reduces the computational complexity. Finally, experimental results on separating underdetermined speech mixtures demonstrate the superiority of the proposed algorithm.
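
The block-wise recovery stage can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: the mixing matrix `A` is taken as already known (the paper estimates it by blind identification), a fixed orthonormal DCT dictionary stands in for the learned one, the per-block compressed sensing model is written as `x_block = kron(A, D) @ c` with a sparse coefficient vector `c`, and plain orthogonal matching pursuit is used as the sparse solver. The helper names (`dct_dictionary`, `omp`, `separate_blocks`, `make_toy_mixture`) are hypothetical. The toy sources are built so that, within each block, no two sources share a dictionary atom, which is what makes the greedy solver exact here; real speech would satisfy this only approximately.

```python
import numpy as np

def dct_dictionary(L):
    """Orthonormal DCT-II basis as columns (stand-in for a learned dictionary)."""
    n = np.arange(L)[:, None]
    k = np.arange(L)[None, :]
    D = np.sqrt(2.0 / L) * np.cos(np.pi * (n + 0.5) * k / L)
    D[:, 0] /= np.sqrt(2.0)
    return D

def omp(Phi, y, n_nonzero, tol=1e-10):
    """Orthogonal matching pursuit: greedy sparse solution of Phi @ c = y."""
    residual = y.copy()
    support = []
    sol = np.zeros(0)
    for _ in range(n_nonzero):
        if np.linalg.norm(residual) <= tol:
            break
        k = int(np.argmax(np.abs(Phi.T @ residual)))
        if k in support:
            break
        support.append(k)
        # Re-fit all selected atoms jointly, then update the residual.
        sol, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ sol
    coef = np.zeros(Phi.shape[1])
    if support:
        coef[support] = sol
    return coef

def separate_blocks(X, A, D, n_nonzero):
    """Recover sources block by block, then concatenate the blocks."""
    m, T = X.shape
    n = A.shape[1]
    L, K = D.shape
    Phi = np.kron(A, D)                        # (m*L) x (n*K) sensing matrix
    S_hat = np.zeros((n, T))
    for t0 in range(0, T - L + 1, L):
        y = X[:, t0:t0 + L].reshape(-1)        # stack the m mixture blocks
        C = omp(Phi, y, n_nonzero).reshape(n, K)
        S_hat[:, t0:t0 + L] = C @ D.T          # source j block = D @ C[j]
    return S_hat

def make_toy_mixture(seed=0, L=32, n_blocks=4, atoms_per_source=2):
    """Three synthetic sources, two mixtures; disjoint atoms per block (assumption)."""
    rng = np.random.default_rng(seed)
    angles = np.array([0.3, 1.2, 2.2])
    A = np.vstack([np.cos(angles), np.sin(angles)])   # 2 x 3, unit-norm columns
    D = dct_dictionary(L)
    n = A.shape[1]
    S = np.zeros((n, L * n_blocks))
    for b in range(n_blocks):
        atoms = rng.permutation(L)[: n * atoms_per_source].reshape(n, atoms_per_source)
        for j in range(n):
            mags = rng.uniform(1.0, 2.0, atoms_per_source)
            signs = rng.choice([-1.0, 1.0], atoms_per_source)
            S[j, b * L:(b + 1) * L] = D[:, atoms[j]] @ (mags * signs)
    return A @ S, A, D, S

if __name__ == "__main__":
    X, A, D, S = make_toy_mixture()
    S_hat = separate_blocks(X, A, D, n_nonzero=6)
    print("max abs error:", np.max(np.abs(S_hat - S)))
```

Processing short blocks keeps the sensing matrix `kron(A, D)` small and reusable across blocks, which is one way to read the abstract's remark about reduced computational complexity; a single solve over the full signal length would need a far larger matrix.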
