Journal: SPEECH COMMUNICATION
Volume 44, Issue 1-4, Pages 113-125
Publisher: ELSEVIER
DOI: 10.1016/j.specom.2004.10.002
Keywords
blind source separation; audio-visual coherence; speech enhancement; audio-visual joint probability; spectral information
Looking at the speaker's face helps a listener hear a speech signal and extract it from competing sources before identification. This suggests new speech enhancement or extraction techniques that exploit the audio-visual coherence of speech stimuli. In this paper, a novel algorithm that plugs audio-visual coherence, estimated with statistical tools, into classical blind source separation algorithms is presented, and its assessment is described. We show, in the case of additive mixtures, that this algorithm performs better than classical blind tools both when there are as many sensors as sources and when there are fewer sensors than sources. Audio-visual coherence enables a focus on the speech source to be extracted. It may also be used at the output of a classical source separation algorithm, to select the best sensor with reference to a target source. (C) 2004 Elsevier B.V. All rights reserved.
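The classical blind baseline the abstract compares against can be illustrated with independent component analysis on an additive mixture. This is a hedged sketch of a generic BSS baseline (using scikit-learn's FastICA on synthetic signals), not the paper's audio-visual algorithm; the mixing matrix and source waveforms below are invented for illustration.

```python
import numpy as np
from sklearn.decomposition import FastICA

# Two synthetic sources (stand-ins for a target speech source and a competitor)
t = np.linspace(0, 1, 2000)
s1 = np.sin(2 * np.pi * 5 * t)            # periodic "target" source
s2 = np.sign(np.sin(2 * np.pi * 3 * t))   # competing square-wave source
S = np.c_[s1, s2]

# Additive mixture: as many sensors as sources (the determined case)
A = np.array([[1.0, 0.5],
              [0.4, 1.0]])                # invented mixing matrix
X = S @ A.T                               # sensor observations

# Classical blind separation: no visual information used
ica = FastICA(n_components=2, random_state=0)
S_hat = ica.fit_transform(X)              # estimates, up to scale and permutation

# Each true source should correlate strongly with one estimate
corr = np.abs(np.corrcoef(S.T, S_hat.T)[:2, 2:])
print(corr.max(axis=1))
```

A purely blind method like this recovers sources only up to permutation and scaling; the paper's contribution is to use audio-visual coherence to resolve which output is the target speech, and to help in the underdetermined case (fewer sensors than sources) where ICA alone fails.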