Separation of audio-visual speech sources: A new approach exploiting the audio-visual coherence of speech stimuli

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING (2002)

Journal

EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING

Volume 2002, Issue 11, Pages 1165-1173

Publisher

HINDAWI LTD

DOI: 10.1155/S1110865702207015

Keywords

blind source separation; lipreading; audio-visual speech processing

Abstract

We present a new approach to the source separation problem in the case of multiple speech signals. The method is based on automatic lipreading: the objective is to extract an acoustic speech signal from other acoustic signals by exploiting its coherence with the speaker's lip movements. We consider the case of an additive stationary mixture of decorrelated sources, with no further assumptions on independence or non-Gaussian character. First, we present a theoretical framework showing that it is indeed possible to separate a source when some of its spectral characteristics are provided to the system. Then we address the case of audio-visual sources. We show how, if a statistical model of the joint probability of visual and spectral audio input is learnt to quantify the audio-visual coherence, separation can be achieved by maximizing this probability. Finally, we present a number of separation results on a corpus of vowel-plosive-vowel sequences uttered by a single speaker, embedded in a mixture of other voices. We show that separation can be quite good for mixtures of 2, 3, and 5 sources. These results, while very preliminary, are encouraging, and are discussed with respect to their potential complementarity with traditional purely audio separation or enhancement techniques.
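As a rough illustration of the first claim (a source can be separated once some of its characteristics are supplied), the toy sketch below mixes two decorrelated Gaussian sources with a fixed matrix and recovers one of them from its frame-wise power profile alone. This is not the authors' algorithm: the power profile merely stands in for the spectral information that lip movements would help predict, and the function name `extract_with_power_profile`, the sinusoidal gains, and the mixing matrix are all assumptions made for the example.

```python
import numpy as np

def extract_with_power_profile(x, profile):
    """Recover one source from a linear mixture, given its power over time.

    x       : (channels, samples) mixture of decorrelated sources
    profile : (samples,) assumed power of the target source at each sample
    Hypothetical helper for illustration; uses second-order statistics only.
    """
    x = x - x.mean(axis=1, keepdims=True)
    # Whiten the mixture so that only an unknown rotation remains.
    d, e = np.linalg.eigh(np.cov(x))
    z = (e / np.sqrt(d)).T @ x
    # Under a zero-mean Gaussian model with variance `profile`, the most
    # probable unit-norm projection minimizes sum(y**2 / profile), i.e. it is
    # the eigenvector of this weighted covariance with the smallest eigenvalue.
    c = (z / profile) @ z.T / z.shape[1]
    _, vecs = np.linalg.eigh(c)  # eigenvalues are returned in ascending order
    return vecs[:, 0] @ z

# Toy data: two Gaussian sources whose power varies differently across frames.
rng = np.random.default_rng(0)
frames, frame_len = 100, 200
f = np.arange(frames)
amp1 = np.repeat(1.0 + 0.8 * np.sin(2 * np.pi * 3 * f / frames), frame_len)
amp2 = np.repeat(1.0 + 0.8 * np.sin(2 * np.pi * 7 * f / frames), frame_len)
s = np.vstack([amp1 * rng.standard_normal(amp1.size),
               amp2 * rng.standard_normal(amp2.size)])

a = np.array([[1.0, 0.6], [0.5, 1.0]])  # assumed stationary mixing matrix
x = a @ s

y = extract_with_power_profile(x, amp1 ** 2)
match = abs(np.corrcoef(y, s[0])[0, 1])    # similarity to the target source
leakage = abs(np.corrcoef(y, s[1])[0, 1])  # similarity to the interferer
```

The recovered signal is defined only up to sign and scale, which is why the comparison uses the absolute correlation. Both sources are Gaussian here, so the separation relies on the supplied profile rather than on independence or non-Gaussianity, in line with the assumptions stated in the abstract.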
