Article

Audio-visual enhancement of speech in noise

JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA (2001)

Journal

JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA

Volume 109, Issue 6, Pages 3007-3020

Publisher

ACOUSTICAL SOC AMER AMER INST PHYSICS

DOI: 10.1121/1.1358887

Categories

Acoustics; Audiology & Speech-Language Pathology

Abstract

A key problem for telecommunication or human-machine communication systems is speech enhancement in noise. A number of techniques exist in this domain, all of them based on an acoustic-only approach, that is, processing the corrupted audio signal using audio information (from the corrupted signal alone or from additional audio information). In this paper, an audio-visual approach to the problem is considered, since several studies have demonstrated that viewing the speaker's face improves message intelligibility, especially in noisy environments. A prototype speech enhancement system that takes advantage of visual inputs is developed. A filtering approach is proposed that uses enhancement filters estimated with the help of lip shape information. The estimation process is based on linear regression or simple neural networks using a training corpus. A set of experiments assessed by Gaussian classification and perceptual tests demonstrates that it is indeed possible to enhance simple stimuli (vowel-plosive-vowel sequences) embedded in white Gaussian noise. © 2001 Acoustical Society of America.
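
The abstract's idea of estimating enhancement filters from lip shape by linear regression can be illustrated with a minimal sketch. This is not the authors' implementation: the three lip parameters, the eight per-band gains, the synthetic training corpus, and the function names `estimate_gains` and `enhance` are all assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training corpus (synthetic, for illustration only):
# 3 lip-shape parameters per frame and 8 per-band filter gains per frame.
n_frames, n_lip, n_bands = 200, 3, 8
lip = rng.uniform(0.0, 1.0, size=(n_frames, n_lip))

# Assumed linear relation between lip parameters and gains, plus small noise.
true_W = rng.uniform(0.0, 1.0, size=(n_lip + 1, n_bands))
X = np.hstack([lip, np.ones((n_frames, 1))])  # append a bias column
gains = X @ true_W + 0.01 * rng.normal(size=(n_frames, n_bands))

# Linear regression (ordinary least squares) from lip parameters to gains.
W, *_ = np.linalg.lstsq(X, gains, rcond=None)


def estimate_gains(lip_frame):
    """Predict per-band filter gains from one frame of lip parameters."""
    return np.append(lip_frame, 1.0) @ W


def enhance(noisy_band_energies, lip_frame):
    """Scale each band of a noisy spectrum by its lip-estimated gain."""
    return estimate_gains(lip_frame) * noisy_band_energies
```

In this sketch the visual input alone determines the filter, so the estimate does not degrade as the acoustic noise level rises; how the actual system parameterizes its filters and lip shapes is described in the paper itself, not here.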
