☆ 4.7 Article

Lip reading with Hahn Convolutional Neural Networks

IMAGE AND VISION COMPUTING (2019)

Journal

IMAGE AND VISION COMPUTING

Volume 88, Issue -, Pages 76-83

Publisher

ELSEVIER

DOI: 10.1016/j.imavis.2019.04.010

Keywords

Visual speech recognition; Lipreading; Laryngectomy; Hahn moments; Convolutional Neural Networks

Categories

Computer Science, Artificial Intelligence; Computer Science, Software Engineering; Computer Science, Theory & Methods; Engineering, Electrical & Electronic; Optics

Abstract

Lipreading, or visual speech recognition, is the process of decoding speech from a speaker's mouth movements. It is used to assist people with hearing impairment, to understand patients with laryngeal cancer or vocal cord paralysis, and to recognize speech in noisy environments. In this paper we aim to develop a visual-only speech recognition system that relies solely on video. Our main target application is in the medical field, assisting laryngectomized persons. To that end, we propose the Hahn Convolutional Neural Network (HCNN), a novel architecture that uses Hahn moments as the first layer of a Convolutional Neural Network (CNN). We show that HCNN helps reduce the dimensionality of video images and shortens training time. The HCNN model is trained to classify letters, digits, or words given as video images. We evaluated the proposed method on three datasets, AVLetters, OuluVS2 and BBC LRW, and we show that it achieves significant results in comparison with other works in the literature. © 2019 Elsevier B.V. All rights reserved.
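
The abstract does not say how the Hahn moment layer is configured, so the following is only a minimal sketch of the general idea: projecting a frame onto a few low-order discrete orthogonal (Hahn) functions yields a small moment matrix that could stand in for the full image. The function names, the parameters `alpha` and `beta`, the moment `order`, and the numerical Gram-Schmidt construction of the basis are all assumptions made for illustration, not the authors' implementation.

```python
import math


def hahn_weight(x, n_points, alpha, beta):
    """Hahn weight C(alpha+x, x) * C(beta+N-x, N-x) on x = 0..N, N = n_points - 1."""
    big_n = n_points - 1
    log_w = (
        math.lgamma(alpha + x + 1) - math.lgamma(x + 1) - math.lgamma(alpha + 1)
        + math.lgamma(beta + big_n - x + 1) - math.lgamma(big_n - x + 1)
        - math.lgamma(beta + 1)
    )
    return math.exp(log_w)


def hahn_basis(n_points, order, alpha=0.0, beta=0.0):
    """Weighted orthonormal Hahn functions of degree 0..order-1 (assumed helper).

    Built numerically by Gram-Schmidt on monomials under the Hahn weight,
    which gives the normalized Hahn polynomials times sqrt(weight).
    """
    if order > n_points:
        raise ValueError("order cannot exceed the number of sample points")
    scale = max(n_points - 1, 1)  # rescale x to [0, 1] for better conditioning
    sqrt_w = [math.sqrt(hahn_weight(x, n_points, alpha, beta)) for x in range(n_points)]
    basis = []
    for degree in range(order):
        v = [sqrt_w[x] * (x / scale) ** degree for x in range(n_points)]
        for q in basis:  # remove components along lower-degree functions
            dot = sum(a * b for a, b in zip(q, v))
            v = [a - dot * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        basis.append([a / norm for a in v])
    return basis


def hahn_moments(image, order, alpha=0.0, beta=0.0):
    """Project a 2D image (list of rows) onto an order x order moment matrix."""
    rows, cols = len(image), len(image[0])
    by = hahn_basis(rows, order, alpha, beta)
    bx = hahn_basis(cols, order, alpha, beta)
    return [
        [
            sum(by[n][y] * bx[m][x] * image[y][x]
                for y in range(rows) for x in range(cols))
            for m in range(order)
        ]
        for n in range(order)
    ]


def reconstruct(moments, rows, cols, alpha=0.0, beta=0.0):
    """Approximate the image back from its moment matrix."""
    order = len(moments)
    by = hahn_basis(rows, order, alpha, beta)
    bx = hahn_basis(cols, order, alpha, beta)
    return [
        [
            sum(moments[n][m] * by[n][y] * bx[m][x]
                for n in range(order) for m in range(order))
            for x in range(cols)
        ]
        for y in range(rows)
    ]


if __name__ == "__main__":
    # Toy 8x8 "frame"; a real system would use cropped mouth-region frames.
    frame = [[float((3 * y + 5 * x) % 7) for x in range(8)] for y in range(8)]
    compact = hahn_moments(frame, order=4)
    print(len(compact), "x", len(compact[0]), "moments from an 8 x 8 frame")
```

In this sketch an 8 x 8 frame is summarized by a 4 x 4 moment matrix, which illustrates the kind of dimensionality reduction the abstract refers to; when `order` equals the image side length, the projection is lossless up to rounding.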
