☆ 4.6 Article

Dynamic Sign Language Recognition Based on Video Sequence With BLSTM-3D Residual Networks

IEEE ACCESS (2019)

Journal

IEEE ACCESS

Volume 7, Issue -, Pages 38044-38054

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/ACCESS.2019.2904749

Keywords

Dynamic sign language recognition; bi-directional LSTM; residual ConvNet; video sequence

Categories

Computer Science, Information Systems; Engineering, Electrical & Electronic; Telecommunications

Funding

National Natural Science Foundation of China [61762061, 61663027]
Natural Science Foundation of Jiangxi Province, China [20161ACB20004, 20181BAB211019]

Abstract

Sign language recognition aims to recognize meaningful hand-gesture movements and is an important means of intelligent communication between the deaf community and hearing society. However, current dynamic sign language recognition methods still have drawbacks: difficulty recognizing complex hand gestures, low recognition accuracy for most dynamic sign language, and potential problems when training on larger video sequence data. To address these issues, this paper presents a multimodal dynamic sign language recognition method based on a deep 3-dimensional residual ConvNet and bi-directional LSTM networks, named the BLSTM-3D residual network (B3D ResNet). The method consists of three main parts. First, the hand is localized in the video frames to reduce the time and space complexity of network computation. Then, the B3D ResNet automatically extracts spatiotemporal features from the video sequences and, after feature analysis, establishes an intermediate score for each action in the video sequence. Finally, the video sequences are classified so that the dynamic sign language is accurately identified. Experiments are conducted on test datasets, including the DEVISIGN_D dataset and SLR_Dataset. The results show that the proposed method obtains state-of-the-art recognition accuracy (89.8% on the DEVISIGN_D dataset and 86.9% on SLR_Dataset). In addition, the B3D ResNet can effectively recognize complex hand gestures from larger video sequence data and obtains high recognition accuracy on a vocabulary of 500 Chinese sign language words.
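The last two steps of the abstract (an intermediate score per action, then a video-level classification) can be illustrated with a small sketch. The abstract does not say how the intermediate scores are combined, so the code below assumes one simple choice: convert each clip's raw scores to probabilities with a softmax, average them over the clips, and pick the most probable label. The function names `softmax` and `classify_video`, the example labels, and the score values are all hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_video(
    clip_scores: Sequence[Sequence[float]],
    labels: Sequence[str],
) -> Tuple[str, List[float]]:
    """Average per-clip probabilities and return the best label.

    Averaging is an assumption made for this sketch; the paper may
    combine its intermediate scores differently.
    """
    if not clip_scores:
        raise ValueError("need at least one clip")
    if any(len(c) != len(labels) for c in clip_scores):
        raise ValueError("each clip needs one score per label")

    probs = [softmax(c) for c in clip_scores]
    mean = [
        sum(p[i] for p in probs) / len(probs)
        for i in range(len(labels))
    ]
    best = max(range(len(labels)), key=mean.__getitem__)
    return labels[best], mean


# Made-up scores for three clips of one video and three sign classes.
example_labels = ["hello", "thanks", "sorry"]
example_scores = [
    [2.0, 0.5, 0.1],
    [1.5, 0.2, 0.3],
    [0.1, 0.4, 0.2],
]
predicted, mean_probs = classify_video(example_scores, example_labels)
print(predicted)
```

Averaging probabilities rather than raw scores keeps one clip with unusually large scores from dominating the video-level decision, at the cost of discarding how confident each clip's network output was in absolute terms.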
