Article

ShuffleNetv2-YOLOv3: a real-time recognition method of static sign language based on a lightweight network

Journal

SIGNAL IMAGE AND VIDEO PROCESSING
Volume 17, Issue 6, Pages 2721-2729

Publisher

SPRINGER LONDON LTD
DOI: 10.1007/s11760-023-02489-z

Keywords

YOLOv3; Convolutional neural network; Gesture recognition; Object detection; Network lightweight


Abstract

To better meet the communication needs of hearing-impaired people and the public, it is of great significance to recognize sign language quickly and accurately on embedded platforms and mobile terminals. YOLOv3, introduced by Joseph Redmon and Ali Farhadi in 2018, greatly improved detection speed over the original YOLO while maintaining considerable accuracy. However, YOLOv3 is still too heavy to deploy on mobile terminals. A static sign language recognition method based on the ShuffleNetv2-YOLOv3 lightweight model was therefore proposed. The model replaces the backbone network of YOLOv3 with ShuffleNetv2, which substantially improves recognition speed. Combined with the CIoU loss function, ShuffleNetv2-YOLOv3 maintains recognition accuracy while improving recognition speed. Recognition performance on a self-made sign language image set and a public database was evaluated using the F1 score and mAP value, and the ShuffleNetv2-YOLOv3 model was compared with the YOLOv3-tiny, SSD, Faster R-CNN, and YOLOv4-tiny models. The experimental results show that the proposed ShuffleNetv2-YOLOv3 model achieves a good balance between accuracy and speed of gesture detection while remaining lightweight. The F1 score and mAP value of the ShuffleNetv2-YOLOv3 model were 99.1% and 98.4%, respectively. Gesture detection speed on the GPU reaches 54 frames per second, which is better than the other models. A mobile terminal application of the proposed lightweight model was also evaluated: the minimum single-frame inference time on the CPU and GPU is 0.14 and 0.025 s per image, respectively, only 1/6.5 and 1/8.5 of the inference time of the original YOLOv3 model.
The ShuffleNetv2-YOLOv3 lightweight model supports quick, real-time recognition of similar static sign language gestures, laying a good foundation for real-time gesture recognition on embedded platforms and mobile terminals.
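The CIoU loss mentioned in the abstract augments the standard IoU term with a center-distance penalty and an aspect-ratio consistency term. The paper does not publish its implementation, so the following is only a minimal sketch of the standard CIoU formulation, assuming boxes are given in corner format (x1, y1, x2, y2); the function name and epsilon guard are illustrative choices, not taken from the paper.

```python
import math

def ciou_loss(box_p, box_g, eps=1e-9):
    """Sketch of the Complete-IoU loss: 1 - (IoU - rho^2/c^2 - alpha*v).

    box_p, box_g: predicted and ground-truth boxes as (x1, y1, x2, y2).
    """
    # Intersection area
    xi1, yi1 = max(box_p[0], box_g[0]), max(box_p[1], box_g[1])
    xi2, yi2 = min(box_p[2], box_g[2]), min(box_p[3], box_g[3])
    inter = max(0.0, xi2 - xi1) * max(0.0, yi2 - yi1)

    # Union area and IoU
    wp, hp = box_p[2] - box_p[0], box_p[3] - box_p[1]
    wg, hg = box_g[2] - box_g[0], box_g[3] - box_g[1]
    union = wp * hp + wg * hg - inter
    iou = inter / (union + eps)

    # Squared distance between box centers (rho^2)
    rho2 = (((box_p[0] + box_p[2]) - (box_g[0] + box_g[2])) ** 2
            + ((box_p[1] + box_p[3]) - (box_g[1] + box_g[3])) ** 2) / 4.0

    # Squared diagonal of the smallest enclosing box (c^2)
    cw = max(box_p[2], box_g[2]) - min(box_p[0], box_g[0])
    ch = max(box_p[3], box_g[3]) - min(box_p[1], box_g[1])
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term v and its trade-off weight alpha
    v = (4.0 / math.pi ** 2) * (
        math.atan(wg / (hg + eps)) - math.atan(wp / (hp + eps))) ** 2
    alpha = v / ((1.0 - iou) + v + eps)

    return 1.0 - (iou - rho2 / c2 - alpha * v)
```

Unlike a plain IoU loss, this formulation still produces a useful gradient signal when the predicted and ground-truth boxes do not overlap, which is one reason CIoU is favored for bounding-box regression in YOLO-style detectors.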

