4.6 Article

HandGCNN model for gesture recognition based voice assistance

Journal

MULTIMEDIA TOOLS AND APPLICATIONS
Volume 81, Issue 29, Pages 42353-42369

Publisher

SPRINGER
DOI: 10.1007/s11042-022-13497-5

Keywords

Gesture recognition; Virtual voice; Sign language; Deep learning; Convolution neural network

In this research, hand gestures are captured and processed using deep learning models to assist individuals who cannot communicate through verbal language. A hand gesture dataset HandG is created using digital cameras and image augmentation, with a novel Convolutional Neural Network (CNN) model called HandGCNN achieving a high prediction accuracy of 99.13%. A real-time system is built using a webcam as the input receptor unit to recognize gestures and generate relevant audio for impaired individuals.
Communication plays an important role in today's world. Before the evolution of verbal communication, sign language was the only means of communication used by our ancestors. Verbal communication later evolved, and people in different regions began to speak different languages. However, some groups of people cannot express themselves through verbal language and instead use sign language to communicate. To bridge the gap between people who communicate with sign language and those who use verbal language, a system is designed that recognizes sign language gestures, interprets them, and converts them into verbal language. Various studies have captured the hand signs of speech-impaired people through sensors such as leap motion sensors and cameras. This research focuses on improving gesture capture through a camera and processing the captured images with deep learning models. A hand gesture dataset, HandG, is created that contains 20,600 images across 10 classes (2,060 images per category), collected with a digital camera and expanded through image augmentation. A novel Convolutional Neural Network (CNN) based model, termed HandGCNN, is proposed and achieves a high prediction accuracy of 99.13%. A real-time system is built with a webcam as the input receptor unit; it recognizes the gesture and generates the corresponding audio, which serves as voice assistance for speech-impaired people.
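
The abstract does not disclose the exact HandGCNN architecture, so the following is only a minimal sketch of a 10-class gesture CNN with light image augmentation, written in Keras under assumed layer sizes and input resolution; none of these settings are taken from the paper.

```python
# Hypothetical sketch of a compact CNN for a 10-class hand-gesture task
# such as HandG. Layer widths, input size, and augmentation choices are
# assumptions, not the published HandGCNN configuration.
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 10          # ten gesture categories, as in the HandG dataset
IMG_SIZE = (128, 128)     # assumed input resolution

def build_gesture_cnn():
    """Return a small CNN for static hand-gesture classification."""
    model = models.Sequential([
        layers.Input(shape=(*IMG_SIZE, 3)),
        # Light on-the-fly augmentation, analogous in spirit to the image
        # augmentation used to grow the dataset to 2,060 images per class.
        layers.RandomFlip("horizontal"),
        layers.RandomRotation(0.05),
        layers.Rescaling(1.0 / 255),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(128, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

if __name__ == "__main__":
    build_gesture_cnn().summary()
```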
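For the real-time webcam-to-voice stage, the paper names neither the capture nor the text-to-speech tooling. The sketch below assumes OpenCV for frame capture and pyttsx3 for speech, and assumes a trained model saved as handg_cnn.h5 with placeholder class names; it only illustrates the described pipeline of recognizing a gesture and speaking the result.

```python
# Hypothetical real-time loop: read webcam frames, classify each frame with a
# trained gesture model, and speak the predicted label. OpenCV, pyttsx3, the
# model file name, and the label list are all assumptions.
import cv2
import numpy as np
import pyttsx3
import tensorflow as tf

LABELS = [f"gesture_{i}" for i in range(10)]            # placeholder class names
model = tf.keras.models.load_model("handg_cnn.h5")      # assumed model file
engine = pyttsx3.init()                                  # offline text-to-speech
cap = cv2.VideoCapture(0)                                # webcam as input receptor

while True:
    ok, frame = cap.read()
    if not ok:
        break
    img = cv2.cvtColor(cv2.resize(frame, (128, 128)), cv2.COLOR_BGR2RGB)
    probs = model.predict(np.expand_dims(img, 0), verbose=0)[0]
    label = LABELS[int(np.argmax(probs))]
    cv2.putText(frame, label, (10, 30),
                cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
    cv2.imshow("HandG demo", frame)
    engine.say(label)          # voice assistance for the recognized gesture
    engine.runAndWait()
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```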
