Article

3D Hand Pose Estimation via Graph-Based Reasoning

Journal

IEEE ACCESS
Volume 9, Pages 35824-35833

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2021.3061716

Keywords

Feature extraction; Cognition; Pose estimation; Thumb; Task analysis; Three-dimensional displays; Two dimensional displays; 3D hand pose estimation; depth image; graph convolutional network

Funding

  1. Ministry of Culture, Sports and Tourism
  2. Korea Creative Content Agency [R2020040058]
  3. MSIT(Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program [IITP-2021-2018-0-01421]

Abstract

Hand pose estimation from a single depth image has attracted significant attention recently due to its importance in various applications involving human-computer interaction. The proposed CNN-based approach incorporates hand joint connections into the learned features through both global and local relation inference, outperforming previous state-of-the-art methods on public datasets. The method also runs in real time, at 103 fps on a single GPU.
Hand pose estimation from a single depth image has recently received significant attention owing to its importance in many applications requiring human-computer interaction. The rapid progress of convolutional neural networks (CNNs) and technological advances in low-cost depth cameras have greatly improved the performance of hand pose estimation methods. Nevertheless, regressing joint coordinates remains a challenging task due to joint flexibility and self-occlusion. Previous hand pose estimation methods are limited by their reliance on deep, complex network structures that do not fully exploit hand joint connections. The hand is an articulated object consisting of six parts: the palm and five fingers. Kinematic constraints can be obtained by modeling the dependencies between adjacent joints. This paper proposes a novel CNN-based approach that incorporates hand joint connections into features through both global relation inference over the entire hand and local relation inference for each finger. Modeling the relations between hand joints alleviates critical problems caused by occlusion and self-similarity. We also present a hierarchical structure with six branches that independently estimate the positions of the palm and the five fingers, adding the connections of each joint through graph reasoning based on graph convolutional networks. Experimental results on public hand pose datasets show that the proposed method achieves the best accuracy compared with previous state-of-the-art methods. In addition, the proposed method can be used for real-time applications, with an execution speed of 103 fps on a single GPU.
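The abstract describes relation reasoning over the hand's kinematic structure with graph convolutional networks. The sketch below is a minimal illustration, not the authors' implementation: it shows one graph-convolution step over per-joint features using a hand-shaped adjacency matrix. The 21-joint layout (one palm/wrist joint plus five fingers of four joints each), the feature dimensions, and the use of PyTorch are assumptions for illustration; the paper's global/local relation inference and six-branch hierarchy are not reproduced here.

```python
import torch
import torch.nn as nn

NUM_JOINTS = 21  # assumed joint count: 1 palm/wrist joint + 5 fingers x 4 joints

def hand_adjacency(num_joints: int = NUM_JOINTS) -> torch.Tensor:
    """Row-normalized adjacency (with self-loops) for a kinematic hand graph."""
    A = torch.eye(num_joints)
    for f in range(5):                       # five fingers
        base = 1 + 4 * f
        A[0, base] = A[base, 0] = 1.0        # palm connected to each finger base
        for k in range(3):                   # chain along the finger
            A[base + k, base + k + 1] = 1.0
            A[base + k + 1, base + k] = 1.0
    return A / A.sum(dim=1, keepdim=True)

class JointGraphReasoning(nn.Module):
    """One graph-convolution step over joint features: X' = ReLU(A X W)."""
    def __init__(self, in_dim: int, out_dim: int, adj: torch.Tensor):
        super().__init__()
        self.register_buffer("adj", adj)
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_joints, in_dim) per-joint features, e.g. from a CNN backbone
        return torch.relu(self.adj @ self.linear(x))

if __name__ == "__main__":
    feats = torch.randn(2, NUM_JOINTS, 64)        # dummy per-joint features
    gcn = JointGraphReasoning(64, 64, hand_adjacency())
    refined = gcn(feats)                          # relation-aware joint features
    coords = nn.Linear(64, 3)(refined)            # regress 3D coordinates per joint
    print(coords.shape)                           # torch.Size([2, 21, 3])
```

In the paper's setting, such reasoning would be applied both globally over all joints and locally within each finger branch; the fixed, normalized adjacency used here is only a generic stand-in for those kinematic relations.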
