☆ 3.8 Proceedings Paper

Skeletor: Skeletal Transformers for Robust Body-Pose Estimation

2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021 (2021)

Journal

2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021

Volume -, Issue -, Pages 3389-3397

Publisher

IEEE COMPUTER SOC

DOI: 10.1109/CVPRW53098.2021.00378

Keywords

Funding

SNSF Sinergia project 'SMILE II' [CRSII5 193686]
European Union [101016982]
EPSRC [EP/R03298X/1]
Swiss National Science Foundation (SNF) [CRSII5_193686] Funding Source: Swiss National Science Foundation (SNF)

Ask authors/readers for more resources

Protocol

Community support

Reagent

Community support

Automated Summary New
Abstract

The paper introduces a novel transformer-based network, Skeletor, that can unsupervisedly learn the distribution of 3D pose and motion to reduce inaccuracies and inconsistencies in skeletal estimation. Skeletor uses strong priors learned from 25 million frames to smooth and correct skeleton sequences, achieving improved performance on 3D human pose estimation.

Predicting 3D human pose from a single monoscopic video can be highly challenging due to factors such as low resolution, motion blur and occlusion, in addition to the fundamental ambiguity in estimating 3D from 2D. Approaches that directly regress the 3D pose from independent images can be particularly susceptible to these factors and result in jitter, noise and/or inconsistencies in skeletal estimation. Much of which can be overcome if the temporal evolution of the scene and skeleton are taken into account. However, rather than tracking body parts and trying to temporally smooth them, we propose a novel transformer based network that can learn a distribution over both pose and motion in an unsupervised fashion. We call our approach Skeletor. Skeletor overcomes inaccuracies in detection and corrects partial or entire skeleton corruption. Skeletor uses strong priors learn from on 25 million frames to correct skeleton sequences smoothly and consistently. Skeletor can achieve this as it implicitly learns the spatio-temporal context of human motion via a transformer based neural network. Extensive experiments show that Skeletor achieves improved performance on 3D human pose estimation and further provides benefits for downstream tasks such as sign language translation.

Skeletor: Skeletal Transformers for Robust Body-Pose Estimation

Journal

2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021

Publisher

IEEE COMPUTER SOC

Keywords

Categories

Funding

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

Skeletor: Skeletal Transformers for Robust Body-Pose Estimation

Journal

2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021

Publisher

IEEE COMPUTER SOC

Keywords

Categories

Funding

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

Export Citation

Share Paper