Journal
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021
Volume -, Issue -, Pages 3389-3397
Publisher
IEEE COMPUTER SOC
DOI: 10.1109/CVPRW53098.2021.00378
Funding
- SNSF Sinergia project 'SMILE II' [CRSII5 193686]
- European Union [101016982]
- EPSRC [EP/R03298X/1]
- Swiss National Science Foundation (SNF) [CRSII5_193686]
The paper introduces Skeletor, a novel transformer-based network that learns the distribution of 3D pose and motion in an unsupervised fashion to reduce inaccuracies and inconsistencies in skeletal estimation. Skeletor uses strong priors learned from 25 million frames to smooth and correct skeleton sequences, achieving improved performance on 3D human pose estimation.
Predicting 3D human pose from a single monoscopic video can be highly challenging due to factors such as low resolution, motion blur and occlusion, in addition to the fundamental ambiguity in estimating 3D from 2D. Approaches that directly regress the 3D pose from independent images can be particularly susceptible to these factors, resulting in jitter, noise and/or inconsistencies in skeletal estimation. Much of this can be overcome if the temporal evolution of the scene and skeleton is taken into account. However, rather than tracking body parts and trying to temporally smooth them, we propose a novel transformer-based network that can learn a distribution over both pose and motion in an unsupervised fashion. We call our approach Skeletor. Skeletor overcomes inaccuracies in detection and corrects partial or entire skeleton corruption. Skeletor uses strong priors learned from 25 million frames to correct skeleton sequences smoothly and consistently. Skeletor can achieve this because it implicitly learns the spatio-temporal context of human motion via a transformer-based neural network. Extensive experiments show that Skeletor achieves improved performance on 3D human pose estimation and further provides benefits for downstream tasks such as sign language translation.
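The abstract contrasts Skeletor with the simpler idea of temporally smoothing tracked body parts. As a rough illustration of that baseline only (this is not the Skeletor network, and nothing here comes from the paper's code), the sketch below applies a centered moving average to each joint of a pose sequence and measures frame-to-frame jitter before and after. The names `smooth_sequence`, `jitter`, and the `radius` parameter are hypothetical, and the noisy input is synthetic.

```python
from typing import List, Tuple

Joint = Tuple[float, ...]        # (x, y, z) coordinates of one joint
Pose = List[Joint]               # one skeleton: a list of joints
PoseSequence = List[Pose]        # one pose per video frame


def smooth_sequence(seq: PoseSequence, radius: int = 1) -> PoseSequence:
    """Centered moving average over time, applied to each joint independently.

    Illustrative baseline only; the window is truncated at the sequence ends.
    """
    n = len(seq)
    out: PoseSequence = []
    for t in range(n):
        lo, hi = max(0, t - radius), min(n, t + radius + 1)
        window = seq[lo:hi]
        pose: Pose = []
        for j in range(len(seq[t])):
            pose.append(tuple(
                sum(p[j][k] for p in window) / len(window) for k in range(3)
            ))
        out.append(pose)
    return out


def jitter(seq: PoseSequence) -> float:
    """Mean frame-to-frame Euclidean displacement over all joints."""
    total, count = 0.0, 0
    for prev, cur in zip(seq, seq[1:]):
        for a, b in zip(prev, cur):
            total += sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
            count += 1
    return total / count if count else 0.0


# Synthetic example: one stationary joint whose x estimate flips sign each frame,
# mimicking the jitter of independent per-frame regression.
noisy = [[(0.1 * (-1) ** t, 0.0, 0.0)] for t in range(6)]
smoothed = smooth_sequence(noisy, radius=1)
print(jitter(noisy), jitter(smoothed))
```

A fixed-window filter like this treats every joint independently and cannot repair a skeleton that is corrupted over many consecutive frames, which is the limitation the abstract's learned spatio-temporal prior is meant to address.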