Deep Head Pose: Gaze-Direction Estimation in Multimodal Video

IEEE TRANSACTIONS ON MULTIMEDIA (2015)

Journal

IEEE TRANSACTIONS ON MULTIMEDIA

Volume 17, Issue 11, Pages 2094-2107

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TMM.2015.2482819

Keywords

Convolutional neural networks (CNNs); deep learning; gaze direction; head-pose; RGB-D

Funding

Engineering and Physical Sciences Research Council (EPSRC) [EP/K014277/1]
MOD University Defence Research Collaboration in Signal Processing
EPSRC [EP/K014277/1] Funding Source: UKRI
Engineering and Physical Sciences Research Council [EP/K014277/1] Funding Source: researchfish

Abstract

In this paper we present a convolutional neural network (CNN)-based model for human head-pose estimation in low-resolution multimodal RGB-D data. We pose the problem as one of classifying the human gaze direction. We then fine-tune a regressor based on the learned deep classifier, and combine the two models (classification and regression) to estimate an approximate regression confidence. We present state-of-the-art results on datasets that range from high-resolution human-robot interaction data (close-up faces plus depth information) to challenging low-resolution outdoor surveillance data. Building on this robust head-pose estimation, we introduce a new visual attention model to recover interaction with the environment. Using this probabilistic model, we show that many higher-level scene understanding tasks, such as detecting human-human and human-scene interaction, can be achieved. Our solution runs in real time on commercial hardware.
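The abstract says the classifier and the regressor are combined to estimate an approximate regression confidence, but it does not say how. The sketch below is one illustrative way such a combination could work and is not the authors' method: it assumes the classifier outputs logits over equally spaced gaze-direction bins covering 360 degrees, and it scores a regressed angle by the classifier probability mass in nearby bins. The function names, the bin layout, and the `tolerance_deg` parameter are all hypothetical.

```python
import math


def softmax(logits):
    """Convert raw classifier scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def angular_distance(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def regression_confidence(class_probs, regressed_angle, tolerance_deg=45.0):
    """Sum the classifier probability of bins near the regressed angle.

    Assumption (not from the paper): bin i is centred at i * 360 / n degrees.
    A high value means the classifier agrees with the regressor.
    """
    n = len(class_probs)
    bin_width = 360.0 / n
    confidence = 0.0
    for i, p in enumerate(class_probs):
        centre = i * bin_width
        if angular_distance(centre, regressed_angle) <= tolerance_deg:
            confidence += p
    return confidence


# Hypothetical example: 8 direction bins, classifier strongly favours bin 0.
probs = softmax([5.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
agree = regression_confidence(probs, 10.0)      # regressor near bin 0
disagree = regression_confidence(probs, 180.0)  # regressor opposite bin 0
print(agree, disagree)
```

With these inputs the first score is much higher than the second, because the regressed angle of 10 degrees falls next to the bin the classifier prefers while 180 degrees does not. A downstream system could use such a score to discard or down-weight regressed angles that the classifier contradicts.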
