Article

Geometry-Guided Dense Perspective Network for Speech-Driven Facial Animation

Journal

IEEE Transactions on Visualization and Computer Graphics

Publisher

IEEE Computer Society
DOI: 10.1109/TVCG.2021.3107669

Keywords

Three-dimensional displays; Facial animation; Solid modeling; Faces; Geometry; Correlation; Decoding; Speech-driven; 3D facial animation; geometry-guided; speaker-independent

Funding

  1. National Natural Science Foundation of China [62171317, 62122058, 61771339]

Abstract

This paper proposes a Geometry-guided Dense Perspective Network (GDPnet) for speaker-independent, realistic 3D facial animation. GDPnet uses a densely connected encoder to strengthen feature propagation and encourage the reuse of audio features, and integrates an attention mechanism into the decoder for adaptive feature recalibration. A non-linear face reconstruction representation guides the latent space, yielding more accurate, geometry-aware deformations and better generalization across subjects.
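
The paper does not publish code snippets here, but as an illustration only, the following is a minimal sketch of one common way such point-wise feature recalibration is implemented: a squeeze-and-excitation-style gate over decoder features. The module name, layer sizes, reduction ratio, and placement in the decoder are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class FeatureRecalibration(nn.Module):
    """Hypothetical SE-style gate: rescales each feature unit by a learned
    weight in (0, 1), modeling interdependencies between neuron units."""

    def __init__(self, dim: int, reduction: int = 4):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(dim, dim // reduction),   # squeeze to a bottleneck
            nn.ReLU(inplace=True),
            nn.Linear(dim // reduction, dim),   # expand back to full width
            nn.Sigmoid(),                       # per-unit weights in (0, 1)
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (batch, dim) decoder activations; output has the same shape.
        return features * self.gate(features)

# Toy usage with an illustrative 256-D decoder feature vector.
feat = torch.randn(8, 256)
out = FeatureRecalibration(256)(feat)
print(out.shape)  # torch.Size([8, 256])
```
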
Realistic speech-driven 3D facial animation is a challenging problem due to the complex relationship between speech and the face. In this paper, we propose a deep architecture, called Geometry-guided Dense Perspective Network (GDPnet), to achieve speaker-independent, realistic 3D facial animation. The encoder is designed with dense connections to strengthen feature propagation and encourage the reuse of audio features, and the decoder is integrated with an attention mechanism that adaptively recalibrates point-wise feature responses by explicitly modeling interdependencies between different neuron units. We also introduce a non-linear face reconstruction representation as guidance for the latent space to obtain more accurate deformations; this helps resolve geometry-related deformation and benefits generalization across subjects. Huber and HSIC (Hilbert-Schmidt Independence Criterion) constraints are adopted to promote the robustness of our model and to better exploit non-linear and high-order correlations. Experimental results on a public dataset and a real scanned dataset validate the superiority of the proposed GDPnet over state-of-the-art models. The code is available for research purposes at http://cic.tju.edu.cn/faculty/likun/projects/GDPnet.
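
The Huber and HSIC constraints mentioned above are standard components, so a small sketch may help make them concrete. Below is an illustrative PyTorch implementation of a Huber loss and the biased HSIC estimator with Gaussian kernels, tr(KHLH)/(n-1)^2. The kernel bandwidths, the 0.1 weight, the sign of the HSIC term, and the choice of which latent codes it couples are placeholders, not details taken from the paper.

```python
import torch

def huber_loss(pred, target, delta=1.0):
    """Huber loss: quadratic for residuals below delta, linear above it."""
    abs_err = (pred - target).abs()
    quad = torch.clamp(abs_err, max=delta)
    lin = abs_err - quad
    return (0.5 * quad ** 2 + delta * lin).mean()

def hsic(x, y, sigma_x=1.0, sigma_y=1.0):
    """Biased HSIC estimator with Gaussian kernels: tr(K H L H) / (n - 1)^2."""
    n = x.shape[0]
    kx = torch.exp(-torch.cdist(x, x) ** 2 / (2 * sigma_x ** 2))
    ky = torch.exp(-torch.cdist(y, y) ** 2 / (2 * sigma_y ** 2))
    h = torch.eye(n) - torch.full((n, n), 1.0 / n)  # centering matrix
    return torch.trace(kx @ h @ ky @ h) / (n - 1) ** 2

# Toy usage: a batch of predicted vs. ground-truth vertex offsets, plus two
# hypothetical 64-D latent codes whose statistical dependence HSIC measures.
torch.manual_seed(0)
pred, gt = torch.randn(8, 300), torch.randn(8, 300)
audio_code, geometry_code = torch.randn(8, 64), torch.randn(8, 64)
total = huber_loss(pred, gt) + 0.1 * hsic(audio_code, geometry_code)
print(float(total))
```
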
