☆ 4.7 Article

Synthesizing Obama: Learning Lip Sync from Audio

ACM TRANSACTIONS ON GRAPHICS (2017)

Journal

ACM TRANSACTIONS ON GRAPHICS

Volume 36, Issue 4

Publisher

ASSOC COMPUTING MACHINERY

DOI: 10.1145/3072959.3073640

Keywords

Audio; Face Synthesis; LSTM; RNN; Big Data; Videos; Audiovisual Speech; Uncanny Valley; Lip Sync

Funding

Samsung
Google
Intel
University of Washington Animation Research Labs

Abstract

Given audio of President Barack Obama, we synthesize a high quality video of him speaking with accurate lip sync, composited into a target video clip. Trained on many hours of his weekly address footage, a recurrent neural network learns the mapping from raw audio features to mouth shapes. Given the mouth shape at each time instant, we synthesize high quality mouth texture, and composite it with proper 3D pose matching to change what he appears to be saying in a target video to match the input audio track. Our approach produces photorealistic results.
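The audio-to-mouth-shape step described in the abstract can be illustrated with a minimal sketch of a generic single-layer, unidirectional LSTM that maps a sequence of per-frame audio feature vectors to a sequence of mouth-shape vectors. This is not the authors' network: the weights are random and untrained, and the function names (`init_params`, `audio_to_mouth_shapes`) and all dimensions (28 audio features, 16 hidden units, 18 mouth-shape coefficients, 50 frames) are hypothetical placeholders chosen only for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_params(n_in, n_hidden, n_out, seed=0):
    """Random, untrained weights. A real system would learn these from data."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "n_hidden": n_hidden,
        "W": mat(4 * n_hidden, n_in + n_hidden),  # stacked gate weights
        "b": [0.0] * (4 * n_hidden),
        "W_out": mat(n_out, n_hidden),            # linear read-out layer
        "b_out": [0.0] * n_out,
    }


def audio_to_mouth_shapes(audio_feats, params):
    """Run the LSTM over the audio frames; return one mouth-shape vector per frame."""
    n = params["n_hidden"]
    h = [0.0] * n  # hidden state
    c = [0.0] * n  # cell state
    outputs = []
    for x_t in audio_feats:
        pre = matvec(params["W"], list(x_t) + h)
        z = [a + b for a, b in zip(pre, params["b"])]
        # Split the stacked pre-activations into input, forget, output, candidate.
        i, f, o, g = z[:n], z[n:2 * n], z[2 * n:3 * n], z[3 * n:]
        c = [sigmoid(fk) * ck + sigmoid(ik) * math.tanh(gk)
             for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [sigmoid(ok) * math.tanh(ck) for ok, ck in zip(o, c)]
        y = [a + b for a, b in zip(matvec(params["W_out"], h), params["b_out"])]
        outputs.append(y)
    return outputs


# Placeholder dimensions (assumptions, not values taken from this page).
N_FRAMES, N_AUDIO, N_HIDDEN, N_MOUTH = 50, 28, 16, 18

rng = random.Random(1)
feats = [[rng.gauss(0.0, 1.0) for _ in range(N_AUDIO)] for _ in range(N_FRAMES)]
params = init_params(N_AUDIO, N_HIDDEN, N_MOUTH)
shapes = audio_to_mouth_shapes(feats, params)
print(len(shapes), len(shapes[0]))  # 50 18
```

Because the recurrence carries state forward in time, each output frame in this sketch depends only on the current and earlier audio frames. The texture synthesis and 3D pose-matched compositing stages mentioned in the abstract are not covered here.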
