Journal
ACM TRANSACTIONS ON GRAPHICS
Volume 36, Issue 4, Pages -
Publisher
ASSOC COMPUTING MACHINERY
DOI: 10.1145/3072959.3073640
Keywords
Audio; Face Synthesis; LSTM; RNN; Big Data; Videos; Audiovisual Speech; Uncanny Valley; Lip Sync
Funding
- Samsung
- Intel
- University of Washington Animation Research Labs
Abstract
Given audio of President Barack Obama, we synthesize a high quality video of him speaking with accurate lip sync, composited into a target video clip. Trained on many hours of his weekly address footage, a recurrent neural network learns the mapping from raw audio features to mouth shapes. Given the mouth shape at each time instant, we synthesize high quality mouth texture, and composite it with proper 3D pose matching to change what he appears to be saying in a target video to match the input audio track. Our approach produces photorealistic results.
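The audio-to-mouth-shape step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is an untrained, single-layer LSTM with a linear readout in plain Python, and the class name `TinyLSTM`, the feature sizes (13 audio features per frame, 4 mouth-shape coefficients), and the random weights are all assumptions chosen only to show the shape of the mapping from one audio frame per time step to one mouth-shape vector per time step.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class TinyLSTM:
    """Hypothetical single-layer LSTM with a linear readout (illustrative only)."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # One weight matrix per gate, applied to the concatenation [x_t, h_{t-1}].
        self.W = {g: rand_matrix(n_hidden, n_in + n_hidden, rng) for g in "ifoc"}
        self.b = {g: [0.0] * n_hidden for g in "ifoc"}
        self.W_out = rand_matrix(n_out, n_hidden, rng)
        self.n_hidden = n_hidden

    def forward(self, frames):
        """Map a sequence of audio feature vectors to mouth-shape vectors."""
        h = [0.0] * self.n_hidden  # hidden state
        c = [0.0] * self.n_hidden  # cell state
        outputs = []
        for x in frames:
            z = list(x) + h
            pre = {
                g: [a + b for a, b in zip(matvec(self.W[g], z), self.b[g])]
                for g in "ifoc"
            }
            i = [sigmoid(v) for v in pre["i"]]        # input gate
            f = [sigmoid(v) for v in pre["f"]]        # forget gate
            o = [sigmoid(v) for v in pre["o"]]        # output gate
            cand = [math.tanh(v) for v in pre["c"]]   # candidate cell update
            c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, cand)]
            h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
            # Linear readout: one mouth-shape coefficient vector per audio frame.
            outputs.append(matvec(self.W_out, h))
        return outputs


# Usage: five placeholder audio frames in, five mouth-shape vectors out.
model = TinyLSTM(n_in=13, n_hidden=8, n_out=4)
frames = [[0.1] * 13 for _ in range(5)]
mouth_shapes = model.forward(frames)
```

In a real system the weights would be learned from paired audio and video, and the predicted mouth shapes would then drive the texture synthesis and compositing stages that the abstract mentions; none of that is modeled here.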