☆ 4.7 Article

Learning dynamic audio-visual mapping with input-output hidden Markov models

IEEE TRANSACTIONS ON MULTIMEDIA (2006)

Journal

IEEE TRANSACTIONS ON MULTIMEDIA

Volume 8, Issue 3, Pages 542-549

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TMM.2006.870732

Keywords

animation; audio-visual mapping; HMM; IOHMM; learning

Ask authors/readers for more resources

Protocol

Community support

Reagent

Community support

Abstract

In this paper, we formulate the problem of synthesizing facial animation from an input audio sequence as a dynamic audio-visual mapping. We propose that audio-visual mapping should be modeled with an input-output hidden Markov model, or IOHMM. An IOHMM is an HMM for which the output and transition probabilities are conditional on the input sequence. We train IOHMMs using the expectation-maximization (EM) algorithm with a novel architecture to explicitly model the relationship between transition probabilities and the input using neural networks. Given an input sequence, the output sequence is synthesized by the maximum likelihood estimation. Experimental results demonstrate that IOHMMs can generate natural and good-quality facial animation sequences from the input audio.

Learning dynamic audio-visual mapping with input-output hidden Markov models

Journal

IEEE TRANSACTIONS ON MULTIMEDIA

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

Keywords

Categories

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

Learning dynamic audio-visual mapping with input-output hidden Markov models

Journal

IEEE TRANSACTIONS ON MULTIMEDIA

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

Keywords

Categories

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

Export Citation

Share Paper