Article

DMVC: Decomposed Motion Modeling for Learned Video Compression

Journal

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TCSVT.2022.3233221

Keywords

Video coding; learned video compression; video forecasting; inter prediction; motion representation; motion compensation

This paper proposes a motion modeling approach that decomposes motion into two components: intrinsic motion and compensatory motion. The intrinsic motion captures the implicit spatiotemporal context in the historical sequence, while the compensatory motion serves as side information for structural refinement and texture enhancement. Through this decomposition, the method addresses the questions of motion representation, compensation, and coding in the learned video compression framework.
Inter prediction is the critical component of the hybrid coding framework for handling temporal redundancy. Most neural video coding methods follow the motion-compensation-based inter coding scheme, in which the motion vector (MV) plays the central role. In this paper, we propose an efficient motion modeling approach that decomposes motion into two components: the intrinsic motion and the compensatory motion. The intrinsic motion originates from the implicit spatiotemporal context hidden in the historical sequence, which can be captured free of bits. On top of it, the compensatory motion, conveyed as side information, provides structural refinement and texture enhancement. In particular, inter prediction is performed in the feature space as a progressive temporal transition conditioned on the decomposed motion. With this motion decomposition paradigm, we address the questions of motion representation, compensation, and coding in the learned video compression framework. After temporal prediction, the remaining pixel residue is signaled to obtain the reconstruction. Extensive experimental results demonstrate that the proposed method achieves state-of-the-art coding performance on par with other end-to-end coding methods and outperforms Versatile Video Coding (VVC) under the low-delay P (LDP) configuration in terms of the MS-SSIM metric.
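
The split between a bit-free prediction and a signaled correction can be illustrated with a toy sketch. This is not the paper's network: the abstract describes learned, feature-space prediction, whereas the example below works on explicit motion vectors, uses linear extrapolation as a stand-in for the intrinsic motion, and uses a uniform quantizer for the compensatory part. The function names and the `step` parameter are hypothetical and chosen only for illustration.

```python
import numpy as np

def intrinsic_motion(mv_prev2, mv_prev1):
    """Toy 'intrinsic motion': linear extrapolation from two past motion fields.

    Both inputs are already available at the decoder, so this costs no bits.
    (Assumption: the paper learns this from the historical sequence instead.)
    """
    return 2.0 * mv_prev1 - mv_prev2

def encode_compensatory(true_mv, intrinsic, step=0.5):
    """Toy 'compensatory motion': quantized correction sent as side information."""
    return np.round((true_mv - intrinsic) / step).astype(int)

def decode_motion(intrinsic, symbols, step=0.5):
    """Decoder side: history-based prediction plus the dequantized correction."""
    return intrinsic + symbols * step

# Usage: only `symbols` would need to be transmitted.
mv_prev2 = np.array([1.0, 2.0])
mv_prev1 = np.array([1.5, 2.0])
true_mv = np.array([2.4, 1.9])

pred = intrinsic_motion(mv_prev2, mv_prev1)
symbols = encode_compensatory(true_mv, pred)
decoded = decode_motion(pred, symbols)
```

The better the history-based prediction, the smaller the correction symbols become, which is the intuition behind spending bits only on what the past cannot explain.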
