Article

FVC: An End-to-End Framework Towards Deep Video Compression in Feature Space

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TPAMI.2022.3210652

Keywords

Image coding; Video compression; Encoding; Spatial resolution; Motion estimation; Motion compensation; Feature extraction; Deformable convolution; neural network; resolution-adaptive coding


In this work, a feature-space video coding framework (FVC) is proposed to perform all major operations in the feature space, including motion estimation, motion compression, motion compensation, and residual compression. The framework also includes two new modules, resolution-adaptive motion coding (RaMC) and resolution-adaptive residual coding (RaRC), for handling different types of motion and residual patterns at different spatial locations. Experimental results demonstrate that the proposed framework achieves state-of-the-art performance on benchmark datasets.
Deep video compression is attracting increasing attention from both the deep learning and video processing communities. Recent learning-based approaches follow the hybrid coding paradigm and perform pixel-space operations to reduce redundancy along both the spatial and temporal dimensions, which leads to inaccurate motion estimation or less effective motion compensation. In this work, we propose a feature-space video coding framework (FVC), which performs all major operations (i.e., motion estimation, motion compression, motion compensation and residual compression) in the feature space. Specifically, a new deformable compensation module, which consists of motion estimation, motion compression and motion compensation, is proposed for more effective motion compensation. In our deformable compensation module, we first perform motion estimation in the feature space to produce the motion information (i.e., the offset maps). Then the motion information is compressed by an auto-encoder style network. After that, we use the deformable convolution operation to generate the predicted feature for motion compensation. Finally, the residual information between the feature from the current frame and the predicted feature from the deformable compensation module is also compressed in the feature space. Motivated by conventional codecs, in which blocks of different sizes are used for motion estimation, we additionally propose two new modules called resolution-adaptive motion coding (RaMC) and resolution-adaptive residual coding (RaRC) to automatically cope with different types of motion and residual patterns at different spatial locations. Comprehensive experimental results demonstrate that our proposed framework achieves state-of-the-art performance on three benchmark datasets: HEVC, UVG and MCL-JCV.
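The coding loop described in the abstract (motion estimation in feature space, offset compression, motion compensation, residual compression) can be sketched at a toy scale. The sketch below is illustrative only and is not the paper's method: it replaces the learned feature extractor with raw arrays, the auto-encoder compressors with uniform quantization, and deformable convolution with a single global integer shift found by correlation search; all function names here are made up for the example.

```python
import numpy as np

def estimate_offset(ref_feat, cur_feat, search=2):
    """Toy motion estimation: find the single global integer shift that
    best correlates the reference feature with the current feature.
    (FVC instead learns dense per-position offset maps.)"""
    best, best_score = (0, 0), -np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            score = float((np.roll(ref_feat, (dy, dx), axis=(0, 1)) * cur_feat).sum())
            if score > best_score:
                best, best_score = (dy, dx), score
    return best

def compress(x, step=0.5):
    """Stand-in for the auto-encoder + entropy coding: uniform quantization."""
    return np.round(np.asarray(x, dtype=float) / step) * step

def compensate(ref_feat, offset):
    """Stand-in for deformable convolution: warp the reference feature
    by the decoded offset."""
    dy, dx = int(round(offset[0])), int(round(offset[1]))
    return np.roll(ref_feat, (dy, dx), axis=(0, 1))

rng = np.random.default_rng(0)
ref = rng.standard_normal((8, 8))          # "feature" of the reference frame
cur = np.roll(ref, (1, 1), axis=(0, 1))    # current frame: ref shifted by (1, 1)

offset_hat = compress(estimate_offset(ref, cur))   # "transmit" the motion
pred = compensate(ref, offset_hat)                 # motion compensation
residual_hat = compress(cur - pred)                # "transmit" the residual
recon = pred + residual_hat                        # decoded feature
```

In the actual framework the offsets are dense maps consumed by deformable convolution, both compressors are learned networks with entropy models, and the RaMC/RaRC modules additionally process motion and residuals at multiple spatial resolutions.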

