4.6 Article

Multiple hypotheses based motion compensation for learned video compression

Journal

NEUROCOMPUTING
Volume 548, Issue -, Pages -

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2023.126396

Keywords

Learned video compression; Multiple hypotheses; Motion compensation; Motion estimation; Temporal alignment


Recently, there has been significant research attention on learned video compression. However, existing methods use a single hypothesis for motion alignment, leading to inaccurate motion estimation, especially for complex scenes. Inspired by the multiple hypotheses philosophy, we propose a multiple hypotheses based motion compensation approach to enhance efficiency by providing diverse hypotheses. We also introduce a hypotheses attention module and utilize context combination to fuse weighted hypotheses and generate effective contexts for compression.
Recently, learned video compression has attracted considerable research attention. However, in existing methods the motion used for alignment is limited to a single hypothesis, leading to inaccurate motion estimation, especially for complicated scenes with complex movements. Motivated by the multiple hypotheses philosophy in traditional video compression, we develop multiple hypotheses based motion compensation for learned video compression, in an effort to enhance motion compensation efficiency by providing diverse hypotheses with efficient temporal information fusion. In particular, a multiple hypotheses module, which produces multiple motions and warped features to mine sufficient temporal information, is proposed to provide various hypothesis inferences from the reference frame. To utilize these hypotheses more fully, a hypotheses attention module is adopted, introducing a channel-wise squeeze-and-excitation layer and a multi-scale network. In addition, context combination is employed to fuse the weighted hypotheses and generate effective contexts with powerful temporal priors. Finally, the valid contexts are used to promote compression efficiency by merging the weighted warped features. Extensive experiments show that the proposed method can significantly improve the rate-distortion performance of learned video compression. Compared with the state-of-the-art method for end-to-end video compression, bit rate reductions of over 13% on average in terms of PSNR and MS-SSIM can be achieved. © 2023 Elsevier B.V. All rights reserved.
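
The abstract describes weighting several warped-feature hypotheses per channel and merging them into one context. As a rough illustration only, the sketch below fuses hypotheses with a channel-wise weighting. It is not the paper's network: the learned squeeze-and-excitation layers and the multi-scale network are replaced by an assumed, untrained stand-in (a softmax over each channel's global average), and the names `squeeze`, `hypothesis_weights`, and `fuse_hypotheses` are hypothetical.

```python
import math


def squeeze(feature):
    """Global average per channel; a feature is a list of channels (flat lists of floats)."""
    return [sum(channel) / len(channel) for channel in feature]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def hypothesis_weights(hypotheses):
    """Per-channel weights over the hypotheses.

    Assumption: a real attention module would learn this mapping; here the
    channel means are simply passed through a softmax across hypotheses.
    """
    descriptors = [squeeze(h) for h in hypotheses]
    num_channels = len(hypotheses[0])
    return [softmax([d[c] for d in descriptors]) for c in range(num_channels)]


def fuse_hypotheses(hypotheses):
    """Weighted sum of the hypotheses, channel by channel, giving one fused context."""
    weights = hypothesis_weights(hypotheses)
    fused = []
    for c, channel_weights in enumerate(weights):
        size = len(hypotheses[0][c])
        fused.append([
            sum(w * h[c][i] for w, h in zip(channel_weights, hypotheses))
            for i in range(size)
        ])
    return fused


# Toy input: two hypotheses, each with two channels of two values.
hyp_a = [[0.0, 0.0], [1.0, 3.0]]
hyp_b = [[2.0, 2.0], [1.0, 3.0]]
context = fuse_hypotheses([hyp_a, hyp_b])
```

In this toy input the second channel is identical in both hypotheses, so the fused channel is unchanged, while the first channel is pulled toward the hypothesis with the larger mean response. In an actual codec the hypotheses would come from motion-compensated warping of reference features, and the weights would be produced by trained layers.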
