4.7 Article

Convolutional Neural Network-Based Block Up-Sampling for HEVC

出版社

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TCSVT.2018.2884203

关键词

Encoding; Image reconstruction; Image coding; Spatial resolution; High efficiency video coding; Convolutional neural network (CNN); down-sampling; high efficiency video coding (HEVC); up-sampling; video coding

资金

  1. National Program on Key Basic Research Projects (973 Program) [2015CB351803]
  2. Natural Science Foundation of China [61772483, 61390512, 61425026]

向作者/读者索取更多资源

Recently, convolutional neural network (CNN)-based methods have achieved remarkable progress in image and video super-resolution, which inspires research on down-/up-sampling-based image and video coding using CNN. Instead of hand-crafted filters for up-sampling, trained CNN models are believed to be more capable of improving image quality, thus leading to coding gain. However, previous studies either concentrated on intra-frame coding or performed down- and up-sampling of entire frame. In this paper, we introduce block-level down- and up-sampling into inter-frame coding with the help of CNN. Specifically, each block in the P or B frame can either be compressed at the original resolution or down-sampled and compressed at low resolution and then, up-sampled by the trained CNN models. Such block-level adaptivity is flexible to cope with the spatially variant texture and motion characteristics. We further investigate how to enhance the capability of CNN-based up-sampling by utilizing reference frames and study how to train the CNN models by using encoded video sequences. We implement the proposed scheme onto the high efficiency video coding (HEVC) reference software and perform a comprehensive set of experiments to evaluate our methods. The experimental results show that our scheme achieves superior performance to the HEVC anchor, especially at low bit rates, leading to an average 3.8, 2.6, and 3.5 BD-rate reduction on the HEVC common test sequences under random-access, low-delay B, and low-delay P configurations, respectively. When tested on high-definition and ultrahigh-definition sequences, the average BD-rate exceeds 5.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.7
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据