Journal
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
Volume 32, Issue 7, Pages 4224-4237
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TCSVT.2021.3128275
Keywords
Image coding; Measurement; Transform coding; Image restoration; Distortion; Video recording; Quality assessment; Video perceptual quality enhancement; wavelet packet transform; generative adversarial network
Funding
- NSFC [62050175, 61922009, 61876013, 62001016]
- Beijing Natural Science Foundation [JQ20020]
This paper focuses on enhancing the perceptual quality of compressed video and proposes a novel generative adversarial network based on multi-level wavelet packet transform to exploit high-frequency details for enhancing video quality. Experimental results demonstrate the superiority of the proposed method in enhancing the perceptual quality of compressed video.
The great success of deep learning has driven rapid progress in video quality enhancement. However, existing methods mainly focus on enhancing the objective quality of compressed video and ignore its perceptual quality, which plays a key role in determining the quality of experience (QoE) of videos. In this paper, we aim to enhance the perceptual quality of compressed video. Our main observation is that perceptual quality enhancement mostly relies on recovering high-frequency details with fine textures. Accordingly, we propose a novel generative adversarial network (GAN) based on the multi-level wavelet packet transform (WPT), called multi-level wavelet-based GAN+ (MW-GAN+), to exploit high-frequency details for enhancing the perceptual quality of compressed video. In MW-GAN+, we first propose a multi-level wavelet pixel-adaptive (MWP) module to extract temporal information across video frames, such that frame similarity can be utilized in recovering high-frequency details. Then, a wavelet reconstruction network, consisting of wavelet-dense residual blocks (WDRB), is developed to recover high-frequency details in a multi-level manner for enhanced frame reconstruction. Finally, we develop a 3D discriminator with a 3D-CNN based architecture to encourage temporal coherence. Experimental results demonstrate the superiority of our method over state-of-the-art methods in enhancing the perceptual quality of compressed video. Our code is available at https://github.com/IceClear/MW-GAN.
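To make the multi-level wavelet packet transform mentioned in the abstract more concrete, the following is a minimal NumPy sketch of a two-level 2D wavelet packet decomposition and its inverse. It assumes orthonormal Haar filters and a single-channel frame with even dimensions; the abstract does not state which wavelet or tensor layout MW-GAN+ uses, so the function names (`haar_split`, `haar_merge`, `wavelet_packet`, `inverse_wavelet_packet`) and the subband ordering are hypothetical illustrations, not the authors' implementation.

```python
import numpy as np


def haar_split(x):
    """One level of a 2D orthonormal Haar transform.

    Returns four half-resolution subbands (ll, lh, hl, hh).
    Subband naming conventions vary between libraries; here
    ll is the low-pass average and the other three hold details.
    """
    a = x[0::2, 0::2]  # top-left sample of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0
    lh = (a - b + c - d) / 2.0  # differences between adjacent columns
    hl = (a + b - c - d) / 2.0  # differences between adjacent rows
    hh = (a - b - c + d) / 2.0  # diagonal differences
    return ll, lh, hl, hh


def haar_merge(ll, lh, hl, hh):
    """Exact inverse of haar_split."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=ll.dtype)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x


def wavelet_packet(x, levels):
    """Wavelet packet transform: every subband is split again at each level.

    A plain wavelet transform would recurse only on ll; the packet
    variant also subdivides the high-frequency subbands.
    """
    bands = [x]
    for _ in range(levels):
        bands = [sub for band in bands for sub in haar_split(band)]
    return bands  # 4**levels subbands


def inverse_wavelet_packet(bands, levels):
    """Merge consecutive groups of four subbands, one level at a time."""
    for _ in range(levels):
        bands = [haar_merge(*bands[i:i + 4]) for i in range(0, len(bands), 4)]
    return bands[0]


if __name__ == "__main__":
    frame = np.arange(64, dtype=float).reshape(8, 8)  # toy 8x8 "frame"
    bands = wavelet_packet(frame, levels=2)
    restored = inverse_wavelet_packet(bands, levels=2)
    print(len(bands), bands[0].shape, np.allclose(restored, frame))
```

Because the transform is invertible, a network operating on the subbands can in principle modify high-frequency coefficients and still reconstruct a full-resolution frame, which is the property the abstract's "multi-level" recovery of details relies on.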