Article

Objective Video Quality Assessment Combining Transfer Learning With CNN

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2020)

Journal

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS

Volume 31, Issue 8, Pages 2716-2730

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TNNLS.2018.2890310

Keywords

Measurement; Feature extraction; Spatiotemporal phenomena; Video recording; Quality assessment; Databases; Nonlinear distortion; Convolutional neural network (CNN); imbalanced data; transfer learning; video quality assessment (VQA)

Categories

Computer Science, Artificial Intelligence; Computer Science, Hardware & Architecture; Computer Science, Theory & Methods; Engineering, Electrical & Electronic

Funding

National Key Research and Development Program of China [2016QY01W0200]
National Natural Science Foundation of China [61432014, 61772402, 61876146, U1605252, 61671339]
National High-Level Talents Special Support Program of China [CS31117200001]

Abstract

Video quality assessment (VQA) is essential to the video compression technology used in video transmission and storage. However, the development of VQA methods is impeded by small-scale video quality databases with imbalanced samples and by low-level feature representations of distorted videos. In this paper, we propose a full-reference (FR) VQA metric that integrates transfer learning with a convolutional neural network (CNN). First, following the feature-based transfer learning framework, we transfer distorted images as the related domain, which enriches the set of distorted samples. Second, to extract high-level spatiotemporal features of the distorted videos, a six-layer CNN with acknowledged learning ability is pretrained on the common features of the distorted image blocks (IBs) and then fine-tuned on those of the distorted video blocks (VBs). Notably, the labels of the distorted IBs and VBs are predicted by classic FR metrics. Finally, in a pooling stage based on saliency maps and the entropy function, we obtain the quality scores of the distorted videos by weighting the block-level scores predicted by the trained CNN. In particular, we introduce preprocessing and postprocessing steps to reduce the impact of inaccurate labels predicted by the FR-VQA metric. Because the proposed framework relies on feature learning, we carry out two kinds of experimental schemes: iterative train-test procedures on a single database, and tests on one database after training on other databases. The experimental results demonstrate that the proposed method has high extensibility and is on a par with some state-of-the-art VQA metrics on two widely used VQA databases with various compression distortions.
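The pooling stage described in the abstract can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so the sketch below assumes that each block's weight is the product of its mean saliency and the Shannon entropy of its intensity histogram, and that the video score is the weighted mean of the block-level scores. The function names `block_entropy` and `pool_block_scores` are hypothetical and are not taken from the paper.

```python
import numpy as np


def block_entropy(block, bins=256):
    """Shannon entropy (in bits) of an 8-bit grayscale block's intensity histogram."""
    hist, _ = np.histogram(block, bins=bins, range=(0, 256))
    p = hist[hist > 0] / hist.sum()  # drop empty bins so log2 is defined
    return float(-(p * np.log2(p)).sum())


def pool_block_scores(scores, saliency, entropy, eps=1e-8):
    """Weighted mean of block-level scores.

    ASSUMPTION: the weight of a block is saliency * entropy; the paper's
    actual weighting may differ.
    """
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(saliency, dtype=float) * np.asarray(entropy, dtype=float)
    # eps guards against division by zero when every weight is zero
    return float((weights * scores).sum() / (weights.sum() + eps))


# Usage with made-up blocks and made-up CNN outputs
rng = np.random.default_rng(0)
blocks = [rng.integers(0, 256, size=(32, 32), dtype=np.uint8) for _ in range(4)]
cnn_scores = [0.8, 0.6, 0.9, 0.4]      # hypothetical block-level predictions
mean_saliency = [0.5, 0.1, 0.9, 0.2]   # hypothetical mean saliency per block
entropies = [block_entropy(b) for b in blocks]
video_score = pool_block_scores(cnn_scores, mean_saliency, entropies)
```

Under this assumed weighting, blocks that are both visually salient and rich in texture contribute most to the final score, while flat or non-salient blocks contribute little.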
