Journal
MULTIMEDIA TOOLS AND APPLICATIONS
Volume 81, Issue 29, Pages 42849-42869
Publisher
SPRINGER
DOI: 10.1007/s11042-022-13485-9
Keywords
3D convolutional neural networks; Fine-tuning; Objective quality assessment; Pre-training; Stereoscopic video; Transfer learning
Funding
- Scientific and Technological Research Council of Turkey (TUBITAK) [118C301]
Abstract
Recently, Convolutional Neural Networks with 3D kernels (3D CNNs) have shown great superiority over 2D CNNs in video processing applications. In the field of Stereoscopic Video Quality Assessment (SVQA), 3D CNNs are utilized to extract spatio-temporal features from stereoscopic videos. Moreover, the emergence of large-scale video datasets such as Kinetics has made it possible to apply pre-trained 3D CNNs to other video-related fields. In this paper, we fine-tune 3D Residual Networks (3D ResNets) pre-trained on the Kinetics dataset to measure the quality of stereoscopic videos and propose a no-reference SVQA method. Our aim is twofold: first, we answer the question of whether 3D CNNs can serve as quality-aware feature extractors for stereoscopic videos; second, we explore which ResNet architecture is most appropriate for SVQA. Experimental results on two publicly available SVQA datasets, LFOVIAS3DPh2 and NAMA3DS1-COSPAD1, demonstrate the effectiveness of the proposed transfer learning-based method, which achieves an RMSE of 0.332 on the LFOVIAS3DPh2 dataset. The results also show that deeper 3D ResNet models extract more effective quality-aware features.