SwinWave-SR: Multi-scale lightweight underwater image super-resolution

INFORMATION FUSION (2024)

Journal

INFORMATION FUSION

Volume 103

Publisher

ELSEVIER

DOI: 10.1016/j.inffus.2023.102127

Keywords

Vision transformer (ViT); Wavelet transform; Super-resolution; Underwater Image Enhancement (UIE); Lightweight model

Automated Summary

The resource-limited nature of underwater vision equipment affects underwater robotics and ocean engineering tasks. Super-resolution methods, particularly using Vision Transformers (ViTs), have emerged to enhance low-resolution underwater images. However, ViTs face challenges in handling severe degradation in underwater imaging. In contrast, Multi-scale ViTs (MViTs) overcome these challenges by preserving long-range dependencies through evolving channel capacity. This study proposes a novel algorithm, SwinWave-SR, for efficient and accurate multi-scale super-resolution for underwater images.

Abstract

The resource-limited nature of underwater vision equipment yields poor, low-resolution imagery, which degrades downstream underwater robotics and ocean engineering tasks. Underwater Image Enhancement (UIE) methods, particularly Super-Resolution (SR), have emerged to tackle this challenge by restoring a low-resolution image to a high-quality counterpart. Vision Transformers (ViTs) have recently been employed for SR tasks thanks to their superior performance over mainstream convolutional neural networks. The success of ViTs is largely due to their self-attention mechanism; however, they may struggle with the severe and unpredictable degradation found in underwater imaging. In contrast, Multi-scale ViT (MViT) variants such as the Swin transformer have overcome that challenge by preserving long-range dependencies over multi-scale feature hierarchies through evolving channel capacity. MViTs tend to gain spatial efficiency through classical down-sampling, such as average pooling over keys/values, which inevitably loses high-frequency components. To address this shortcoming, we propose a novel algorithm, SwinWave-SR, for efficient and accurate multi-scale SR of underwater images. The proposed algorithm is based on the Swin transformer and includes a wavelet block that limits information loss by downsampling in an invertible fashion. Consequently, the key components are preserved to assist self-attention learning while its computational cost is reduced. To complement this, we explore a prominent compression regime, the Lottery Ticket Hypothesis (LTH), to discover a lightweight sub-network whose performance is competitive with the original model while reducing computational costs by up to 70.44%. Overall, SwinWave-SR improves peak signal-to-noise ratio (PSNR) by 0.95 dB to 2.23 dB compared to the state-of-the-art SwinIR while reducing the number of parameters by 29.56% and the calculation cost by 18.734%. Experimental results show that SwinWave-SR outperforms state-of-the-art SR methods on four benchmark underwater datasets and significantly improves PSNR and structural similarity index (SSIM).
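
The contrast the abstract draws between average pooling and invertible wavelet downsampling can be illustrated with a small sketch. This is not the authors' wavelet block: it assumes a single-level orthonormal Haar transform on a 2D array, and the function names (`haar_dwt2`, `haar_idwt2`, `avg_pool2`) are hypothetical helpers introduced only for illustration.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2D Haar transform of an (H, W) array with even H and W.

    Returns four (H/2, W/2) sub-bands: one low-frequency band (ll) and
    three detail bands (lh, hl, hh). Band naming conventions vary.
    """
    a = x[0::2, 0::2]  # top-left pixel of each 2x2 patch
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0
    lh = (a - b + c - d) / 2.0
    hl = (a + b - c - d) / 2.0
    hh = (a - b - c + d) / 2.0
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Exact inverse of haar_dwt2: rebuilds the (2H, 2W) array."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=ll.dtype)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x

def avg_pool2(x):
    """2x2 average pooling with stride 2 (keeps only low-frequency content)."""
    return (x[0::2, 0::2] + x[0::2, 1::2] + x[1::2, 0::2] + x[1::2, 1::2]) / 4.0

rng = np.random.default_rng(0)
image = rng.random((8, 8))

bands = haar_dwt2(image)
restored = haar_idwt2(*bands)
pooled = avg_pool2(image)

# The wavelet path halves spatial resolution yet reconstructs the input.
print("max reconstruction error:", np.abs(restored - image).max())
# Average pooling equals the low-frequency band up to a scale factor,
# so the three detail bands are exactly what pooling throws away.
print("pooling matches ll / 2:", np.allclose(pooled, bands[0] / 2.0))
```

Both paths shrink each spatial dimension by two, but only the wavelet path retains the detail bands needed to undo the downsampling, which is the property the abstract refers to as downsampling "in an invertible fashion".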
