☆ 4.7 Article

Rethinking Image Deblurring via CNN-Transformer Multiscale Hybrid Architecture

IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT (2023)

Journal

IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT

Volume 72, Issue -, Pages -

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TIM.2022.3230482

Keywords

Transformers; Image restoration; Convolutional neural networks; Convolution; Computer architecture; Kernel; Task analysis; Image deblurring; motion blur; multiscale strategy; neural networks; vision transformer (ViT)

Categories

Engineering, Electrical & Electronic; Instruments & Instrumentation

Smart summary

Image deblurring is a low-level vision task that aims to estimate sharp images from blurred images. Traditional CNN-based deblurring methods are limited in model performance and in capturing long-range dependencies. To address these issues, we propose a hybrid architecture called CTMS, which combines a CNN and a transformer. CTMS effectively handles large-area blur, adapts to input content, and reduces computational burden.

Abstract

Image deblurring is a representative low-level vision task that aims to estimate latent sharp images from blurred images. Recently, convolutional neural network (CNN)-based methods have dominated image deblurring. However, traditional CNN-based deblurring methods suffer from two essential issues. First, existing multiscale deblurring methods process blurred images at different scales through sub-networks with the same composition, which limits model performance. Second, convolutional layers do not adapt to the input content and cannot effectively capture long-range dependencies. To alleviate these issues, we rethink the multiscale architecture that follows a coarse-to-fine strategy and propose a novel hybrid architecture that combines a CNN and a transformer (CTMS). CTMS has three distinct features. First, the finer-scale sub-networks in CTMS are designed as architectures with larger receptive fields to obtain the pixel values around the blur, which can be used to efficiently handle large-area blur. Second, we propose a feature modulation network to alleviate the lack of input content adaptation in the CNN sub-networks. Third, we design an efficient transformer block, which significantly reduces the computational burden and requires no pre-training. Our proposed deblurring model is extensively evaluated on several benchmark datasets and achieves superior performance compared to state-of-the-art deblurring methods. In particular, the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) values are 32.73 dB and 0.959, respectively, on the popular GoPro dataset. In addition, we conduct joint evaluation experiments on deblurring performance, object detection, and image segmentation to demonstrate the effectiveness of CTMS for subsequent high-level computer vision tasks.
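
For readers unfamiliar with the PSNR figure quoted in the abstract, the following is a minimal sketch of the standard PSNR definition, 10 log10(MAX^2 / MSE). It is not the authors' evaluation code. The function names `psnr` and `mse_for_psnr`, the flat-list image representation, and the default pixel range of [0, 1] are all assumptions made for illustration; the paper's exact evaluation protocol (pixel range, color channels, cropping) is not stated here.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], estimate: Sequence[float],
         max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences.

    Assumption: pixels are flattened into a 1-D sequence and lie in [0, max_value].
    """
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the sharp reference and the restored estimate.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_for_psnr(psnr_db: float, max_value: float = 1.0) -> float:
    """Invert the PSNR formula to get the mean squared error a given PSNR implies."""
    return max_value ** 2 / (10.0 ** (psnr_db / 10.0))


# Toy example with four hypothetical pixel values (not data from the paper).
sharp = [0.0, 0.5, 1.0, 0.25]
restored = [0.1, 0.5, 0.9, 0.25]
print(f"PSNR: {psnr(sharp, restored):.2f} dB")
print(f"MSE implied by 32.73 dB: {mse_for_psnr(32.73):.2e}")
```

Under the [0, 1] range assumed above, a PSNR of 32.73 dB corresponds to a mean squared error of roughly 5.3e-4 per pixel. SSIM is a windowed structural measure and is not sketched here.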
