Article

Rethinking Image Deblurring via CNN-Transformer Multiscale Hybrid Architecture

Related references

Note: Only some of the references are listed.
Article Computer Science, Artificial Intelligence

A Survey on Vision Transformer

Kai Han et al.

Summary: The Transformer, a deep neural network based on the self-attention mechanism, was first used in natural language processing and is now gaining attention in computer vision tasks. Transformer-based models perform as well as or even better than convolutional and recurrent neural networks in various visual benchmarks. This paper reviews vision transformer models, categorizes them by task, and analyzes their advantages and disadvantages. The discussed categories include backbone network, high/mid-level vision, low-level vision, and video processing. Efficient methods for applying transformers in real device-based applications are also explored. The challenges and further research directions for vision transformers are discussed as well.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2023)

Article Computer Science, Artificial Intelligence

CDMC-Net: Context-Aware Image Deblurring Using a Multi-scale Cascaded Network

Qian Zhao et al.

Summary: In this paper, a novel context-aware multi-scale convolutional neural network (CDMC-Net) is proposed for image deblurring. The method progressively restores latent sharp images in two stages and introduces a cross-stage feature aggregation strategy to enhance information flow interaction. The key design of CDMC-Net is the use of a multi-input multi-output encoder-decoder at each stage, which reduces computational complexity. Additionally, a multi-strip feature extraction module is proposed to effectively capture long-range context information in different scenarios.

NEURAL PROCESSING LETTERS (2023)

Article Computer Science, Software Engineering

A two-stage network with wavelet transformation for single-image deraining

Hao Yang et al.

Summary: Image deraining is an important and challenging computer vision task. This study proposes a two-stage method to effectively remove rain streaks and reconstruct high-quality images. The method utilizes a structure-preserving network and a feature extraction module to enhance the details. Extensive experiments show excellent performance on synthetic and real-world datasets, and the method proves to be effective in downstream vision tasks.

VISUAL COMPUTER (2023)

Article Computer Science, Artificial Intelligence

Multi-Scale Hybrid Fusion Network for Single Image Deraining

Kui Jiang et al.

Summary: This study addresses the problem of generating rain-free images under complex rain conditions using deep learning models. It designs a multi-level pyramid structure, a non-local fusion module, an attention fusion module, and a residual learning branch to handle different challenges. The results demonstrate that the method achieves superior performance in generating rain-free images.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2023)

Article Computer Science, Artificial Intelligence

Defocus Image Deblurring Network With Defocus Map Estimation as Auxiliary Task

Haoyu Ma et al.

Summary: This paper proposes a network architecture called DID-ANet for single image defocus deblurring by using defocus map estimation as an auxiliary task. A large-scale dataset is also built for network training.

IEEE TRANSACTIONS ON IMAGE PROCESSING (2022)

Proceedings Paper Computer Science, Artificial Intelligence

Simple Baselines for Image Restoration

Liangyu Chen et al.

Summary: This paper proposes a simple and computationally efficient baseline method that outperforms state-of-the-art methods in image restoration. By eliminating the need for nonlinear activation functions, the proposed method achieves better results with lower computational costs. The method achieves state-of-the-art results on challenging benchmarks.

COMPUTER VISION, ECCV 2022, PT VII (2022)

Proceedings Paper Computer Science, Artificial Intelligence

Uformer: A General U-Shaped Transformer for Image Restoration

Zhendong Wang et al.

Summary: This paper introduces Uformer, a Transformer-based image restoration architecture with a hierarchical encoder-decoder network and novel designs including a locally-enhanced window Transformer block and a learnable multi-scale restoration modulator. Uformer demonstrates high capability for image restoration tasks and achieves superior performance in various experiments.

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) (2022)

Proceedings Paper Computer Science, Artificial Intelligence

Restormer: Efficient Transformer for High-Resolution Image Restoration

Syed Waqas Zamir et al.

Summary: Convolutional neural networks (CNNs) perform well at learning image priors, while Transformers capture long-range pixel interactions. However, the computational complexity of Transformers makes it challenging to apply them to high-resolution image restoration tasks. This work proposes an efficient Transformer model, Restormer, which achieves state-of-the-art results on various image restoration tasks.

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) (2022)

Article Engineering, Electrical & Electronic

HFMNet: Hierarchical Feature Mining Network for Low-Light Image Enhancement

Kai Xu et al.

Summary: This study addresses the issues of illumination and edge features in low-light images by proposing a hierarchical feature mining network that analyzes frequency distributions to extract crucial information. Extensive experiments and analysis show that it achieves state-of-the-art performance in terms of image quality.

IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT (2022)

Article Engineering, Electrical & Electronic

Toward Real-World Super-Resolution Technique for Fringe Projection Profilometry

Pengcheng Yao et al.

Summary: This article presents a real-world 2-D-to-3-D super-resolution technique for obtaining high-resolution 3-D shape from 2-D fringe images in fringe projection profilometry. The technique utilizes pixel-to-pixel mapping to align the images and generate a more accurate dataset.

IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT (2022)

Article Computer Science, Artificial Intelligence

Image Quality Assessment: Unifying Structure and Texture Similarity

Keyan Ding et al.

Summary: This paper presents a full-reference image quality model with explicit tolerance to texture resampling. By using a convolutional neural network, the authors construct an injective and differentiable function to transform images. The proposed method combines texture similarity and structure similarity to match human ratings of image quality and achieves competitive performance on texture classification and retrieval tasks.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2022)

Article Engineering, Electrical & Electronic

CT-Net: An Efficient Network for Low-Altitude Object Detection Based on Convolution and Transformer

Tao Ye et al.

Summary: In this article, a deep learning method called CT-Net is proposed for low-altitude small-object detection. It addresses the limitations of existing detection methods in accuracy, model size, and speed through the introduction of an attention-enhanced transformer block, a lightweight bottleneck module, and a directional feature fusion structure. Experimental results show that CT-Net outperforms other detectors on low-altitude small-object datasets and MS COCO.

IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT (2022)

Article Engineering, Electrical & Electronic

DU-GAN: Generative Adversarial Networks With Dual-Domain U-Net-Based Discriminators for Low-Dose CT Denoising

Zhizhong Huang et al.

Summary: This article introduces a novel method called DU-GAN, which utilizes U-Net-based discriminators in the GAN framework to learn both global and local differences between denoised and normal-dose CT images in both the image and gradient domains. By applying two different discriminators in the image and gradient domains, and using the CutMix technique to provide a confidence map, this method achieves superior results in terms of image quality and diagnostic performance for low-dose CT (LDCT).

IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT (2022)

Article Computer Science, Artificial Intelligence

Transformer for 3D Point Clouds

Jiayun Wang et al.

Summary: This study introduces a novel end-to-end approach to learn different non-rigid transformations of input point clouds for optimal local neighborhoods at each layer, achieving better feature extraction for 3D point clouds.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2022)

Article Computer Science, Artificial Intelligence

I²Transformer: Intra- and Inter-Relation Embedding Transformer for TV Show Captioning

Yunbin Tu et al.

Summary: TV show captioning plays an important role in generating linguistic sentences based on video and subtitles, with challenges such as scattered information and a semantic gap. The proposed I²Transformer model achieves state-of-the-art performance by capturing intra- and inter-relations, and shows good generalization ability in other relevant tasks.

IEEE TRANSACTIONS ON IMAGE PROCESSING (2022)

Article Computer Science, Artificial Intelligence

AutoML: A survey of the state-of-the-art

Xin He et al.

Summary: Deep learning techniques have achieved remarkable results in various tasks, but building a high-quality DL system requires human expertise. Automated machine learning is a promising solution that is currently being extensively researched.

KNOWLEDGE-BASED SYSTEMS (2021)

Article Computer Science, Artificial Intelligence

Deep Learning for Image Super-Resolution: A Survey

Zhihao Wang et al.

Summary: This article provides a comprehensive survey on recent advances of image super-resolution using deep learning approaches, categorizing existing studies into supervised, unsupervised, and domain-specific SR techniques, as well as covering benchmark datasets and evaluation metrics. Future directions and open issues in the field are also highlighted for further research.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2021)

Article Engineering, Electrical & Electronic

Deep Convolutional-Neural-Network-Based Channel Attention for Single Image Dynamic Scene Blind Deblurring

Shengdao Wan et al.

Summary: The paper proposes a novel multi-scale channel attention network (MSCAN) for effective single image dynamic scene blind deblurring (SIDSBD), incorporating a spatial pyramid pooling channel attention strategy for more powerful network representation. Extensive experiments show that the method outperforms state-of-the-art SIDSBD methods in both qualitative evaluation and quantitative metrics.

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (2021)

Proceedings Paper Computer Science, Artificial Intelligence

Pyramid Architecture Search for Real-Time Image Deblurring

Xiaobin Hu et al.

Summary: The study introduces a deblurring method called PyNAS that automatically designs hyper-parameters, using gradient-based and hierarchical search strategies to obtain a real-time deblurring algorithm with state-of-the-art performance on the GoPro and Video Deblurring datasets.

2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021) (2021)

Article Engineering, Electrical & Electronic

A Lightweight Mimic Convolutional Auto-Encoder for Denoising Retinal Optical Coherence Tomography Images

Mahnoosh Tajmirriahi et al.

Summary: The study implemented a lightweight convolutional auto-encoder (AE) network to mimic the latest OCT image denoising method. The performance of the network was evaluated on various test data sets using visual inspection and quantitative metrics, confirming its good performance in removing speckle noise in OCT scans. The proposed network demonstrated generality, computational efficiency, and device independence, making it suitable for real-time, mobile applications.

IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT (2021)

Article Engineering, Electrical & Electronic

ViT-P: Classification of Genitourinary Syndrome of Menopause From OCT Images Based on Vision Transformer Models

Haoran Wang et al.

Summary: This study introduces the vision transformer (ViT) to medical OCT images for the first time and proposes a deep learning-based approach for screening lesions of genitourinary syndrome of menopause (GSM). By building a GSM dataset and experimental model, it aims to address practical issues and improve classification accuracy in OCT images, reducing the workload of gynecologists.

IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT (2021)

Article Computer Science, Artificial Intelligence

Rain-Free and Residue Hand-in-Hand: A Progressive Coupled Network for Real-Time Image Deraining

Kui Jiang et al.

Summary: The paper introduces a progressive coupled network named PCNet, aiming to effectively separate raindrops and preserve rain-free details in images. By studying blending correlations and designing a novel coupled representation module, PCNet can efficiently remove rain streaks from images while keeping details. Experiments demonstrate that PCNet performs well in image deraining and joint vision tasks.

IEEE TRANSACTIONS ON IMAGE PROCESSING (2021)

Article Acoustics

CTNet: Conversational Transformer Network for Emotion Recognition

Zheng Lian et al.

Summary: The study proposes a multimodal learning framework for conversational emotion recognition, named conversational transformer network (CTNet). By modeling intra-modal and cross-modal interactions, capturing temporal information using lexical and acoustic features, and utilizing a bi-directional GRU component and speaker embeddings to model context-sensitive and speaker-sensitive dependencies, the experimental results demonstrate the effectiveness of the method. The approach shows a performance improvement of 2.1% to 6.2% on weighted average F1 over state-of-the-art strategies.

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING (2021)

Article Computer Science, Information Systems

Hard Pixel Mining for Depth Privileged Semantic Segmentation

Zhangxuan Gu et al.

Summary: This paper proposes a novel method for mining depth information for semantic segmentation, using the depth of training images to learn a more robust model and to perform hard pixel mining at multiple scales. The method achieves state-of-the-art results on three benchmark datasets.

IEEE TRANSACTIONS ON MULTIMEDIA (2021)

Article Computer Science, Artificial Intelligence

Object Detection in Videos by High Quality Object Linking

Peng Tang et al.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2020)

Article Engineering, Electrical & Electronic

Online Monitoring of Flotation Froth Bubble-Size Distributions via Multiscale Deblurring and Multistage Jumping Feature-Fused Full Convolutional Networks

Jinping Liu et al.

IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT (2020)

Article Computer Science, Artificial Intelligence

Dark and Bright Channel Prior Embedded Network for Dynamic Scene Deblurring

Jianrui Cai et al.

IEEE TRANSACTIONS ON IMAGE PROCESSING (2020)

Article Computer Science, Artificial Intelligence

Graph-Based Blind Image Deblurring From a Single Photograph

Yuanchao Bai et al.

IEEE TRANSACTIONS ON IMAGE PROCESSING (2019)

Article Computer Science, Artificial Intelligence

Deblurring Images via Dark Channel Prior

Jinshan Pan et al.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2018)

Article Computer Science, Artificial Intelligence

Deblurring Low-Light Images with Light Streaks

Zhe Hu et al.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2018)

Article Engineering, Electrical & Electronic

A Displacement Uncertainty Model for 2-D DIC Measurement Under Motion Blur Conditions

Alberto Lavatelli et al.

IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT (2017)

Proceedings Paper Computer Science, Artificial Intelligence

Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution

Wei-Sheng Lai et al.

30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017) (2017)

Proceedings Paper Computer Science, Artificial Intelligence

From Motion Blur to Motion Flow: a Deep Learning Solution for Removing Heterogeneous Motion Blur

Dong Gong et al.

30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017) (2017)

Article Computer Science, Artificial Intelligence

Image information and visual quality

HR Sheikh et al.

IEEE TRANSACTIONS ON IMAGE PROCESSING (2006)

Article Engineering, Electrical & Electronic

An image enhancement technique combining sharpening and noise reduction

F Russo

IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT (2002)