Article

Double-Attention YOLO: Vision Transformer Model Based on Image Processing Technology in Complex Environment of Transmission Line Connection Fittings and Rust Detection

Journal

MACHINES
Volume 10, Issue 11, Article 1002

Publisher

MDPI
DOI: 10.3390/machines10111002

Keywords

transmission line connection fittings; multi-scale target detection; Vision Transformer; image defogging technology; attention mechanism; model compression and optimization

Funding

  1. Natural Science Basic Research Plan in Shaanxi Province of China [2022JQ-568]
  2. Scientific Research Program of Shaanxi Provincial Education Department [21JK0661]
  3. Key Research and Development Projects in Shaanxi Province [2021GY-306]
  4. Key R&D Plan of Shaanxi Province [2021GY-320, 2020ZDLGY09-10]

Abstract

Transmission line fittings are exposed to complex environments for long periods. Because of haze and other environmental interference, cameras often struggle to capture high-quality on-site images, and traditional image processing techniques and convolutional neural networks have difficulty with the dense detection of small, partially occluded targets. This paper therefore proposes an image processing method that combines an improved dark channel defogging algorithm, a fused channel-spatial attention mechanism, a Vision Transformer, and GhostNet model compression. By capturing salient regions and enhancing the model's global receptive field, a small-target detection network for complex environments, Double-Attention YOLO, is constructed. The experimental results show that embedding a multi-head self-attention component into a convolutional neural network helps the model interpret the multi-scale global semantic information of images, so that it more easily learns distinguishable features in the image representation. Embedding an attention mechanism module makes the network focus on the salient regions of an image. Fusing the two forms of attention balances the model's global and local characteristics, improving its detection performance.
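
The abstract does not spell out the paper's improvement to the dark channel algorithm, but the classic dark channel prior that it builds on can be sketched as follows. The function names, patch size, and brightest-0.1%-pixels heuristic are illustrative assumptions for the baseline method, not the authors' implementation.

```python
# Minimal sketch of dark channel prior defogging (baseline, not the
# paper's improved variant). Assumes an RGB uint8 input image.
import cv2
import numpy as np

def dark_channel(img: np.ndarray, patch: int = 15) -> np.ndarray:
    """Per-pixel minimum over color channels, then a min filter over a patch."""
    min_rgb = img.min(axis=2)
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (patch, patch))
    return cv2.erode(min_rgb, kernel)

def estimate_atmosphere(img: np.ndarray, dark: np.ndarray) -> np.ndarray:
    """Atmospheric light A: mean color of the brightest 0.1% dark-channel pixels."""
    n = max(1, int(dark.size * 0.001))
    idx = np.argpartition(dark.ravel(), -n)[-n:]
    return img.reshape(-1, 3)[idx].mean(axis=0)

def defog(img: np.ndarray, omega: float = 0.95, t0: float = 0.1) -> np.ndarray:
    """Recover scene radiance J from the haze model I = J*t + A*(1 - t)."""
    img = img.astype(np.float64) / 255.0
    A = estimate_atmosphere(img, dark_channel(img))
    t = 1.0 - omega * dark_channel(img / A)      # transmission estimate
    t = np.clip(t, t0, 1.0)[..., None]           # floor t to avoid blow-up
    J = (img - A) / t + A
    return (np.clip(J, 0.0, 1.0) * 255).astype(np.uint8)
```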
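The "fusion channel spatial attention mechanism" is read here as a CBAM-style module that gates features first per channel, then per spatial location. This is a plausible-reading sketch under that assumption, not the authors' exact design; the reduction ratio and kernel size are illustrative.

```python
# Sketch of sequential channel + spatial attention (CBAM-style).
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Channel attention: squeeze spatial dims, excite per channel.
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        # Spatial attention: 7x7 conv over pooled channel statistics.
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))       # (B, C) from average pooling
        mx = self.mlp(x.amax(dim=(2, 3)))        # (B, C) from max pooling
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)  # (B, 2, H, W)
        return x * torch.sigmoid(self.conv(s))
```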
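To illustrate how a multi-head self-attention component can be embedded into a convolutional backbone to give it a global receptive field, the sketch below flattens a feature map into a token sequence, applies standard multi-head self-attention, and folds the result back with a residual connection. The block name, head count, and placement inside the network are assumptions; the paper's exact integration into Double-Attention YOLO is not described in the abstract.

```python
# Sketch: multi-head self-attention over a CNN feature map.
import torch
import torch.nn as nn

class MHSABlock(nn.Module):
    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        seq = x.flatten(2).transpose(1, 2)       # (B, H*W, C) token sequence
        seq = self.norm(seq)
        out, _ = self.attn(seq, seq, seq)        # every token attends globally
        return x + out.transpose(1, 2).view(b, c, h, w)  # residual fold-back

# Hypothetical usage: drop the block after a late backbone stage, e.g.
# feats = MHSABlock(channels=256)(torch.randn(1, 256, 20, 20))
```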
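For the GhostNet compression step, the core idea is the ghost module: a thin regular convolution produces a few intrinsic feature maps, and cheap depthwise convolutions generate the remaining "ghost" maps, reducing parameters and FLOPs roughly by the ratio factor. The sketch follows the published GhostNet design; channel counts and the ratio are illustrative, not the authors' configuration.

```python
# Sketch of a GhostNet ghost module (intrinsic + cheap ghost features).
import torch
import torch.nn as nn

class GhostModule(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, ratio: int = 2):
        super().__init__()
        init_ch = out_ch // ratio                # intrinsic feature maps
        self.primary = nn.Sequential(            # regular 1x1 convolution
            nn.Conv2d(in_ch, init_ch, 1, bias=False),
            nn.BatchNorm2d(init_ch), nn.ReLU(inplace=True),
        )
        self.cheap = nn.Sequential(              # depthwise "ghost" features
            nn.Conv2d(init_ch, out_ch - init_ch, 3, padding=1,
                      groups=init_ch, bias=False),
            nn.BatchNorm2d(out_ch - init_ch), nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)
```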
