Multi-dimensional, multi-functional and multi-level attention in YOLO for underwater object detection

NEURAL COMPUTING & APPLICATIONS (2023)

Journal

NEURAL COMPUTING & APPLICATIONS

Volume 35, Issue 27, Pages 19935-19960

Publisher

SPRINGER LONDON LTD

DOI: 10.1007/s00521-023-08781-w

Keywords

Underwater object detection; YOLO detector; Attention perception; Attention calibration; Synergistic advantage

Abstract

Underwater object detection is a prerequisite for autonomous operation and ocean exploration by underwater robots. However, poor imaging quality, harsh underwater environments, and concealed targets make detection considerably harder. To reduce background interference and improve object perception, we propose a multi-dimensional, multi-functional, and multi-level attention module (mDFLAM). The multi-dimensional strategy makes attention application more robust by collecting valuable information along different target dimensions. The multi-functional strategy makes attention calibration more flexible by capturing both the importance of channel semantic information and the dependence of spatial location information. The multi-level strategy enriches the diversity of attention perception by extracting intrinsic information under different receptive fields. In the pre-processing and post-processing stages, cross-splitting and cross-linking redistribute channel dimensions and restore feature states, which brings out the synergistic calibration advantage of multi-dimensional and multi-functional attention. In the attention calibration stage, adaptive fusion assigns learnable parameters, which brings out the synergistic calibration advantage of multi-level attention. To meet the high-precision and real-time requirements of underwater object detection, we integrate the plug-and-play mDFLAM into YOLO detectors. Full-port embedding further strengthens the expression of semantic information by improving the quality of feature fusion between scales. On underwater detection tasks, ablation and comparison experiments demonstrate the rationality and effectiveness of our attention design. On other detection tasks, our work shows good robustness and generalization.
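The abstract says adaptive fusion combines the multi-level attention branches by assigning learnable parameters, but it does not give the formula. The sketch below shows one common way this could be done, and it is an assumption, not the authors' implementation: each branch gets one scalar, the scalars are normalized with a softmax, and the branch outputs are summed with those weights. The names `softmax`, `adaptive_fusion`, `branches`, and `logits` are hypothetical, and plain Python lists stand in for feature maps.

```python
import math


def softmax(logits):
    """Turn unconstrained scalars into positive weights that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fusion(branches, logits):
    """Weighted sum of equally shaped branch outputs.

    branches: list of flat feature lists, one per receptive-field level.
    logits:   one scalar per branch (hypothetical learnable parameters;
              in training they would be updated by gradient descent).
    """
    if len(branches) != len(logits):
        raise ValueError("need exactly one logit per branch")
    length = len(branches[0])
    if any(len(b) != length for b in branches):
        raise ValueError("all branches must have the same length")
    weights = softmax(logits)
    return [
        sum(w * b[i] for w, b in zip(weights, branches))
        for i in range(length)
    ]


# Three hypothetical attention maps, e.g. from small, medium, and large
# receptive fields, flattened to two values each for illustration.
branches = [[0.2, 0.8], [0.4, 0.6], [0.9, 0.1]]

# Equal logits give equal weights, so the result is the plain average.
fused = adaptive_fusion(branches, [0.0, 0.0, 0.0])
print(fused)
```

With equal logits the fusion reduces to an average of the branches. As one logit grows relative to the others, the output moves toward that branch, which is how such a scheme would let training decide which receptive field matters most.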
