Article

Multi-dimensional, multi-functional and multi-level attention in YOLO for underwater object detection

NEURAL COMPUTING & APPLICATIONS (2023)

Journal

NEURAL COMPUTING & APPLICATIONS

Volume 35, Issue 27, Pages 19935-19960

Publisher

SPRINGER LONDON LTD

DOI: 10.1007/s00521-023-08781-w

Keywords

Underwater object detection; YOLO detector; Attention perception; Attention calibration; Synergistic advantage

Category

Computer Science, Artificial Intelligence

AI summary

Underwater object detection is essential for autonomous operation and ocean exploration of underwater robots. To address the challenges of poor imaging quality, harsh underwater environments, and concealed underwater targets, we propose a multi-dimensional, multi-functional, and multi-level attention module (mDFLAM). Our approach enhances the robustness, flexibility, and diversity of attention perception through strategies such as multi-dimensional information collection, capturing channel semantic information, and extracting intrinsic information under different receptive fields.

Abstract

Underwater object detection is a prerequisite for underwater robots to achieve autonomous operation and ocean exploration. However, poor imaging quality, harsh underwater environments and concealed underwater targets make underwater object detection considerably harder. To reduce underwater background interference and improve underwater object perception, we propose a multi-dimensional, multi-functional and multi-level attention module (mDFLAM). The multi-dimensional strategy first enhances the robustness of attention application by collecting valuable information in different target dimensions. The multi-functional strategy further improves the flexibility of attention calibration by capturing the importance of channel semantic information and the dependence of spatial location information. The multi-level strategy finally enriches the diversity of attention perception by extracting the intrinsic information under different receptive fields. In the pre-processing and post-processing stages, cross-splitting and cross-linking stimulate the synergistic calibration advantage of multi-dimensional and multi-functional attention by redistributing channel dimensions and restoring feature states. In the attention calibration stage, adaptive fusion stimulates the synergistic calibration advantage of multi-level attention by assigning learnable parameters. To meet the high-precision and real-time requirements of underwater object detection, we integrate the plug-and-play mDFLAM into YOLO detectors. The full-port embedding further strengthens the semantic information expression by improving the feature fusion quality between scales. In underwater detection tasks, ablation and comparison experiments demonstrate the rationality and effectiveness of our attention design. In other detection tasks, our work shows good robustness and generalization.
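
The abstract states that adaptive fusion combines the multi-level attention branches "by assigning learnable parameters" but does not give the exact form. As a rough illustration only, the sketch below assumes one common choice: a weighted sum whose weights are a softmax over one scalar parameter per branch. The names `softmax` and `adaptive_fusion`, the flat-list representation of an attention map, and the softmax normalization are all assumptions made for this example and are not taken from the paper.

```python
import math


def softmax(params):
    """Turn raw (learnable) scalars into positive weights that sum to 1."""
    peak = max(params)  # subtract the max for numerical stability
    exps = [math.exp(p - peak) for p in params]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fusion(branches, params):
    """Fuse same-length branch outputs with softmax-normalized weights.

    branches: list of flat lists, one per receptive-field level (hypothetical).
    params:   one raw scalar per branch; in a real model these would be
              trained by gradient descent (assumption, not from the paper).
    """
    if len(branches) != len(params):
        raise ValueError("need exactly one parameter per branch")
    length = len(branches[0])
    if any(len(b) != length for b in branches):
        raise ValueError("all branches must have the same length")
    weights = softmax(params)
    # Element-wise weighted sum across branches.
    return [
        sum(w * b[i] for w, b in zip(weights, branches))
        for i in range(length)
    ]
```

With equal parameters the fusion reduces to a plain average of the branches; as one parameter grows relative to the others, the fused output moves toward that branch. In a deep learning framework the parameters would typically be registered as trainable tensors and the branches would be feature maps rather than flat lists.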
