Article

Multi-dimensional, multi-functional and multi-level attention in YOLO for underwater object detection

Journal

NEURAL COMPUTING & APPLICATIONS
Volume 35, Issue 27, Pages 19935-19960

Publisher

SPRINGER LONDON LTD
DOI: 10.1007/s00521-023-08781-w

Keywords

Underwater object detection; YOLO detector; Attention perception; Attention calibration; Synergistic advantage

Underwater object detection is essential for the autonomous operation of underwater robots and for ocean exploration. To address the challenges of poor imaging quality, harsh underwater environments, and concealed underwater targets, we propose a multi-dimensional, multi-functional, and multi-level attention module (mDFLAM). Our approach enhances the robustness, flexibility, and diversity of attention perception by collecting information across multiple target dimensions, capturing channel semantic information and spatial location dependence, and extracting intrinsic information under different receptive fields.
Underwater object detection is a prerequisite for underwater robots to achieve autonomous operation and ocean exploration. However, poor imaging quality, harsh underwater environments and concealed underwater targets greatly aggravate the difficulty of underwater object detection. In order to reduce underwater background interference and improve underwater object perception, we propose a multi-dimensional, multi-functional and multi-level attention module (mDFLAM). The multi-dimensional strategy first enhances the robustness of attention application by collecting valuable information in different target dimensions. The multi-functional strategy further improves the flexibility of attention calibration by capturing the importance of channel semantic information and the dependence of spatial location information. The multi-level strategy finally enriches the diversity of attention perception by extracting the intrinsic information under different receptive fields. In pre-processing and post-processing stages, cross-splitting and cross-linking stimulate the synergistic calibration advantage of multi-dimensional and multi-functional attention by redistributing channel dimensions and restoring feature states. In the attention calibration stage, adaptive fusion stimulates the synergistic calibration advantage of multi-level attention by assigning learnable parameters. In order to meet the high-precision and real-time requirements for underwater object detection, we integrate the plug-and-play mDFLAM into YOLO detectors. The full-port embedding further strengthens the semantic information expression by improving the feature fusion quality between scales. In underwater detection tasks, ablation and comparison experiments demonstrate the rationality and effectiveness of our attention design. In other detection tasks, our work shows good robustness and generalization.
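
The adaptive fusion step mentioned in the abstract (combining multi-level attention branches through learnable parameters) can be illustrated with a minimal sketch. The abstract does not specify how the parameters are normalized or applied, so the softmax normalization, the function names `softmax` and `adaptive_fusion`, and the flat-list feature representation below are all assumptions made for illustration, not the authors' implementation.

```python
import math


def softmax(params):
    """Turn raw learnable parameters into positive weights that sum to 1."""
    peak = max(params)  # subtract the maximum for numerical stability
    exps = [math.exp(p - peak) for p in params]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fusion(branches, params):
    """Fuse equally sized branch outputs as a weighted sum.

    branches: list of feature vectors, one per attention level (hypothetical
              stand-ins for outputs computed under different receptive fields).
    params:   one raw scalar per branch; in a real network these would be
              trained by backpropagation (assumption: softmax-normalized).
    """
    if len(branches) != len(params):
        raise ValueError("need exactly one parameter per branch")
    length = len(branches[0])
    if any(len(b) != length for b in branches):
        raise ValueError("all branches must have the same size")
    weights = softmax(params)
    return [
        sum(w * b[i] for w, b in zip(weights, branches))
        for i in range(length)
    ]


if __name__ == "__main__":
    # Two hypothetical branches with equal raw parameters receive equal weight.
    fused = adaptive_fusion([[1.0, 2.0], [3.0, 4.0]], [0.0, 0.0])
    print(fused)
```

With equal parameters the fusion reduces to a plain average of the branches; as training moves one parameter above the others, the corresponding branch dominates the fused output.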
