Article

An Adaptive Attention Fusion Mechanism Convolutional Network for Object Detection in Remote Sensing Images

Journal

REMOTE SENSING
Volume 14, Issue 3, Pages: -

Publisher

MDPI
DOI: 10.3390/rs14030516

Keywords

image stitching; object detection; feature fusion; loss function

Funding

  1. National Natural Science Foundation of China [41971281, 41961053]
  2. Sichuan Science and Technology Program [2020JDTD0003]


Abstract

For remote sensing object detection, automatically fusing optimal feature information and overcoming sensitivity to multi-scale objects remain significant challenges for existing convolutional neural networks. Given this, we develop a convolutional network model with an adaptive attention fusion mechanism (AAFM). The model is built on the EfficientDet backbone network. First, according to the object-distribution characteristics of the datasets, a stitcher is applied so that a single image contains objects of various scales. This process effectively balances the proportion of multi-scale objects and handles their scale-variable properties. In addition, inspired by channel attention, a spatial attention model is introduced in the construction of the adaptive attention fusion mechanism. In this mechanism, semantic information from the different feature maps is obtained via convolution and different pooling operations. Then, the parallel spatial and channel attention outputs are fused in optimal proportions by fusion factors to obtain more representative feature information. Finally, the Complete Intersection over Union (CIoU) loss is used to make the predicted bounding box cover the ground truth more closely. Experimental results on the optical image dataset DIOR demonstrate that, compared with state-of-the-art detectors such as the Single Shot multibox Detector (SSD), You Only Look Once (YOLO) v4, and EfficientDet, the proposed model improves accuracy and has stronger robustness.
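The fusion step described in the abstract — parallel channel and spatial attention branches combined by fusion factors — can be illustrated with a minimal NumPy sketch. This is not the paper's AAFM: the attention maps here come from bare pooling plus a sigmoid rather than the convolution and pooling layers the authors use, and the fixed `w_channel`/`w_spatial` scalars stand in for the learned fusion factors.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fused_attention(x, w_channel=0.5, w_spatial=0.5):
    """Toy fusion of parallel channel and spatial attention over a (C, H, W) map."""
    # Channel attention: one weight per channel from global average pooling.
    chan_weights = sigmoid(x.mean(axis=(1, 2)))   # shape (C,)
    chan_branch = x * chan_weights[:, None, None]
    # Spatial attention: one weight per location from channel-wise pooling.
    spat_weights = sigmoid(x.mean(axis=0))        # shape (H, W)
    spat_branch = x * spat_weights[None, :, :]
    # Fuse the two parallel branches by the fusion factors.
    return w_channel * chan_branch + w_spatial * spat_branch
```

The fused output keeps the input's shape, so the mechanism can be dropped between feature-pyramid levels without changing downstream layer dimensions.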
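The CIoU loss mentioned in the abstract is a standard formulation (Zheng et al.) that augments IoU with a center-distance penalty and an aspect-ratio consistency term: CIoU = IoU − ρ²/c² − αv, with loss 1 − CIoU. A minimal sketch for axis-aligned `(x1, y1, x2, y2)` boxes, assuming non-degenerate boxes with positive area:

```python
import math

def ciou_loss(box_a, box_b):
    """CIoU loss between two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection-over-union of the two boxes.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union

    # Squared distance between box centers, normalized by the squared
    # diagonal of the smallest enclosing box.
    rho2 = ((ax1 + ax2 - bx1 - bx2) / 2) ** 2 + ((ay1 + ay2 - by1 - by2) / 2) ** 2
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 + (max(ay2, by2) - min(ay1, by1)) ** 2

    # Aspect-ratio consistency term and its trade-off weight alpha.
    v = (4 / math.pi ** 2) * (
        math.atan((ax2 - ax1) / (ay2 - ay1)) -
        math.atan((bx2 - bx1) / (by2 - by1))
    ) ** 2
    alpha = v / (1 - iou + v) if iou < 1 else 0.0

    return 1 - (iou - rho2 / c2 - alpha * v)
```

Unlike plain IoU loss, the distance and aspect-ratio penalties keep the gradient informative even for non-overlapping boxes, which is what lets the predicted box converge onto the ground truth more tightly.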
