4.7 Article

FiFoNet: Fine-Grained Target Focusing Network for Object Detection in UAV Images

Journal

REMOTE SENSING
Volume 14, Issue 16, Pages -

Publisher

MDPI
DOI: 10.3390/rs14163919

Keywords

object detection; Unmanned Aerial Vehicles; deep learning

Funding

  1. Fundamental Research Funds for the Central Universities [20101216855]
  2. Key R&D Projects of Qingdao Science and Technology Plan [21-1-2-18-xx]

In this paper, we propose a Fine-grained Target Focusing Network (FiFoNet) that effectively selects multi-scale features, blocks background interference, and enhances the representation of small objects. Furthermore, a Global-Local Context Collector (GLCC) is introduced to extract global and local contextual information for improving low-quality representations. Experimental results demonstrate the superior performance of FiFoNet in object detection for UAV images.
Detecting objects in images captured by Unmanned Aerial Vehicles (UAVs) is a highly demanding task, made challenging by typically cluttered backgrounds and the diverse sizes of foreground targets, especially small object areas that contain only very limited information. Multi-scale representation learning is a remarkable approach to recognizing small objects. However, this strategy ignores the combination of the sub-parts of an object and also suffers from background interference in the feature fusion process. To this end, we propose a Fine-grained Target Focusing Network (FiFoNet), which effectively selects a combination of multi-scale features for an object and blocks background interference, thereby further revitalizing the differentiability of the multi-scale feature representation. Furthermore, we propose a Global-Local Context Collector (GLCC) to extract global and local contextual information and enhance low-quality representations of small objects. We evaluate the proposed FiFoNet on the challenging task of object detection in UAV images. A comparison of the experimental results on three datasets, namely VisDrone2019, UAVDT, and our VisDrone_Foggy, demonstrates the effectiveness of FiFoNet, which outperforms ten baseline and state-of-the-art models with remarkable performance improvements. When deployed on an NVIDIA JETSON XAVIER NX edge device, FiFoNet takes only about 80 milliseconds to process a drone-captured image.
