Article

Learning Orientation-Aware Distances for Oriented Object Detection

Journal

IEEE Transactions on Geoscience and Remote Sensing

Publisher

IEEE (Institute of Electrical and Electronics Engineers)
DOI: 10.1109/TGRS.2023.3278933

Keywords

Detectors; Object detection; Fourier series; Feature extraction; Training; Predictive models; Transformers; Fourier series transformation (FST); orientation-aware distance; oriented object detection; remote sensing images


In this work, the authors propose a novel approach to the discontinuous boundary problem in oriented object detection. They introduce a contour function that maps orientations to distance predictions and represent it as a truncated Fourier series, i.e., a linear combination of trigonometric basis functions. By replacing the final 4-D layer in the regression branch of FCOS with a Fourier series transformation module, their new network, FCOSF, outperforms other one-stage oriented object detectors.
Oriented object detectors have long suffered from the discontinuous boundary problem. In this work, we avoid this problem by relating regression outputs to regression target orientations. The core idea of our method is to build a contour function that takes orientations as input and outputs the corresponding distance predictions. Inspired by Fourier transformations, we assume this function can be represented as a linear combination of trigonometric functions, i.e., a Fourier series. We replace the final 4-D layer in the regression branch of the fully convolutional one-stage object detector (FCOS) with a Fourier series transformation (FST) module and term this new network FCOSF. With this design, the regression outputs of FCOSF adaptively vary with the regression target orientations, so the discontinuous boundary has no impact on FCOSF. More importantly, FCOSF avoids building complicated oriented box representations, which usually incur extra computation and ambiguity. With only flipping augmentation and single-scale training and testing, FCOSF with ResNet-50 achieves 73.64% mean average precision (mAP) on the DOTA-v1.0 dataset at up to 23.6 frames/s, surpassing all one-stage oriented object detectors. On the more challenging DOTA-v2.0 dataset, FCOSF also achieves the highest result among one-stage detectors, 51.75% mAP. Further experiments on DIOR-R and HRSC2016 verify the robustness of FCOSF. Code and models will be available at https://github.com/DDGRCF/FCOSF.
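The core idea above, predicting each box distance as a periodic function of the target orientation, can be sketched as a truncated Fourier series. The function below and its coefficient values are illustrative assumptions for this sketch, not the paper's actual FST-module parameterization; in FCOSF the coefficients would come from the regression branch rather than being hand-picked.

```python
import numpy as np

def fourier_distance(theta, a0, a, b):
    """Truncated Fourier series d(theta) = a0 + sum_k [a_k cos(k*theta) + b_k sin(k*theta)].

    theta : orientation in radians
    a0    : constant (mean distance) term
    a, b  : arrays of cosine/sine coefficients, one entry per harmonic k
    """
    k = np.arange(1, len(a) + 1)
    return a0 + np.sum(a * np.cos(k * theta) + b * np.sin(k * theta))

# Hypothetical coefficients standing in for FST-module outputs.
a0, a, b = 10.0, np.array([2.0, 0.5]), np.array([1.0, -0.3])

# Because d(theta) is periodic, orientations that differ by a full turn
# yield identical distance predictions, so there is no boundary jump.
d1 = fourier_distance(0.4, a0, a, b)
d2 = fourier_distance(0.4 + 2 * np.pi, a0, a, b)
```

The periodicity is what sidesteps the discontinuous boundary: angle-based box encodings jump in value when the orientation wraps around, whereas a trigonometric expansion varies smoothly and repeats exactly across the wrap-around point.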

