Article

Manhattan-distance IOU loss for fast and accurate bounding box regression and object detection

Journal

NEUROCOMPUTING
Volume 500, Pages 99-114

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2022.05.052

Keywords

Manhattan-distance IOU loss; Bounding box regression; Object detection; Convolutional neural network

Funding

  1. National Key Research and Development Program of China [2020YFA0608501]
  2. Shenzhen Science and Technology Innovation Project [JSGG20191129145212206]
  3. National Natural Science Foundation of China [42071351]
  4. Western Talents [2018XBYJRC004]

This study proposes an MIOU loss function to improve bounding box regression. By using the Manhattan distance and a normalization coefficient, it addresses shortcomings of existing loss functions. Experimental results in remote sensing and natural object detection scenarios demonstrate the excellent performance of the MIOU loss.
Bounding box regression is a crucial step in most object detection algorithms and directly affects the positioning accuracy and regression speed of convolutional neural networks (CNNs). The loss functions commonly used in bounding box regression suffer from two main disadvantages. First, the ℓn-norm loss does not match the evaluation metric, Intersection over Union (IOU), leading to poor regression performance. Second, some recently proposed IOU-based loss functions benefit the IOU metric, but some of their terms negatively affect bounding box regression, leading to slow convergence and inaccurate regression results. To address these shortcomings, we propose a Manhattan-distance IOU (MIOU) loss function. The Euclidean distance term in the Complete IOU (CIOU) loss and the Efficient IOU (EIOU) loss is unstable in training because of the huge gradient in the early stage of regression; adding the Manhattan distance effectively alleviates this defect. In addition, the denominator of the Euclidean distance term in these two loss functions has an antagonistic effect on loss reduction; setting it as a normalization coefficient that does not participate in backpropagation effectively improves the convergence speed. The effectiveness of the proposed MIOU loss was verified with designed simulation experiments. Moreover, object detection is usually applied to natural scenes and remote sensing scenes, but the application of detection methods is often limited by the varied image characteristics of different scenes. We incorporated the MIOU loss into YOLO v4 and other mainstream object detection networks to examine its effectiveness in remote sensing and natural object detection scenarios. The experimental results on the real remote sensing dataset DOTA and the natural-image dataset MS COCO demonstrate that the MIOU loss is strongly robust in both remote sensing and natural object detection tasks. In summary, as a general regression loss function, the MIOU loss shows excellent performance in both types of scenes. (C) 2022 Elsevier B.V. All rights reserved.
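
The abstract does not give the MIOU formula, so the following is only a minimal sketch of the ideas it describes, not the authors' definition. It computes IOU, the CIOU/EIOU-style Euclidean centre-distance term (squared centre distance over the squared diagonal of the smallest enclosing box), and a hypothetical Manhattan analogue. The function names, the choice of enclosing width plus height as the normalizer, and the way the terms are combined in `sketch_loss` are all assumptions made for illustration.

```python
from typing import Tuple

# Axis-aligned box as (x1, y1, x2, y2), with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def center_offsets(pred: Box, target: Box) -> Tuple[float, float]:
    """Signed x and y offsets between the two box centres."""
    dx = (pred[0] + pred[2]) / 2 - (target[0] + target[2]) / 2
    dy = (pred[1] + pred[3]) / 2 - (target[1] + target[3]) / 2
    return dx, dy


def enclosing_size(pred: Box, target: Box) -> Tuple[float, float]:
    """Width and height of the smallest box enclosing both boxes."""
    cw = max(pred[2], target[2]) - min(pred[0], target[0])
    ch = max(pred[3], target[3]) - min(pred[1], target[1])
    return cw, ch


def euclidean_term(pred: Box, target: Box) -> float:
    """CIOU/EIOU-style penalty: squared centre distance / squared enclosing diagonal."""
    dx, dy = center_offsets(pred, target)
    cw, ch = enclosing_size(pred, target)
    return (dx * dx + dy * dy) / (cw * cw + ch * ch)


def manhattan_term(pred: Box, target: Box) -> float:
    """Hypothetical Manhattan analogue (an assumption, not the paper's formula).

    The denominator plays the role of a normalization coefficient. In an
    autodiff framework it would be excluded from backpropagation (for example
    by detaching it); plain Python has no gradients, so here it is just a number.
    """
    dx, dy = center_offsets(pred, target)
    cw, ch = enclosing_size(pred, target)
    return (abs(dx) + abs(dy)) / (cw + ch)


def sketch_loss(pred: Box, target: Box) -> float:
    """Illustrative combination: IOU loss plus the Manhattan penalty."""
    return 1.0 - iou(pred, target) + manhattan_term(pred, target)
```

For a predicted box `(0, 0, 2, 2)` and a target `(1, 1, 3, 3)`, the centres differ by one unit on each axis and the enclosing box is 3 by 3, so the Manhattan term depends linearly on the offsets while the Euclidean term depends on their squares.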
