Proceedings Paper

Bidirectional Matrix Feature Pyramid Network for Object Detection

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/ICPR48806.2021.9412229

Funding

  1. National Natural Science Foundation of China [61533012, 91748120, 52041502]

In this paper, the Bidirectional Matrix Feature Pyramid Network (BMFPN) is proposed to address the challenges in object detection caused by lack of scale invariance and sparse information propagation. BMFPN consists of three modules: Diagonal Layer Generation Module (DLGM), Top-down Module (TDM) and Bottom-up Module (BUM), which together construct a feature pyramid with varying scales and aspect ratios. Through bidirectional and reticular information flow, BMFPN effectively fuses multi-level information to improve object detection accuracy, achieving significant performance improvements on the MS COCO dataset.
Feature pyramids are widely used to improve scale invariance in object detection. Most methods simply map objects to feature maps with square receptive fields of a relevant size and rarely account for aspect ratio variation, which is also an important property of object instances. This leads to a poor match between rectangular objects and their assigned features with square receptive fields, which hinders accurate recognition and localization. In addition, information propagation among feature layers is sparse: each feature in the pyramid may contain mainly or only single-level information, which is not representative enough for the classification and localization sub-tasks. In this paper, the Bidirectional Matrix Feature Pyramid Network (BMFPN) is proposed to address these issues. It consists of three modules: the Diagonal Layer Generation Module (DLGM), the Top-down Module (TDM) and the Bottom-up Module (BUM). First, multi-level features extracted by the backbone are fed into the DLGM to produce the base features. These base features are then used to construct the final feature pyramid through the TDM and BUM in series. The receptive fields of the designed feature layers in BMFPN have various scales and aspect ratios, so each object can be assigned to an appropriate and representative feature map whose receptive field matches its scale and aspect ratio. Moreover, the TDM and BUM form a bidirectional and reticular information flow, which effectively fuses multi-level information in a top-down and a bottom-up manner, respectively. To evaluate the effectiveness of the proposed architecture, an end-to-end anchor-free detector is designed and trained by integrating BMFPN into FCOS. The center-ness branch in FCOS is also replaced with our Gaussian center-ness branch (GCB), which brings a further slight improvement.
Without bells and whistles, our method gains +3.3%, +2.4% and +2.6% AP on the MS COCO dataset over baselines with ResNet-50, ResNet-101 and ResNeXt-101 backbones, respectively.
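
The mismatch between rectangular objects and square receptive fields described in the abstract can be illustrated with a small numeric sketch. Everything below is an assumption made for illustration and is not taken from the paper: the grid of receptive fields (`matrix_fields`), the base size of 32, the three levels per axis, and the use of centered IoU as the matching score are all hypothetical, and the paper's actual assignment rule may differ.

```python
from itertools import product


def centered_iou(w1, h1, w2, h2):
    """IoU of two axis-aligned boxes that share the same center."""
    inter = min(w1, w2) * min(h1, h2)
    union = w1 * h1 + w2 * h2 - inter
    return inter / union


def matrix_fields(base=32, levels=3):
    """Hypothetical matrix of receptive fields.

    Entry (i, j) is base * 2**i wide and base * 2**j high, so the
    diagonal (i == j) holds the square fields of an ordinary pyramid.
    """
    return {
        (i, j): (base * 2 ** i, base * 2 ** j)
        for i, j in product(range(levels), repeat=2)
    }


def best_field(obj_w, obj_h, fields):
    """Pick the field whose shape overlaps the object best (assumed rule)."""
    return max(fields, key=lambda k: centered_iou(obj_w, obj_h, *fields[k]))


fields = matrix_fields()
square_only = {k: v for k, v in fields.items() if k[0] == k[1]}

wide_object = (128, 32)  # a 4:1 rectangular object
for name, candidates in (("square only", square_only), ("matrix", fields)):
    key = best_field(*wide_object, candidates)
    score = centered_iou(*wide_object, *candidates[key])
    print(f"{name}: field {key} size {candidates[key]} IoU {score:.2f}")
```

With square fields only, the best overlap for the 128 x 32 object is about one third, whereas the matrix of fields contains an exact match. This is only a toy view of why varying aspect ratios can help; it says nothing about how DLGM, TDM or BUM are implemented.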
