Article

M3DGAF: Monocular 3D Object Detection With Geometric Appearance Awareness and Feature Fusion

Journal

IEEE Sensors Journal
Volume 23, Issue 11, Pages 11232-11240

Publisher

IEEE - Institute of Electrical and Electronics Engineers Inc.
DOI: 10.1109/JSEN.2022.3189174

Keywords

Three-dimensional displays; Estimation; Object detection; Task analysis; Feature extraction; Sensors; Detectors; 3D object detection; autonomous driving; geometric appearance awareness; feature fusion

Summary

The object detection task in autonomous driving scenarios relies on complex visual sensor systems. This work improves efficient monocular 3D object detection by exploiting geometric constraints: a Geometric Appearance Awareness (GAA) module is proposed to improve orientation estimation, while a Sample-aware Feature Fusion (SFF) head is designed for 3D dimension regression. The proposed method achieves significant improvements on the KITTI 3D object detection benchmark.

Abstract

The object detection task in autonomous driving scenarios is usually handled by complex visual sensor systems, such as LiDAR, stereo, and monocular sensors. Recent progress leverages a single monocular sensor to achieve highly efficient 3D object detection with geometric constraints. These detectors benefit from an explicit geometric projection that bridges the 2D image plane and the 3D world space. However, they tend to focus on optimizing depth estimation while neglecting the equally important 3D properties of orientation and 3D dimension. In this work, we propose a Geometric Appearance Awareness (GAA) module to improve orientation estimation: the GAA module extracts a geometry-guided appearance feature from which a reliable orientation can be estimated. Furthermore, we design a Sample-aware Feature Fusion (SFF) head in the 3D dimension regression branch, which dynamically handles the uniqueness of each sample when learning the 3D dimension. We evaluate our method on the KITTI dataset and achieve significant improvements in the 3D object detection task. Compared with the latest method, our approach obtains a 1.08-point improvement in the AP3D metric at the hard level and 1.53/3.41/5.79-point improvements in the APBEV metric under the easy/moderate/hard settings, respectively.
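For context, the "explicit geometric projection" referenced above is, in most geometry-constrained monocular detectors, the standard pinhole-camera model linking the 2D image plane to 3D world space. The sketch below illustrates that relation together with the usual KITTI-style observation-angle-to-yaw decoding; it is a generic illustration under stated assumptions, not the authors' implementation, and the function names (recover_depth, backproject_center, decode_yaw) and intrinsic values are hypothetical.

```python
# Illustrative sketch of the pinhole-camera geometry used by geometry-constrained
# monocular 3D detectors. NOT the paper's implementation: the depth constraint
# (projected 3D height vs. 2D box height) and all numbers are generic assumptions.
import numpy as np

def recover_depth(h_2d_px: float, h_3d_m: float, f_y: float) -> float:
    """Depth from the similar-triangles constraint h_2d = f_y * h_3d / z."""
    return f_y * h_3d_m / h_2d_px

def backproject_center(u: float, v: float, z: float, K: np.ndarray) -> np.ndarray:
    """Lift a projected 3D-box center (u, v) with depth z into camera coordinates."""
    x = (u - K[0, 2]) * z / K[0, 0]
    y = (v - K[1, 2]) * z / K[1, 1]
    return np.array([x, y, z])

def decode_yaw(alpha: float, x: float, z: float) -> float:
    """Egocentric yaw from the observation angle alpha (KITTI convention)."""
    return alpha + np.arctan2(x, z)

# Example with an illustrative KITTI-like intrinsic matrix.
K = np.array([[721.5, 0.0, 609.6],
              [0.0, 721.5, 172.9],
              [0.0, 0.0, 1.0]])
z = recover_depth(h_2d_px=60.0, h_3d_m=1.5, f_y=K[1, 1])   # ~18 m
center = backproject_center(u=650.0, v=180.0, z=z, K=K)
yaw = decode_yaw(alpha=0.3, x=center[0], z=center[2])
```

Because orientation and 3D dimension enter the 3D box through this same projection, errors in them degrade localization just as depth errors do, which is the gap the GAA module and SFF head are designed to address.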

