4.7 Article

M3DGAF: Monocular 3D Object Detection With Geometric Appearance Awareness and Feature Fusion

Journal

IEEE SENSORS JOURNAL
Volume 23, Issue 11, Pages 11232-11240

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/JSEN.2022.3189174

Keywords

Three-dimensional displays; Estimation; Object detection; Task analysis; Feature extraction; Sensors; Detectors; 3D object detection; autonomous driving; geometric appearance awareness; feature fusion

Object detection in autonomous driving scenarios usually relies on complex visual sensor systems. This work targets efficient 3D object detection from a single monocular sensor under geometric constraints. It proposes a Geometric Appearance Awareness (GAA) module to improve orientation estimation and a Sample-aware Feature Fusion (SFF) head for 3D dimension regression. The method reports significant improvements in 3D object detection on the KITTI dataset.
Object detection in autonomous driving scenarios is usually carried out by a complex visual sensor system, such as a LiDAR sensor, a stereo sensor, or a monocular sensor. Recent work in autonomous driving uses a monocular sensor to perform highly efficient 3D object detection with geometric constraints. These detectors benefit from explicit geometric projection, which bridges the 2D image plane and the 3D world space. However, they tend to focus on optimizing depth estimation and neglect the equally important 3D properties of orientation and 3D dimension. In this work, we propose a Geometric Appearance Awareness (GAA) module to improve orientation estimation. Specifically, the GAA module extracts a geometry-guided appearance feature, which is used to estimate orientation reliably. Furthermore, we design a Sample-aware Feature Fusion (SFF) head for the 3D dimension regression branch. This head dynamically accounts for the uniqueness of individual samples when learning 3D dimensions. We evaluate our method on the KITTI dataset and achieve significant improvements in the 3D object detection task. Compared with the latest method, our approach obtains a 1.08 improvement in AP_3D at the hard level and improvements of 1.53/3.41/5.79 in AP_BEV under the easy/moderate/hard settings, respectively.
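The abstract refers to explicit geometric projection between the 2D image plane and the 3D world space, and to orientation estimation, without giving formulas. The sketch below illustrates the standard pinhole-camera relations that geometry-based monocular detectors commonly build on; it is not taken from the paper and does not implement the GAA module or the SFF head. The function names, the intrinsics (`fx`, `fy`, `cx`, `cy`), and the sample values are assumptions chosen for illustration, and the yaw conversion assumes the KITTI convention for the observation angle.

```python
import math


def project_point(x, y, z, fx, fy, cx, cy):
    """Project a 3D point in the camera frame (metres) to pixel coordinates."""
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    u = fx * x / z + cx
    v = fy * y / z + cy
    return u, v


def depth_from_height(focal_px, height_3d_m, height_2d_px):
    """Recover depth from similar triangles: z = f * H_3d / h_2d."""
    if height_2d_px <= 0:
        raise ValueError("projected height must be positive")
    return focal_px * height_3d_m / height_2d_px


def global_yaw(alpha, x, z):
    """Convert an observation angle alpha to a global yaw (KITTI convention assumed).

    The viewing-ray angle atan2(x, z) is added to alpha, and the result is
    wrapped into the interval [-pi, pi).
    """
    yaw = alpha + math.atan2(x, z)
    return (yaw + math.pi) % (2.0 * math.pi) - math.pi


if __name__ == "__main__":
    # Hypothetical intrinsics and a 1.5 m tall object centred 20 m ahead.
    fx = fy = 720.0
    cx, cy = 600.0, 180.0
    _, v_top = project_point(0.0, -0.5, 20.0, fx, fy, cx, cy)
    _, v_bottom = project_point(0.0, 1.0, 20.0, fx, fy, cx, cy)
    height_px = v_bottom - v_top
    print("projected height (px):", height_px)
    print("recovered depth (m):", depth_from_height(fy, 1.5, height_px))
    print("global yaw (rad):", global_yaw(0.0, 0.0, 20.0))
```

The depth relation shows why errors in the estimated 3D dimension matter in this family of methods: under this model, a relative error in the estimated 3D height carries over directly to the recovered depth.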
