4.7 Article

Stereo priori RCNN based car detection on point level for autonomous driving

Journal

KNOWLEDGE-BASED SYSTEMS
Volume 229

Publisher

ELSEVIER
DOI: 10.1016/j.knosys.2021.107346

Keywords

3D target detection; Semantic segmentation; Binocular vision; Autonomous driving

Funding

  1. National Natural Science Foundation of China [61801323, 61972454]
  2. China Postdoctoral Science Foundation [2021M691848]
  3. Science and Technology Projects Fund of Suzhou, China [SS2019029]
  4. Natural Science Foundation of Jiangsu Province, China [BK20201405]
  5. Natural Science Foundation of Jiangsu Province, China [19KJB110021, 20KJB520018]

The paper proposes a stereo priori RCNN based car detection method for autonomous driving. It combines a traditional RPN with a Mask-branch mechanism to improve the accuracy of 3D target detection, and it uses RGB images to supply semantic information. Extensive numerical experiments on the KITTI and nuScenes datasets demonstrate the efficiency and effectiveness of the proposed algorithm.
Binocular vision target detection algorithms generally require the selection of a large number of keypoints, which results in heavy online computation and prevents them from taking full advantage of spatial semantic information. This paper proposes a stereo priori RCNN based car detection method on point level for autonomous driving. The algorithm combines a traditional Region Proposal Network (RPN) with a Mask-branch mechanism: it improves the accuracy of 3D target detection by minimizing luminosity errors, and it employs RGB images to provide semantic information for spatial point clouds. First, the algorithm obtains bounding boxes in the left and right images through the RPN and classification networks. Next, prior information about vehicles and branched convolutional neural networks are used to extract wheel features from a feature map, from which the coordinates and orientation of vehicles on the bird's-eye-view map can be fitted. By predicting the keypoints, a 3D bounding box of each vehicle is then roughly restored. Regions of Interest (ROIs) are then applied to the left and right camera images to minimize the photometric error, which yields the precise position and size of the detection frame. In the meantime, the Mask-branch mechanism is adopted to achieve a precise semantic segmentation of each ROI. Finally, the 3D bounding box of each vehicle is used for point-cloud segmentation, and the semantic information provided by the Mask-branch is exploited to improve the segmentation accuracy. Extensive numerical experiments on the KITTI and nuScenes datasets demonstrate the efficiency and effectiveness of the proposed algorithm. © 2021 Elsevier B.V. All rights reserved.
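
The abstract does not spell out how the photometric error is minimized, so the following is only a minimal sketch of the general idea, not the paper's implementation. It assumes rectified grayscale stereo images stored as NumPy arrays, an ROI given as a hypothetical `(x0, y0, x1, y1)` pixel box in the left image, and a brute-force search over integer disparities. The function names, the mean-squared-error cost, and the example camera parameters are all assumptions made for illustration.

```python
import numpy as np


def photometric_error(left, right, box, disparity):
    """Mean squared intensity difference between the left-image ROI and the
    right-image window shifted left by `disparity` pixels (rectified stereo)."""
    x0, y0, x1, y1 = box
    left_roi = left[y0:y1, x0:x1].astype(np.float64)
    right_roi = right[y0:y1, x0 - disparity:x1 - disparity].astype(np.float64)
    return float(np.mean((left_roi - right_roi) ** 2))


def refine_disparity(left, right, box, candidates):
    """Return the candidate disparity with the lowest photometric error.

    Candidates that would push the right-image window outside the image
    are skipped (assumption: disparities are non-negative integers).
    """
    x0 = box[0]
    valid = [d for d in candidates if 0 <= d <= x0]
    errors = [photometric_error(left, right, box, d) for d in valid]
    return valid[int(np.argmin(errors))]


def depth_from_disparity(disparity, focal_px, baseline_m):
    """Standard rectified-stereo relation Z = f * B / d."""
    return focal_px * baseline_m / disparity


# Synthetic check: build a left image whose content is the right image
# shifted 7 pixels to the right, so the true disparity is 7.
rng = np.random.default_rng(0)
right_img = rng.random((40, 80))
left_img = np.roll(right_img, 7, axis=1)
roi = (20, 5, 50, 30)

best_d = refine_disparity(left_img, right_img, roi, range(0, 16))
# Hypothetical camera parameters, chosen only for the example.
depth_m = depth_from_disparity(best_d, focal_px=700.0, baseline_m=0.54)
```

In this toy setup the search recovers the disparity used to build the synthetic pair, and the depth follows from the usual Z = f * B / d relation. A real system would more likely optimize a continuous depth or box offset with sub-pixel interpolation rather than scan integer disparities.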
