4.7 Article

Stereo priori RCNN based car detection on point level for autonomous driving

Journal

KNOWLEDGE-BASED SYSTEMS
Volume 229

Publisher

ELSEVIER
DOI: 10.1016/j.knosys.2021.107346

Keywords

3D target detection; Semantic segmentation; Binocular vision; Autonomous driving

Funding

  1. National Natural Science Foundation of China [61801323, 61972454]
  2. China Postdoctoral Science Foundation [2021M691848]
  3. Science and Technology Projects Fund of Suzhou, China [SS2019029]
  4. Natural Science Foundation of Jiangsu Province, China [BK20201405]
  5. Natural Science Foundation of Jiangsu Province, China [19KJB110021, 20KJB520018]

The paper proposes a stereo priori RCNN based car detection method for autonomous driving. It combines a traditional RPN with a Mask-branch mechanism to improve the accuracy of 3D target detection and uses RGB images to supply semantic information. Extensive numerical experiments on the KITTI and nuScenes datasets demonstrate the efficiency and effectiveness of the proposed algorithm.
Binocular vision target detection algorithms generally require the selection of a large number of keypoints, which results in a heavy computational load for online calculation and an inability to take full advantage of spatial semantic information. This paper proposes a stereo priori RCNN based car detection method on point level for autonomous driving. The algorithm combines a traditional Region Proposal Network (RPN) with a Mask-branch mechanism, which improves the accuracy of 3D target detection by minimizing luminosity errors, and employs RGB images to provide semantic information for spatial point clouds. First, the proposed algorithm obtains bounding boxes in the left and right images through RPN and classification networks. Then, prior information about vehicles and branched convolutional neural networks are used to extract wheel features from a feature map, from which the coordinates and orientation of vehicles on the bird's-eye-view map can be fitted. Afterward, by predicting the keypoints, a 3D bounding box of each vehicle is roughly restored. A Region of Interest (ROI) is then applied to the left and right camera images to minimize the photometric error, so that the precise position and size of the detection frame are obtained. Meanwhile, a Mask-branch mechanism is adopted to achieve a precise semantic segmentation of each ROI. Finally, the 3D bounding box of each vehicle is used for point-cloud segmentation, and the semantic information provided by the Mask-branch is exploited to improve the segmentation accuracy. Extensive numerical experiments are conducted on the KITTI and nuScenes datasets to demonstrate the efficiency and effectiveness of the proposed algorithm. (C) 2021 Elsevier B.V. All rights reserved.
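
The photometric-error step in the abstract can be illustrated with a toy example. The sketch below is not the paper's optimization: it assumes rectified grayscale stereo images, a single ROI in the left image, and a brute-force search over integer disparities. All function names and the camera parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def photometric_error(left, right, roi, disparity):
    """Sum of squared intensity differences between the left-image ROI and
    the same window shifted `disparity` pixels to the left in the right image.

    roi is (x0, y0, x1, y1) in left-image pixel coordinates (hypothetical layout).
    """
    x0, y0, x1, y1 = roi
    left_patch = left[y0:y1, x0:x1].astype(np.float64)
    right_patch = right[y0:y1, x0 - disparity:x1 - disparity].astype(np.float64)
    return float(np.sum((left_patch - right_patch) ** 2))


def best_disparity(left, right, roi, max_disparity):
    """Brute-force search for the integer disparity with the lowest error."""
    x0 = roi[0]
    # Do not shift the window past the left border of the right image.
    candidates = range(0, min(max_disparity, x0) + 1)
    errors = [photometric_error(left, right, roi, d) for d in candidates]
    # Candidates start at 0, so the index of the minimum is the disparity.
    return int(np.argmin(errors))


def depth_from_disparity(disparity, focal_px, baseline_m):
    """Pinhole stereo relation: depth = focal length * baseline / disparity."""
    if disparity <= 0:
        raise ValueError("disparity must be positive to recover depth")
    return focal_px * baseline_m / disparity


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    left = rng.random((40, 80))
    # Synthetic right view: every pixel appears 7 columns further left.
    right = np.roll(left, -7, axis=1)
    d = best_disparity(left, right, roi=(20, 5, 50, 30), max_disparity=15)
    # Assumed camera parameters, for illustration only.
    print(d, depth_from_disparity(d, focal_px=700.0, baseline_m=0.54))
```

A real system would typically refine a continuous depth or box parameter with interpolation rather than scan integer shifts, but the objective being minimized has the same form.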
