Article

Multimodal vehicle detection: fusing 3D-LIDAR and color camera data

Journal

PATTERN RECOGNITION LETTERS
Volume 115, Pages 20-29

Publisher

ELSEVIER
DOI: 10.1016/j.patrec.2017.09.038

Keywords

Multimodal data; Deep learning; Object detection; Fusion

Funding

  1. European Union (INEA-CEF) [2015-EU-TM-0243-S]
  2. FEDER through COMPETE [UID/EEA/00048, RECI/EEI-AUT/0181/2012 (AMS-HMI12)]


Most of the current successful object detection approaches are based on a class of deep learning models called Convolutional Neural Networks (ConvNets). While most existing object detection research focuses on applying ConvNets to color image data, emerging fields of application such as Autonomous Vehicles (AVs), which integrate a diverse set of sensors, require the processing of multisensor, multimodal information to provide a more comprehensive understanding of the real-world environment. This paper proposes a multimodal vehicle detection system integrating data from a 3D-LIDAR and a color camera. Data from the LIDAR and camera, in the form of three modalities, are the inputs of ConvNet-based detectors whose outputs are later combined to improve vehicle detection. The modalities are: (i) an up-sampled representation of the sparse LIDAR range data, called the dense Depth Map (DM); (ii) a high-resolution map built from the LIDAR reflectance data, hereinafter called the Reflectance Map (RM); and (iii) the RGB image from a monocular color camera calibrated with respect to the LIDAR. Bounding Box (BB) detections in each of these modalities are jointly learned and fused by an Artificial Neural Network (ANN) late-fusion strategy to improve the detection performance of each modality. The contribution of this paper is two-fold: (1) probing and evaluating 3D-LIDAR modalities for vehicle detection (specifically the depth map and reflectance map modalities), and (2) joint learning and fusion of the independent ConvNet-based vehicle detectors (one per modality) using an ANN to obtain more accurate vehicle detection. The obtained results demonstrate that (1) DM and RM are very promising modalities for vehicle detection, and (2) the proposed fusion strategy achieves higher accuracy than each modality alone at all levels of difficulty (easy, moderate, hard) in the KITTI object detection dataset. (C) 2017 Elsevier B.V. All rights reserved.
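To make the late-fusion idea concrete, here is a minimal, self-contained sketch of how per-modality bounding-box confidences might be combined by a small neural unit. This is not the authors' learned network: the weights, bias, and the `fuse_scores` function are hypothetical placeholders (in the paper the fusion parameters are jointly learned from data).

```python
import math

# Illustrative, hand-set fusion parameters: one weight per modality
# (RGB image, dense Depth Map, Reflectance Map) plus a bias term.
# In the actual system these would be learned, not fixed.
WEIGHTS = [1.2, 0.9, 0.7]
BIAS = -1.0

def sigmoid(x):
    """Logistic activation mapping a real score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def fuse_scores(rgb_score, dm_score, rm_score):
    """Combine per-modality detection confidences for one bounding box.

    Each input is the confidence a modality-specific detector assigns
    to the same candidate box; the output is a single fused confidence.
    """
    z = BIAS
    for w, s in zip(WEIGHTS, (rgb_score, dm_score, rm_score)):
        z += w * s
    return sigmoid(z)

# A box supported by all three modalities should outrank a box
# supported by the color image alone.
strong = fuse_scores(0.9, 0.8, 0.85)
weak = fuse_scores(0.9, 0.1, 0.1)
```

In this toy form the fusion unit is a single logistic neuron; the paper's ANN would additionally use hidden units and train the weights jointly with the per-modality detectors.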


