Article

Multimodal vehicle detection: fusing 3D-LIDAR and color camera data

Journal

PATTERN RECOGNITION LETTERS
Volume 115, Pages 20-29

Publisher

ELSEVIER
DOI: 10.1016/j.patrec.2017.09.038

Keywords

Multimodal data; Deep learning; Object detection; Fusion

Funding

  1. European Union (INEA-CEF) [2015-EU-TM-0243-S]
  2. FEDER through COMPETE [UID/EEA/00048, RECI/EEI-AUT/0181/2012 (AMS-HMI12)]


Most of the current successful object detection approaches are based on a class of deep learning models called Convolutional Neural Networks (ConvNets). While most existing object detection research focuses on applying ConvNets to color image data, emerging fields of application such as Autonomous Vehicles (AVs), which integrate a diverse set of sensors, require the processing of multisensor, multimodal information to provide a more comprehensive understanding of the real-world environment. This paper proposes a multimodal vehicle detection system integrating data from a 3D-LIDAR and a color camera. Data from the LIDAR and camera, in the form of three modalities, are the inputs of ConvNet-based detectors whose outputs are later combined to improve vehicle detection. The modalities are: (i) an up-sampled representation of the sparse LIDAR range data, called the dense Depth Map (DM); (ii) a high-resolution map built from the LIDAR reflectance data, hereinafter called the Reflectance Map (RM); and (iii) the RGB image from a monocular color camera calibrated with respect to the LIDAR. Bounding Box (BB) detections in each of these modalities are jointly learned and fused by an Artificial Neural Network (ANN) late-fusion strategy to improve the detection performance of each modality. The contribution of this paper is two-fold: (1) probing and evaluating 3D-LIDAR modalities for vehicle detection (specifically the depth map and reflectance map modalities), and (2) joint learning and fusion of the independent ConvNet-based vehicle detectors (one per modality) using an ANN to obtain more accurate vehicle detection. The obtained results demonstrate that (1) DM and RM are very promising modalities for vehicle detection, and (2) the proposed fusion strategy achieves higher accuracy than each modality alone at all levels of difficulty (easy, moderate, hard) in the KITTI object detection dataset. (C) 2017 Elsevier B.V. All rights reserved.
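To make the late-fusion idea concrete, here is a minimal, self-contained sketch of how per-modality bounding-box confidences might be combined by a small neural unit. This is not the authors' learned network: the weights, bias, and the `fuse_scores` function are hypothetical placeholders (in the paper the fusion parameters are jointly learned from data).

```python
import math

# Illustrative, hand-set fusion parameters: one weight per modality
# (RGB image, dense Depth Map, Reflectance Map) plus a bias term.
# In the actual system these would be learned, not fixed.
WEIGHTS = [1.2, 0.9, 0.7]
BIAS = -1.0

def sigmoid(x):
    """Logistic activation mapping a real score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def fuse_scores(rgb_score, dm_score, rm_score):
    """Combine per-modality detection confidences for one bounding box.

    Each input is the confidence a modality-specific detector assigns
    to the same candidate box; the output is a single fused confidence.
    """
    z = BIAS
    for w, s in zip(WEIGHTS, (rgb_score, dm_score, rm_score)):
        z += w * s
    return sigmoid(z)

# A box supported by all three modalities should outrank a box
# supported by the color image alone.
strong = fuse_scores(0.9, 0.8, 0.85)
weak = fuse_scores(0.9, 0.1, 0.1)
```

In this toy form the fusion unit is a single logistic neuron; the paper's ANN would additionally use hidden units and train the weights jointly with the per-modality detectors.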


