Article

FuseSeg: Semantic Segmentation of Urban Scenes Based on RGB and Thermal Data Fusion

Journal

Publisher

IEEE (Institute of Electrical and Electronics Engineers, Inc.)
DOI: 10.1109/TASE.2020.2993143

Keywords

Semantics; Image segmentation; Cameras; Lighting; Laser radar; Data integration; Autonomous driving; information fusion; semantic segmentation; thermal images; urban scenes

Funding

  1. National Natural Science Foundation of China [U1713211]
  2. Research Grant Council of Hong Kong [11210017]


Semantic segmentation of urban scenes is an essential component of many autonomous driving applications and has made great progress with the rise of deep learning. Most current semantic segmentation networks use single-modal sensory data, usually RGB images produced by visible-light cameras. However, the performance of these networks tends to degrade under challenging lighting conditions, such as dim light or darkness. We find that thermal images produced by thermal imaging cameras are robust to challenging lighting conditions. Therefore, in this article, we propose FuseSeg, a novel network that fuses RGB and thermal data to achieve superior semantic segmentation performance in urban scenes. The experimental results demonstrate that our network outperforms the state-of-the-art networks.

Note to Practitioners: This article investigates semantic segmentation of urban scenes under unsatisfactory lighting conditions. We provide a solution to this problem via information fusion of RGB and thermal data. We build an end-to-end deep neural network that takes a pair of RGB and thermal images as input and outputs pixel-wise semantic labels. Our network could be used for urban scene understanding, which serves as a fundamental component of many autonomous driving tasks, such as environment modeling, obstacle avoidance, motion prediction, and planning. Moreover, the simple design of our network allows it to be easily implemented with various deep learning frameworks, which facilitates deployment on different hardware and software platforms.
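The abstract describes an end-to-end network that takes a pair of RGB and thermal images and outputs pixel-wise semantic labels. The sketch below illustrates only that input/output contract, not the authors' FuseSeg architecture: the "encoders" are random 1x1 projections, and elementwise summation is assumed as a placeholder fusion scheme, which is not necessarily the paper's exact strategy.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(img, n_feat=8):
    # Toy stand-in for a CNN encoder: a fixed random 1x1 projection
    # from the input channels to n_feat feature maps (hypothetical).
    h, w, c = img.shape
    w_proj = rng.standard_normal((c, n_feat))
    return img.reshape(h * w, c) @ w_proj          # shape (H*W, n_feat)

def fuse_and_segment(rgb, thermal, n_classes=4):
    # Fuse the two modalities by elementwise summation (an assumption,
    # used here only to illustrate feature-level fusion), then score
    # each pixel against n_classes with a random linear classifier.
    fused = encode(rgb) + encode(thermal)
    w_cls = rng.standard_normal((fused.shape[1], n_classes))
    logits = fused @ w_cls
    labels = logits.argmax(axis=1)                 # pixel-wise class ids
    return labels.reshape(rgb.shape[:2])           # back to H x W

rgb = rng.random((4, 6, 3))        # H x W x 3 visible image
thermal = rng.random((4, 6, 1))    # H x W x 1 thermal image
labels = fuse_and_segment(rgb, thermal)
print(labels.shape)                # (4, 6): one class id per pixel
```

In a real implementation the random projections would be learned convolutional encoders and a decoder would restore full spatial resolution, but the end-to-end shape of the computation, two images in, one label map out, is the same.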
