Traffic Sign Detection and Recognition Using Multi-Frame Embedding of Video-Log Images

REMOTE SENSING (2023)

Journal

REMOTE SENSING

Volume 15, Issue 12

Publisher

MDPI

DOI: 10.3390/rs15122959

Keywords

traffic sign; intelligent vehicle; long-tailed distribution; anomalies; embedding; information integration

Automated Summary

The study uses YOLOv5 as a single-class detector to localize traffic signs and proposes a hierarchical classification model (HCM) for fine-grained classification, which significantly reduces class imbalance. To address the limitations of a single image, a training-free multi-frame information integration module (MIM) extracts the detection sequence of each traffic sign from the embeddings generated by the HCM. Experiments on two publicly available datasets, TT100K and ONCE, show that the HCM-improved YOLOv5 reaches a mAP of 79.0 over all classes, exceeding state-of-the-art methods, at an inference speed of 22.7 FPS. MIM further improves model performance by integrating multi-frame information with only a slight increase in computational resource consumption.

Abstract

The detection and recognition of traffic signs is an essential component of intelligent vehicle perception systems, which use on-board cameras to sense traffic sign information. Unfortunately, issues such as long-tailed class distribution, occlusion, and deformation greatly degrade detector performance. In this research, YOLOv5 is used as a single-class detector for traffic sign localization. We then propose a hierarchical classification model (HCM) for fine-grained classification, which significantly reduces the degree of imbalance between classes without changing the sample size. To cope with the shortcomings of a single image, we construct a training-free multi-frame information integration module (MIM), which extracts the detection sequence of each traffic sign based on the embedding generated by the HCM. The extracted temporal detection information is then used to redefine the category and confidence of each detection. Finally, detection and recognition over all classes are evaluated on two publicly available datasets, TT100K and ONCE. Experimental results show that the HCM-improved YOLOv5 achieves a mAP of 79.0 over all classes, which exceeds that of state-of-the-art methods, at an inference speed of 22.7 FPS. In addition, MIM further improves model performance by integrating multi-frame information while only slightly increasing computational resource consumption.
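The abstract does not spell out how MIM links detections across frames or how it redefines category and confidence, so the following is only a minimal sketch of one way a training-free integration step could work, not the authors' implementation. It assumes that each detection carries an embedding, a label, and a confidence; that detections are linked greedily by cosine similarity of embeddings; and that labels are fused by a confidence-weighted vote. The function names, the `sim_threshold` parameter, and the class labels are all hypothetical.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def link_detections(frames, sim_threshold=0.8):
    """Greedily group per-frame detections into sequences by embedding similarity.

    frames: list of frames; each frame is a list of dicts with keys
    "emb" (list of floats), "label" (str) and "conf" (float).
    The threshold value is an illustrative assumption.
    """
    tracks = []
    for detections in frames:
        used = set()  # tracks already extended in this frame
        for det in detections:
            best, best_sim = None, sim_threshold
            for i, track in enumerate(tracks):
                if i in used:
                    continue
                sim = cosine(track[-1]["emb"], det["emb"])
                if sim >= best_sim:
                    best, best_sim = i, sim
            if best is None:
                tracks.append([det])  # no similar sequence: start a new one
                used.add(len(tracks) - 1)
            else:
                tracks[best].append(det)
                used.add(best)
    return tracks


def fuse_track(track):
    """Confidence-weighted vote over one sequence: returns (label, fused confidence)."""
    votes = defaultdict(float)
    for det in track:
        votes[det["label"]] += det["conf"]
    label = max(votes, key=votes.get)
    total = sum(votes.values())
    return label, (votes[label] / total if total else 0.0)


# Three frames of the same sign; the middle frame is misclassified with low confidence.
frames = [
    [{"emb": [1.0, 0.0], "label": "speed_limit_40", "conf": 0.9}],
    [{"emb": [0.9, 0.1], "label": "speed_limit_60", "conf": 0.4}],
    [{"emb": [1.0, 0.1], "label": "speed_limit_40", "conf": 0.8}],
]
tracks = link_detections(frames)
label, conf = fuse_track(tracks[0])
print(label, round(conf, 2))  # speed_limit_40 0.81
```

In this toy sequence the low-confidence misclassification in the middle frame is outvoted by the two confident detections around it, which is the kind of correction that multi-frame integration is meant to provide for occluded or deformed signs.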
