Traffic Sign Detection and Recognition Using Multi-Frame Embedding of Video-Log Images

Journal

REMOTE SENSING
Volume 15, Issue 12

Publisher

MDPI
DOI: 10.3390/rs15122959

Keywords

traffic sign; intelligent vehicle; long-tailed distribution; anomalies; embedding; information integration

The study uses YOLOv5 as a single-class detector for traffic sign localization and proposes a hierarchical classification model (HCM) for fine-grained classification, which significantly reduces class imbalance. To address the limitations of a single image, a training-free multi-frame information integration module (MIM) extracts the detection sequence of each traffic sign based on the embedding generated by the HCM. In experiments covering all classes of two publicly available datasets, TT100K and ONCE, the HCM-improved YOLOv5 reaches a mAP of 79.0, exceeding state-of-the-art methods, at an inference speed of 22.7 FPS. MIM further improves model performance by integrating multi-frame information with only a slight increase in computational resource consumption.
The detection and recognition of traffic signs is an essential component of intelligent vehicle perception systems, which use on-board cameras to sense traffic sign information. Unfortunately, issues such as long-tailed distribution, occlusion, and deformation greatly degrade detector performance. In this research, YOLOv5 is used as a single-class detector for traffic sign localization. We then propose a hierarchical classification model (HCM) for fine-grained classification, which significantly reduces the degree of imbalance between classes without changing the sample size. To cope with the shortcomings of a single image, we construct a training-free multi-frame information integration module (MIM), which extracts the detection sequence of each traffic sign based on the embedding generated by the HCM. The extracted temporal detection information is then used to redefine the category and confidence of each detection. Finally, detection and recognition across all classes are evaluated on two publicly available datasets, TT100K and ONCE. Experimental results show that the HCM-improved YOLOv5 reaches a mAP of 79.0 over all classes, which exceeds that of state-of-the-art methods, and achieves an inference speed of 22.7 FPS. In addition, MIM further improves model performance by integrating multi-frame information while only slightly increasing computational resource consumption.
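
The abstract does not say how MIM links detections across frames or how it recomputes category and confidence. As a rough illustration of the general idea only, the sketch below assumes that detections are linked into sequences by cosine similarity of their embeddings and that each sequence is then relabeled by confidence-weighted voting. The names Detection, link_sequences, and integrate, the 0.8 similarity threshold, and the voting rule are all hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Detection:
    frame: int                   # index of the video frame
    label: str                   # per-frame class prediction
    confidence: float            # per-frame classification confidence
    embedding: Sequence[float]   # feature vector (assumed to come from the classifier)


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def link_sequences(detections, threshold=0.8):
    """Greedily group detections into per-sign sequences (assumed linking rule).

    Each detection, in frame order, joins the sequence whose most recent
    embedding is most similar to it, provided the similarity reaches
    `threshold`; otherwise it starts a new sequence.
    """
    sequences = []
    for det in sorted(detections, key=lambda d: d.frame):
        best, best_sim = None, threshold
        for seq in sequences:
            sim = cosine(seq[-1].embedding, det.embedding)
            if sim >= best_sim:
                best, best_sim = seq, sim
        if best is None:
            sequences.append([det])
        else:
            best.append(det)
    return sequences


def integrate(sequence):
    """Redefine label and confidence for one sequence (assumed voting rule).

    The label with the largest summed confidence wins; the new confidence
    is that sum divided by the sequence length.
    """
    scores = defaultdict(float)
    for det in sequence:
        scores[det.label] += det.confidence
    label = max(scores, key=scores.get)
    return label, scores[label] / len(sequence)
```

Under these assumptions, a sign that is misclassified in one occluded frame but recognized correctly in its neighboring frames keeps the majority label, and its confidence is lowered to reflect the disagreement. No training is involved, which matches the training-free description in the abstract, but the actual module may differ in every detail.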