Article

CSMOT: Make One-Shot Multi-Object Tracking in Crowded Scenes Great Again

Journal

SENSORS
Volume 23, Issue 7

Publisher

MDPI
DOI: 10.3390/s23073782

Keywords

one-shot; multi-object tracking; re-ID; coordinate attention; angle-center loss; data association

Popular one-shot multi-object tracking algorithms based on the joint detection and embedding paradigm offer high inference speed and accuracy, but their tracking is unstable in crowded scenes. The proposed CSMOT algorithm addresses these problems by enhancing the information interaction between channels, optimizing the re-ID branch with an angle-center loss, balancing the detection and re-ID tasks through a redesigned re-ID feature dimension, and introducing a simple and effective data association mechanism. Experimental results show that CSMOT achieves excellent tracking performance on multiple datasets and reduces the number of ID switches compared to the baseline.
Current one-shot multi-object tracking (MOT) algorithms are dominated by the joint detection and embedding paradigm, which offers high inference speed and accuracy, but their tracking performance is unstable in crowded scenes. Not only does the detection branch have difficulty obtaining accurate object positions, but the ambiguous appearance features extracted by the re-identification (re-ID) branch also lead to identity switches. To address these problems, this paper proposes a more robust MOT algorithm, named CSMOT, based on FairMOT. First, a coordinate attention module is added to the encoder-decoder network to enhance the information interaction between channels (horizontal and vertical coordinates), which improves object detection. Then, an angle-center loss that effectively maximizes intra-class similarity is proposed to optimize the re-ID branch, making the extracted re-ID features more discriminative. The re-ID feature dimension is also redesigned to balance the detection and re-ID tasks. Finally, a simple and effective data association mechanism is introduced, which associates every detection during tracking instead of only the high-score detections. The experimental results show that this one-shot MOT algorithm achieves excellent tracking performance on multiple public datasets and can be effectively applied to crowded scenes. In particular, compared to the baseline, CSMOT decreases the number of ID switches by 11.8% and 33.8% on the MOT16 and MOT17 test datasets, respectively.
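The idea of associating every detection, rather than only the high-score ones, can be illustrated with a rough sketch. The code below is not the CSMOT implementation: it assumes a two-stage scheme in which high-score detections are matched to existing tracks first and the remaining tracks are then offered the low-score detections. It also substitutes plain IoU and greedy matching for the re-ID embeddings and assignment solver a real tracker would use. The function names (`iou`, `greedy_match`, `associate`) and the thresholds `high_thresh=0.6` and `min_iou=0.3` are hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_match(
    tracks: Dict[int, Box], dets: List[Tuple[int, Box]], min_iou: float
) -> Dict[int, int]:
    """Pair track IDs with detection indices, taking the best IoU first.

    Greedy matching is a simplification assumed for this sketch.
    """
    pairs = sorted(
        ((iou(tb, db), tid, di) for tid, tb in tracks.items() for di, db in dets),
        reverse=True,
    )
    matched: Dict[int, int] = {}
    used = set()
    for score, tid, di in pairs:
        if score < min_iou:
            break  # all remaining pairs overlap even less
        if tid in matched or di in used:
            continue
        matched[tid] = di
        used.add(di)
    return matched


def associate(
    tracks: Dict[int, Box],
    boxes: List[Box],
    scores: List[float],
    high_thresh: float = 0.6,  # hypothetical threshold
    min_iou: float = 0.3,      # hypothetical threshold
) -> Dict[int, int]:
    """Return {track_id: detection_index}, considering every detection."""
    indexed = list(enumerate(zip(boxes, scores)))
    high = [(i, b) for i, (b, s) in indexed if s >= high_thresh]
    low = [(i, b) for i, (b, s) in indexed if s < high_thresh]

    # Stage 1: match confident detections to existing tracks.
    first = greedy_match(tracks, high, min_iou)

    # Stage 2: tracks still unmatched get a chance with low-score detections,
    # which in crowded scenes are often partially occluded objects.
    remaining = {tid: b for tid, b in tracks.items() if tid not in first}
    second = greedy_match(remaining, low, min_iou)
    return {**first, **second}


tracks = {1: (0, 0, 10, 10), 2: (20, 20, 30, 30)}
boxes = [(1, 1, 11, 11), (21, 21, 31, 31)]
scores = [0.9, 0.2]  # the second object is detected with low confidence
print(associate(tracks, boxes, scores))  # {1: 0, 2: 1}
```

In this toy example, a tracker that discarded detections below the score threshold would leave track 2 unmatched in this frame, whereas the second stage keeps it associated with its low-confidence detection.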
