Journal
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
Volume 45, Issue 3, Pages 3072-3089
Publisher
IEEE COMPUTER SOC
DOI: 10.1109/TPAMI.2022.3172932
Keywords
Object tracking; video object segmentation; multiple object tracking; siamese network
This article introduces SiamMask, a framework that performs both visual object tracking and video object segmentation in real time with the same simple method. It improves the offline training procedure of popular fully-convolutional Siamese approaches by adding a binary segmentation task. Once offline training is complete, SiamMask needs only a single bounding box for initialization and can then track and segment the object at high frame rates. The framework can also handle multiple object tracking and segmentation by re-using the multi-task model in a cascaded fashion.
In this article, we introduce SiamMask, a framework that performs both visual object tracking and video object segmentation in real time with the same simple method. We improve the offline training procedure of popular fully-convolutional Siamese approaches by augmenting their losses with a binary segmentation task. Once offline training is complete, SiamMask requires only a single bounding box for initialization and can simultaneously carry out visual object tracking and segmentation at high frame rates. Moreover, we show that the framework can be extended to handle multiple object tracking and segmentation by simply re-using the multi-task model in a cascaded fashion. Experimental results show that our approach has high processing efficiency, at around 55 frames per second. It yields real-time state-of-the-art results on visual object tracking benchmarks, while also demonstrating competitive performance at high speed on video object segmentation benchmarks.