Journal
APPLIED INTELLIGENCE
Volume 51, Issue 6, Pages 3450-3459
Publisher
SPRINGER
DOI: 10.1007/s10489-020-01961-4
Keywords
Multi-scale; Video salient object detection; Attention; Pyramid
This paper systematically studies the role of spatial and temporal attention mechanisms in video salient object detection and proposes a two-stage spatial-temporal attention network called STA-Net. Using its Multi-Scale-Spatial-Attention and Pyramid-Saliency-Shift-Aware modules, the network efficiently exploits multi-scale saliency information and dynamic object information, achieving compelling performance on the video salient object detection task.
This paper conducts a systematic study of the role of spatial and temporal attention mechanisms in the video salient object detection (VSOD) task. We present a two-stage spatial-temporal attention network, named STA-Net, which makes two major contributions. In the first stage, we devise a Multi-Scale-Spatial-Attention (MSSA) module to reduce the computational cost on non-salient regions while exploiting multi-scale saliency information. This sliced attention method offers a distinct way to efficiently exploit the high-level features of the network with an enlarged receptive field. In the second stage, we propose a Pyramid-Saliency-Shift-Aware (PSSA) module, which emphasizes dynamic object information, since it offers a valid shift cue for confirming the salient object and capturing temporal information. This temporal detection module encourages precise salient region detection. Extensive experiments show that the proposed STA-Net is effective for the video salient object detection task and achieves compelling performance in comparison with state-of-the-art methods.
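To make the idea of multi-scale spatial attention more concrete, the following is a minimal, generic sketch in plain Python. It is not the MSSA module from the paper, whose internals the abstract does not specify. The function names, the choice of average pooling, the sigmoid gate, and the scales (1, 2) are all assumptions made for illustration. The sketch computes a gate in (0, 1) at each scale, averages the gates, and uses the result to reweight a single-channel feature map.

```python
import math


def avg_pool(fmap, k):
    """Average-pool a 2D map (list of rows) with a k x k window and stride k."""
    h, w = len(fmap), len(fmap[0])
    return [
        [
            sum(fmap[i + di][j + dj] for di in range(k) for dj in range(k)) / (k * k)
            for j in range(0, w, k)
        ]
        for i in range(0, h, k)
    ]


def upsample_nearest(fmap, k):
    """Nearest-neighbour upsampling of a 2D map by an integer factor k."""
    return [[v for v in row for _ in range(k)] for row in fmap for _ in range(k)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def multi_scale_spatial_attention(fmap, scales=(1, 2)):
    """Hypothetical illustration: reweight fmap by the mean of per-scale gates.

    Assumes the height and width of fmap are divisible by every scale.
    """
    h, w = len(fmap), len(fmap[0])
    attn = [[0.0] * w for _ in range(h)]
    for k in scales:
        pooled = avg_pool(fmap, k)  # coarser view of the map
        gate = upsample_nearest(
            [[sigmoid(v) for v in row] for row in pooled], k
        )  # gate values in (0, 1), restored to the input resolution
        for i in range(h):
            for j in range(w):
                attn[i][j] += gate[i][j] / len(scales)
    # Positions with low attention are suppressed; high-attention ones are kept.
    return [[fmap[i][j] * attn[i][j] for j in range(w)] for i in range(h)]
```

Calling `multi_scale_spatial_attention` on a 4 x 4 map returns a 4 x 4 map in which each value has been scaled by a weight strictly between 0 and 1. A learned module would typically replace the fixed pooling and sigmoid with trainable convolutions.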