Journal
IEEE TRANSACTIONS ON MULTIMEDIA
Volume 21, Issue 4, Pages 809-820
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TMM.2018.2867742
Keywords
Saliency model; 3D visual saliency; two-stage clustering
Funding
- National Key Research and Development Program of China [2017YFC0806202]
- National Natural Science Foundation of China [61571204, 61471178]
Three-dimensional (3D) visual saliency is fundamental to vision-guided applications such as human-computer interaction in virtual reality, image quality assessment, object tracking, and event retrieval. Classical 3D visual saliency models can produce an appropriate saliency map when the quality of the required depth maps or auxiliary cues is high enough. However, depth maps are often impaired by artifacts (such as holes or noise) caused by faults in stereo matching or multipath effects in range sensors. In these cases, such 3D visual saliency models face challenges because their core preliminary processes, such as the detection of low-level visual features, may fail. To solve this problem, we propose a two-stage clustering-based 3D visual saliency model for predicting human visual fixations in dynamic scenarios. In this model, a two-stage clustering scheme is designed to handle the negative influence of impaired depth videos. With the help of this scheme, representative cues are selected for saliency modeling. Multimodal saliency maps are then obtained from depth, color, and 3D motion cues. Finally, a cross-Bayesian model is designed to pool the multimodal saliency maps. Experimental results demonstrate that the proposed model outperforms other state-of-the-art models on a variety of metrics. Furthermore, the consistency and robustness of our model are also verified.
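The abstract describes pooling depth, color, and 3D motion saliency maps with a cross-Bayesian model. The paper's exact formulation is not given here, so the following is only a minimal sketch of Bayesian-style multimodal fusion: each normalized cue map is treated as a probability-like weight and the cues are combined multiplicatively, then renormalized. The function names (`normalize`, `bayesian_fusion`) and the toy random maps are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def normalize(s):
    # Scale a saliency map to [0, 1] so it can act as a probability-like weight.
    s = s - s.min()
    rng = s.max()
    return s / rng if rng > 0 else np.zeros_like(s)

def bayesian_fusion(maps):
    # Hypothetical cross-Bayesian-style pooling: treat each normalized cue map
    # as evidence for fixation and combine the cues multiplicatively.
    fused = np.ones_like(maps[0])
    for m in maps:
        fused *= normalize(m) + 1e-6  # epsilon keeps one zero cue from erasing all evidence
    return normalize(fused)

# Toy example: random stand-ins for depth, color, and 3D motion saliency maps.
rng = np.random.default_rng(0)
depth, color, motion = (rng.random((4, 4)) for _ in range(3))
fused = bayesian_fusion([depth, color, motion])
print(fused.shape)  # (4, 4)
```

Multiplicative pooling of this kind emphasizes regions that all cues agree on, which is one common rationale for Bayesian fusion of saliency maps; the actual model in the paper may weight or condition the cues differently.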