Article

Visual Saliency Prediction Using a Mixture of Deep Neural Networks

Journal

IEEE TRANSACTIONS ON IMAGE PROCESSING
Volume 27, Issue 8, Pages 4080-4090

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TIP.2018.2834826

Keywords

Visual attention; human visual system; saliency map; deep learning

Abstract

Visual saliency models have recently begun to incorporate deep learning to achieve predictive capacity much greater than that of previous unsupervised methods. However, most existing models predict saliency without explicit knowledge of global scene semantic information. We propose a model (MxSalNet) that incorporates global scene semantic information in addition to local information gathered by a convolutional neural network. Our model is formulated as a mixture of experts. Each expert network is trained to predict saliency for a set of closely related images. The final saliency map is computed as a weighted mixture of the expert networks' outputs, with the weights determined by a separate gating network that is guided by global scene information. The expert networks and the gating network are trained simultaneously in an end-to-end manner. We show that our mixture formulation improves performance over an otherwise identical non-mixture model that does not incorporate global scene information. Additionally, we show that our model achieves better performance than several other visual saliency models.
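
The mixing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract says only that a gating network produces the weights, so the softmax normalization, the function names `softmax` and `mix_saliency`, and the toy 2x2 maps are all assumptions made for illustration. In the actual model, both the expert maps and the gating scores would come from trained networks.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of gating scores (assumed normalization)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix_saliency(expert_maps, gate_scores):
    """Combine K expert saliency maps into one map by a weighted sum.

    expert_maps: list of K maps, each an H x W nested list of floats.
    gate_scores: list of K unnormalized gating scores for this image.
    """
    if len(expert_maps) != len(gate_scores):
        raise ValueError("need one gating score per expert")
    weights = softmax(gate_scores)
    height, width = len(expert_maps[0]), len(expert_maps[0][0])
    mixed = [[0.0] * width for _ in range(height)]
    for weight, saliency in zip(weights, expert_maps):
        for r in range(height):
            for c in range(width):
                mixed[r][c] += weight * saliency[r][c]
    return mixed


# Hypothetical toy inputs: two 2x2 expert outputs with equal gating scores,
# so each expert contributes half of the final map.
expert_a = [[1.0, 0.0], [0.0, 0.0]]
expert_b = [[0.0, 0.0], [0.0, 1.0]]
mixed = mix_saliency([expert_a, expert_b], [0.0, 0.0])
```

Because the weights sum to one, the mixed map stays in the same value range as the expert maps. A gating score that strongly favors one expert makes the output approach that expert's map alone.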

