Article

A Dilated Inception Network for Visual Saliency Prediction

IEEE TRANSACTIONS ON MULTIMEDIA (2020)

Journal

IEEE TRANSACTIONS ON MULTIMEDIA

Volume 22, Issue 8, Pages 2163-2176

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TMM.2019.2947352

Keywords

Visualization; Computational modeling; Predictive models; Feature extraction; Spatial resolution; Computer architecture; Solid modeling; Visual attention; saliency detection; eye fixation prediction; convolutional neural networks; dilated convolution; inception module

Categories

Computer Science, Information Systems; Computer Science, Software Engineering; Telecommunications

Funding

Singapore Ministry of Education Tier-2 Fund [MOE2016-T2-2-057(S)]
Natural Science Foundation of China [61901236]
NTU start-up grant
MOE Tier-1 Research Grant [RG126/17 (S)]

Abstract

With the advent of deep convolutional neural networks (DCNNs), visual saliency prediction has improved markedly. One promising direction for further improvement is to characterize the multi-scale factors that influence saliency using a computationally friendly module within DCNN architectures. In this work, we propose an end-to-end dilated inception network (DINet) for visual saliency prediction. It captures multi-scale contextual features effectively with very few extra parameters. Instead of using parallel standard convolutions with different kernel sizes, as the existing inception module does, our proposed dilated inception module (DIM) uses parallel dilated convolutions with different dilation rates, which significantly reduces the computational load while enriching the diversity of receptive fields in the feature maps. Moreover, the performance of our saliency model is further improved by using a set of linear normalization-based probability distribution distance metrics as loss functions. This lets us formulate saliency prediction as a global probability distribution prediction task, rather than a pixel-wise regression problem, for better saliency inference. Experimental results on several challenging saliency benchmark datasets demonstrate that DINet with the proposed loss functions achieves state-of-the-art performance with shorter inference time.
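
The two ideas in the abstract can be illustrated with a small back-of-the-envelope sketch. It is not the authors' implementation. The channel widths, the kernel sizes (3, 5, 7), and the dilation rates (1, 2, 3) are assumptions chosen only so that both designs cover the same spatial extent; the abstract does not state the actual settings. Likewise, KL divergence is used here as one plausible example of a probability distribution distance, since the abstract does not name the specific metrics. The helper names (`conv_params`, `effective_kernel`, `linear_normalize`, `kl_divergence`) are hypothetical.

```python
import math


def conv_params(c_in, c_out, k):
    """Weights plus biases of one k x k convolution layer."""
    return k * k * c_in * c_out + c_out


def effective_kernel(k, dilation):
    """Side length of the region spanned by a k x k kernel at a given dilation."""
    return k + (k - 1) * (dilation - 1)


# Assumed (illustrative) settings, not taken from the paper.
C_IN, C_OUT = 256, 256
standard_kernels = [3, 5, 7]   # inception-style branches: growing kernel sizes
dilation_rates = [1, 2, 3]     # dilated branches: 3 x 3 kernels throughout

# Parameter cost of three parallel branches under each design.
standard_total = sum(conv_params(C_IN, C_OUT, k) for k in standard_kernels)
dilated_total = sum(conv_params(C_IN, C_OUT, 3) for _ in dilation_rates)

# Spatial extent covered by each dilated 3 x 3 branch.
coverage = [effective_kernel(3, d) for d in dilation_rates]


def linear_normalize(values, eps=1e-12):
    """Divide non-negative values by their sum so they form a distribution."""
    total = sum(values) + eps
    return [v / total for v in values]


def kl_divergence(target, prediction, eps=1e-12):
    """KL(target || prediction) between two flattened non-negative maps.

    Both maps are linearly normalized first, so the loss compares whole
    distributions rather than individual pixel values.
    """
    p = linear_normalize(target)
    q = linear_normalize(prediction)
    return sum(pi * math.log(pi / (qi + eps) + eps) for pi, qi in zip(p, q))


if __name__ == "__main__":
    print("standard branches, parameters:", standard_total)
    print("dilated branches, parameters: ", dilated_total)
    print("extent covered by dilated branches:", coverage)
    print("KL, peaked target vs uniform prediction:",
          kl_divergence([4, 0, 0, 0], [1, 1, 1, 1]))
```

Under these assumed settings, the dilated branches span the same 3, 5, and 7 pixel extents as the larger standard kernels while keeping the parameter count of three 3 x 3 layers. Because both maps are normalized before comparison, rescaling a prediction by a constant leaves the loss unchanged, which is the sense in which the task becomes distribution prediction rather than pixel-wise regression.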
