Article

DeepRare: Generic Unsupervised Visual Attention Models

Journal

ELECTRONICS
Volume 11, Issue 11, Pages -

Publisher

MDPI
DOI: 10.3390/electronics11111696

Keywords

eye tracking; deep features; odd one out; rarity; saliency; visual attention prediction; visibility

Funding

  1. ARES-CCD (program AI 2014-2019) by Belgian university cooperation

Abstract

The article introduces a new visual attention model, DeepRare2021 (DR21), which combines the advantages of deep learning and feature engineering. Unlike traditional DNN models, DR21 extracts surprising or unusual data efficiently and generically, requires no additional training, and performs well on multiple eye-tracking datasets.
Visual attention selects data considered interesting by humans; in engineering it is modeled by feature-engineered methods that find contrasted, surprising, or unusual image data. Deep learning drastically improved the models' efficiency on the main benchmark datasets. However, Deep-Neural-Network-based (DNN-based) models are counterintuitive: surprising or unusual data are by definition difficult to learn because of their low occurrence probability. In practice, DNN-based models mainly learn top-down features such as faces, text, people, or animals, which usually attract human attention, but they have low efficiency in extracting surprising or unusual data from images. In this article, we propose a new family of visual attention models called DeepRare, and in particular DeepRare2021 (DR21), which combines the power of DNN feature extraction with the genericity of feature-engineered algorithms. The algorithm is an evolution of a previous version, DeepRare2019 (DR19), built on the same framework. DR21 (1) needs no training other than the default ImageNet training, (2) is fast even on CPU, and (3) is tested on four very different eye-tracking datasets, showing that DR21 is generic and always within the top models on all datasets and metrics, while no other model exhibits such regularity and genericity. Finally, DR21 (4) is tested with several network architectures, such as VGG16 (V16), VGG19 (V19), and MobileNetV2 (MN2), and (5) it provides explanation and transparency about which parts of the image are most surprising at different levels, despite the use of a DNN-based feature extractor.
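The core "rarity" idea behind DeepRare — rare feature responses attract attention — can be illustrated with a minimal sketch. This is not the authors' implementation (which applies rarity to multi-level DNN feature maps and fuses the results); it only shows the underlying mechanism on a single map, scoring each pixel by the self-information −log p of its value bin so that low-probability ("odd one out") responses stand out:

```python
import numpy as np

def rarity_map(feature_map, bins=16):
    """Score each pixel by the self-information -log(p) of its value bin.

    Low-probability (rare) feature responses get high scores, which is
    the core rarity principle behind DeepRare-style saliency.
    """
    flat = feature_map.ravel()
    hist, edges = np.histogram(flat, bins=bins)
    p = hist / flat.size
    # np.digitize uses the same edges as np.histogram; clip folds the
    # closed right edge of the last bin back onto index bins-1.
    idx = np.clip(np.digitize(flat, edges) - 1, 0, bins - 1)
    return (-np.log(p[idx])).reshape(feature_map.shape)

# A mostly uniform map with one odd patch: the patch should be rarest.
fm = np.zeros((32, 32))
fm[14:18, 14:18] = 1.0  # the "odd one out"
sal = rarity_map(fm)
assert sal[15, 15] > sal[0, 0]  # rare patch outscores the background
```

In the full DR21 pipeline, maps like this are computed per channel on features taken from an ImageNet-pretrained backbone (e.g. VGG16) at several depths and then fused, which is what makes the per-level rarity maps inspectable and the model transparent.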

