3.8 Proceedings Paper

Do Humans Look Where Deep Convolutional Neural Networks Attend?

期刊

出版社

SPRINGER INTERNATIONAL PUBLISHING AG
DOI: 10.1007/978-3-030-33723-0_5

关键词

Visual spatial attention; Computer vision; Convolutional Neural Networks; Densely connected attention maps; Class Activation Maps; Sensitivity analysis

向作者/读者索取更多资源

Deep Convolutional Neural Networks (CNNs) have recently begun to exhibit human level performance on some visual perception tasks. Performance remains relatively poor, however, on some vision tasks, such as object detection: specifying the location and object class for all objects in a still image. We hypothesized that this gap in performance may be largely due to the fact that humans exhibit selective attention, while most object detection CNNs have no corresponding mechanism. In examining this question, we investigated some well-known attention mechanisms in the deep learning literature, identifying their weaknesses and leading us to propose a novel attention algorithm called the Densely Connected Attention Model. We then measured human spatial attention, in the form of eye tracking data, during the performance of an analogous object detection task. By comparing the learned representations produced by various CNN architectures with that exhibited by human viewers, we identified some relative strengths and weaknesses of the examined computational attention mechanisms. Some CNNs produced attentional patterns somewhat similar to those of humans. Others focused processing on objects in the foreground. Still other CNN attentional mechanisms produced usefully interpretable internal representations. The resulting comparisons provide insights into the relationship between CNN attention algorithms and the human visual system.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

3.8
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据