Article

Correspondence between monkey visual cortices and layers of a saliency map model based on a deep convolutional neural network for representations of natural images

Journal

ENEURO
Volume 8, Issue 1, Pages -

Publisher

SOC NEUROSCIENCE
DOI: 10.1523/ENEURO.0200-20.2020

Recent studies have shown that saliency map models based on deep convolutional neural networks (DCNNs) perform well in predicting attentional selection and human gaze. However, the correspondence between the artificial and neural representations used to determine attentional selection and gaze location remains unknown. Trained DCNNs may offer insight into the perceptual mechanisms of biological visual systems, in particular the role of neural representations in V1 in computing saliency for attentional selection.
Attentional selection is a function that allocates the brain's computational resources to the most important part of a visual scene at a specific moment. Saliency map models have been proposed as computational models to predict attentional selection within a spatial location. Recent saliency map models based on deep convolutional neural networks (DCNNs) exhibit the highest performance for predicting the location of attentional selection and human gaze, which reflect overt attention. Trained DCNNs potentially provide insight into the perceptual mechanisms of biological visual systems. However, the relationship between artificial and neural representations used for determining attentional selection and gaze location remains unknown. To understand the mechanism underlying saliency map models based on DCNNs and the neural system of attentional selection, we investigated the correspondence between layers of a DCNN saliency map model and monkey visual areas for natural image representations. We compared the characteristics of the responses in each layer of the model with those of the neural representation in the primary visual (V1), intermediate visual (V4), and inferior temporal cortices. Regardless of the DCNN layer level, the characteristics of the responses were consistent with those of the neural representation in V1. We found marked peaks of correspondence between V1 and the early-level and higher-intermediate-level layers of the model. These results provide insight into the mechanism of the trained DCNN saliency map model and suggest that the neural representations in V1 play an important role in computing the saliency that mediates attentional selection, which supports the V1 saliency hypothesis.
