Journal
INTERNATIONAL JOURNAL OF COMPUTER VISION
Volume 113, Issue 1, Pages 67-79
Publisher
SPRINGER
DOI: 10.1007/s11263-014-0765-x
Keywords
Deep learning; Attention-based recognition; Neural networks; Neural autoregressive distribution estimator
Funding
- Natural Sciences and Engineering Research Council of Canada
- National Natural Science Foundation [NNSF-61171118]
- Ministry of Education of China [SRFDP-20110002110057]
Abstract
Tasks that require the synchronization of perception and action are notoriously difficult and pose a fundamental challenge to the fields of machine learning and computer vision. One important example of such a task is the problem of performing visual recognition through a sequence of controllable fixations; this requires jointly deciding what inference to perform from the fixations and where to perform them. While these two problems are challenging when addressed separately, they become even more formidable when solved jointly. Recently, a restricted Boltzmann machine (RBM) model was proposed that could learn meaningful fixation policies and achieve good recognition performance. In this paper, we propose an alternative approach based on a feed-forward, autoregressive architecture, which, unlike the RBM model, permits exact computation of training gradients (given the fixation sequence). On a facial expression recognition problem, we demonstrate the improvement gained by this alternative approach. Additionally, we investigate several variations of the model in order to shed some light on successful strategies for fixation-based recognition.
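The exact-gradient property claimed for the feed-forward autoregressive architecture comes from the fact that, in a NADE-style model, the log-likelihood decomposes into a product of tractable conditionals, so no partition function or sampling is needed. The sketch below illustrates this for a generic NADE over binary vectors (parameter names and shapes are illustrative assumptions, not the paper's exact architecture, which conditions on fixation sequences):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def nade_log_prob(v, W, V, b, c):
    """Exact log p(v) under a NADE-style autoregressive model.

    p(v) = prod_i p(v_i | v_<i), with
        h_i            = sigmoid(c + W[:, :i] @ v[:i])
        p(v_i=1|v_<i)  = sigmoid(b_i + V[i] @ h_i)

    Hypothetical parameter shapes: W (H, D), V (D, H), b (D,), c (H,).
    """
    D = len(v)
    a = c.copy()   # running pre-activation; updated in O(H) per step
    log_p = 0.0
    for i in range(D):
        h = sigmoid(a)                      # hidden state from v_<i
        p_i = sigmoid(b[i] + V[i] @ h)      # exact conditional for v_i
        log_p += np.log(p_i) if v[i] == 1 else np.log(1.0 - p_i)
        a += W[:, i] * v[i]                 # fold v_i into the context
    return log_p
```

Because `nade_log_prob` is a finite feed-forward computation, its gradients with respect to `W`, `V`, `b`, and `c` follow from ordinary backpropagation; this is the contrast with the RBM, whose likelihood gradient involves an intractable partition function that must be approximated (e.g., by contrastive divergence).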