Proceedings Paper

Inferring and Executing Programs for Visual Reasoning

2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV) (2017)

Journal

2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV)

Volume -, Issue -, Pages 3008-3017

Publisher

IEEE

DOI: 10.1109/ICCV.2017.325

Categories

Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic

Funding

ONR MURI grant

Abstract

Existing methods for visual reasoning attempt to directly map inputs to outputs using black-box architectures without explicitly modeling the underlying reasoning processes. As a result, these black-box models often learn to exploit biases in the data rather than learning to perform visual reasoning. Inspired by module networks, this paper proposes a model for visual reasoning that consists of a program generator that constructs an explicit representation of the reasoning process to be performed, and an execution engine that executes the resulting program to produce an answer. Both the program generator and the execution engine are implemented by neural networks, and are trained using a combination of backpropagation and REINFORCE. Using the CLEVR benchmark for visual reasoning, we show that our model significantly outperforms strong baselines and generalizes better in a variety of settings.
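The two-stage design described in the abstract (first produce an explicit program, then execute it to get an answer) can be illustrated with a small symbolic sketch. This is a toy stand-in under stated assumptions, not the paper's implementation: in the paper both stages are neural networks, whereas here the generator is a fixed lookup and the modules are hand-written Python functions. The scene format, the module names, and the helper names `generate_program` and `execute` are all hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical symbolic scene; the actual model works on images, not dicts.
Scene = List[Dict[str, str]]
SCENE: Scene = [
    {"shape": "cube", "color": "red"},
    {"shape": "sphere", "color": "red"},
    {"shape": "cube", "color": "blue"},
]


def make_filter(attr: str, value: str) -> Callable[[Scene], Scene]:
    """Build a module that keeps only objects whose attribute matches."""
    return lambda objs: [o for o in objs if o[attr] == value]


# Hand-written modules standing in for learned neural modules.
# The names are illustrative and not taken from the paper.
MODULES: Dict[str, Callable] = {
    "filter_color[red]": make_filter("color", "red"),
    "filter_shape[cube]": make_filter("shape", "cube"),
    "count": lambda objs: len(objs),
}


def generate_program(question: str) -> List[str]:
    """Stand-in for the program generator: a fixed lookup, not a learned model."""
    lookup = {
        "How many red cubes are there?": [
            "filter_color[red]",
            "filter_shape[cube]",
            "count",
        ],
    }
    return lookup[question]


def execute(program: List[str], scene: Scene):
    """Stand-in for the execution engine: apply each module to the running state."""
    state = scene
    for name in program:
        state = MODULES[name](state)
    return state


program = generate_program("How many red cubes are there?")
answer = execute(program, SCENE)
print(program, answer)
```

Because the program is an explicit intermediate object, the reasoning steps behind an answer can be inspected directly, which is the contrast with black-box architectures that the abstract draws.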
