Article

Accuracy vs. complexity: A trade-off in visual question answering models

Related references

Note: Only some of the references are listed.
Article Computer Science, Artificial Intelligence

Probabilistic framework for solving visual dialog

Badri N. Patro et al.

Summary: This paper proposes a probabilistic framework for the Visual Dialog task that aims to identify and analyze the sources of uncertainty in solving it. The framework leads to an improved and more explainable visual dialog system.

PATTERN RECOGNITION (2021)

Article Computer Science, Artificial Intelligence

Dual self-attention with co-attention networks for visual question answering

Yun Liu et al.

Summary: Visual Question Answering (VQA) is an important task in understanding vision and language. The paper proposes a novel model, DSACA, which uses dual self-attention with co-attention networks to address the problem of integrating local features with global dependencies.

PATTERN RECOGNITION (2021)

Article Computer Science, Artificial Intelligence

From known to the unknown: Transferring knowledge to answer questions about novel visual and semantic concepts

Moshiur R. Farazi et al.

IMAGE AND VISION COMPUTING (2020)

Article Computer Science, Artificial Intelligence

Cross-modal knowledge reasoning for knowledge-based visual question answering

Jing Yu et al.

PATTERN RECOGNITION (2020)

Article Computer Science, Artificial Intelligence

Improving visual question answering using dropout and enhanced question encoder

Zhiwei Fang et al.

PATTERN RECOGNITION (2019)

Article Computer Science, Artificial Intelligence

FVQA: Fact-Based Visual Question Answering

Peng Wang et al.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2018)

Article Computer Science, Artificial Intelligence

Recent advances in convolutional neural networks

Jiuxiang Gu et al.

PATTERN RECOGNITION (2018)

Article Computer Science, Artificial Intelligence

Beyond Bilinear: Generalized Multimodal Factorized High-Order Pooling for Visual Question Answering

Zhou Yu et al.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2018)

Article Computer Science, Artificial Intelligence

Image Understanding using vision and reasoning through Scene Description Graph

Somak Aditya et al.

COMPUTER VISION AND IMAGE UNDERSTANDING (2018)

Article Computer Science, Artificial Intelligence

Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations

Ranjay Krishna et al.

INTERNATIONAL JOURNAL OF COMPUTER VISION (2017)

Article Computer Science, Hardware & Architecture

YFCC100M: The New Data in Multimedia Research

Bart Thomee et al.

COMMUNICATIONS OF THE ACM (2016)