4.7 Article

Research on visual question answering based on dynamic memory network model of multiple attention mechanisms

Related references

Note: Only a subset of the references is listed.
Article Computer Science, Artificial Intelligence

Visual question answering: Which investigated applications?

Silvio Barra et al.

Summary: Visual Question Answering (VQA) is a challenging research area that requires combining computer vision and natural language processing capabilities. Unlike other visual tasks, VQA requires relating the semantic content of an image or video to a question posed in natural language. Recent research has focused on image processing, language processing methods, and approaches to multimodal information fusion (a minimal sketch of this pipeline follows this entry).

PATTERN RECOGNITION LETTERS (2021)
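
The pipeline implied by the summary above (encode the image, encode the question, fuse the two modalities, classify over a fixed answer vocabulary) can be sketched roughly as follows. This is an illustrative sketch only, not the model of any paper listed here; all module names, feature dimensions, and the element-wise fusion choice are assumptions.

    # Hypothetical minimal VQA pipeline: image branch + question branch,
    # element-wise fusion, answer classification. All sizes are illustrative.
    import torch
    import torch.nn as nn

    class MinimalVQA(nn.Module):
        def __init__(self, vocab_size=10000, embed_dim=300, hidden_dim=512,
                     image_dim=2048, num_answers=3000):
            super().__init__()
            # Question branch: word embeddings followed by a GRU encoder.
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.gru = nn.GRU(embed_dim, hidden_dim, batch_first=True)
            # Image branch: project pre-extracted CNN features (e.g. 2048-d
            # pooled features) into the same space as the question encoding.
            self.img_proj = nn.Linear(image_dim, hidden_dim)
            # Fusion result is classified over a fixed answer vocabulary.
            self.classifier = nn.Sequential(
                nn.Linear(hidden_dim, hidden_dim),
                nn.ReLU(),
                nn.Linear(hidden_dim, num_answers),
            )

        def forward(self, image_feats, question_tokens):
            # question_tokens: (batch, seq_len) integer word indices
            _, q_hidden = self.gru(self.embed(question_tokens))
            q_vec = q_hidden[-1]                             # (batch, hidden_dim)
            v_vec = torch.relu(self.img_proj(image_feats))   # (batch, hidden_dim)
            fused = q_vec * v_vec                            # element-wise fusion
            return self.classifier(fused)                    # (batch, num_answers) logits

    # Usage with random tensors standing in for real features and token ids:
    model = MinimalVQA()
    logits = model(torch.randn(4, 2048), torch.randint(0, 10000, (4, 12)))
    print(logits.shape)  # torch.Size([4, 3000])

The element-wise product is only the simplest fusion choice; the MUTAN paper listed below replaces it with a Tucker-decomposed bilinear interaction, and attention-based models (e.g. SCA-CNN below) replace the single pooled image vector with attended region features.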

Article Computer Science, Information Systems

Deep Memory Network for Cross-Modal Retrieval

Ge Song et al.

IEEE TRANSACTIONS ON MULTIMEDIA (2019)

Article Computer Science, Information Systems

Know More Say Less: Image Captioning Based on Scene Graphs

Xiangyang Li et al.

IEEE TRANSACTIONS ON MULTIMEDIA (2019)

Article Computer Science, Artificial Intelligence

Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations

Ranjay Krishna et al.

INTERNATIONAL JOURNAL OF COMPUTER VISION (2017)

Proceedings Paper Computer Science, Artificial Intelligence

Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering

Yash Goyal et al.

30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017) (2017)

Proceedings Paper Computer Science, Artificial Intelligence

SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning

Long Chen et al.

30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017) (2017)

Proceedings Paper Computer Science, Artificial Intelligence

MUTAN: Multimodal Tucker Fusion for Visual Question Answering

Hedi Ben-younes et al.

2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV) (2017)