Related references
Note: Only some of the references are listed.
Article
Computer Science, Artificial Intelligence
Xiangpeng Li et al.
Summary: This study addresses the TextVQA problem and proposes a novel Text-Instance Graph (TIG) network for it. TIG models relationships between objects by building an OCR-OBJ graph and introduces a dynamic OCR-OBJ graph network to handle questions involving complex logic. Experimental results show that the proposed method outperforms existing approaches.
PATTERN RECOGNITION
(2022)
Article
Computer Science, Artificial Intelligence
Zongwen Bai et al.
Summary: This study proposes a comprehensive approach to compressing and accelerating Visual Question Answering systems. Using several decomposition methods and regression strategies, it compresses the fully connected layers of the Convolutional Neural Network and Long Short-Term Memory components, achieving high compression ratios with minimal loss of accuracy.
PATTERN RECOGNITION
(2021)
Article
Computer Science, Artificial Intelligence
Badri N. Patro et al.
Summary: This paper proposes a probabilistic framework for the Visual Dialog task, aiming to identify and analyze the sources of uncertainty in solving it. The framework leads to an improved and more explainable visual dialog system.
PATTERN RECOGNITION
(2021)
Article
Computer Science, Artificial Intelligence
Lei Zhao et al.
Summary: Visual dialog is a task involving two agents communicating in natural language with information asymmetry. A novel approach based on an attentive memory network is proposed to fully utilize image and historical dialog information. Experimental results demonstrate the effectiveness of this method in the visual dialog task, outperforming existing state-of-the-art methods.
PATTERN RECOGNITION
(2021)
Article
Computer Science, Artificial Intelligence
Moshiur Farazi et al.
Summary: This paper systematically studies the trade-off between model complexity and performance in VQA models, with a specific focus on the impact of multi-modal fusion. Through thorough experimental evaluation, three proposals are presented, optimized for minimal complexity, balanced complexity-accuracy, and state-of-the-art VQA performance.
PATTERN RECOGNITION
(2021)
Article
Computer Science, Artificial Intelligence
Silvio Barra et al.
Summary: Visual Question Answering (VQA) is a challenging research area that requires a combination of computer vision and natural language processing abilities. Unlike other visual tasks, VQA requires comparing the semantics of images or videos with questions posed in natural language. Recent research has focused on image processing, language processing methods, and approaches to information fusion.
PATTERN RECOGNITION LETTERS
(2021)
Article
Computer Science, Artificial Intelligence
Yun Liu et al.
Summary: Visual Question Answering (VQA) is an important task in understanding vision and language. A novel model, DSACA, is proposed to address the problem of integrating local features with global dependencies, using dual self-attention with co-attention networks.
PATTERN RECOGNITION
(2021)
Article
Computer Science, Artificial Intelligence
Feng Liu et al.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
(2020)
Article
Computer Science, Artificial Intelligence
Jing Yu et al.
PATTERN RECOGNITION
(2020)
Article
Computer Science, Artificial Intelligence
Zhiwei Fang et al.
PATTERN RECOGNITION
(2019)
Article
Computer Science, Artificial Intelligence
Peng Wang et al.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
(2018)
Article
Computer Science, Artificial Intelligence
Qi Wu et al.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
(2018)
Article
Multidisciplinary Sciences
Donald Geman et al.
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
(2015)