4.5 Article

Sparse co-attention visual question answering networks based on thresholds

Related references

Note: Only some of the references are listed.
Review Computer Science, Theory & Methods

Deep Learning-based Text Classification: A Comprehensive Review

Shervin Minaee et al.

Summary: This article reviews more than 150 deep learning-based models for text classification developed in recent years. It discusses their technical contributions, similarities, and strengths, and summarizes popular datasets used for text classification. The article also includes a quantitative analysis of the performance of different deep learning models on popular benchmarks and discusses future research directions.

ACM COMPUTING SURVEYS (2022)

Article Computer Science, Artificial Intelligence

Rich Visual Knowledge-Based Augmentation Network for Visual Question Answering

Liyang Zhang et al.

Summary: The new framework KAN utilizes object-related knowledge and a knowledge graph to assist in the reasoning process of VQA, with an attention module that adaptively balances the importance of external knowledge against detected objects. Extensive experiments demonstrate that KAN achieves state-of-the-art performance on challenging VQA datasets and provides benefits to VQA baselines.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2021)

Article Acoustics

Gated Recurrent Fusion With Joint Training Framework for Robust End-to-End Speech Recognition

Cunhang Fan et al.

Summary: This paper proposes a gated recurrent fusion (GRF) method with a joint training framework for robust end-to-end automatic speech recognition. The method dynamically combines noisy and enhanced features to alleviate speech distortion and improve performance.

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING (2021)

Article Computer Science, Artificial Intelligence

Text classification using capsules

Jaeyoung Kim et al.

NEUROCOMPUTING (2020)

Article Computer Science, Information Systems

Reasoning on the Relation: Enhancing Visual Representation for Visual Question Answering and Cross-Modal Retrieval

Jing Yu et al.

IEEE TRANSACTIONS ON MULTIMEDIA (2020)

Proceedings Paper Acoustics

Transformer-Based Online CTC/Attention End-to-End Speech Recognition Architecture

Haoran Miao et al.

2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (2020)

Article Computer Science, Artificial Intelligence

An Ensemble of Generation- and Retrieval-Based Image Captioning With Dual Generator Generative Adversarial Network

Min Yang et al.

IEEE TRANSACTIONS ON IMAGE PROCESSING (2020)

Article Computer Science, Information Systems

Multimodal Encoder-Decoder Attention Networks for Visual Question Answering

Chongqing Chen et al.

IEEE ACCESS (2020)

Article Acoustics

Unsupervised Neural Machine Translation With Cross-Lingual Language Representation Agreement

Haipeng Sun et al.

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING (2020)

Article Acoustics

Towards More Diverse Input Representation for Neural Machine Translation

Kehai Chen et al.

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING (2020)

Article Computer Science, Artificial Intelligence

Learning Two-Branch Neural Networks for Image-Text Matching Tasks

Liwei Wang et al.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2019)

Article Computer Science, Artificial Intelligence

Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering

Yash Goyal et al.

INTERNATIONAL JOURNAL OF COMPUTER VISION (2019)

Article Computer Science, Artificial Intelligence

Topic-Oriented Image Captioning Based on Order-Embedding

Niange Yu et al.

IEEE TRANSACTIONS ON IMAGE PROCESSING (2019)

Article Computer Science, Information Systems

Co-Attention Network With Question Type for Visual Question Answering

Chao Yang et al.

IEEE ACCESS (2019)

Article Computer Science, Artificial Intelligence

VQA: Visual Question Answering

Aishwarya Agrawal et al.

INTERNATIONAL JOURNAL OF COMPUTER VISION (2017)

Article Computer Science, Artificial Intelligence

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Shaoqing Ren et al.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2017)

Article Computer Science, Artificial Intelligence

Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations

Ranjay Krishna et al.

INTERNATIONAL JOURNAL OF COMPUTER VISION (2017)