4.7 Article

Learning joint relationship attention network for image captioning

Related references

Note: Only some of the references are listed; download the original article for the complete reference information.
Article Computer Science, Artificial Intelligence

Image captioning with adaptive incremental global context attention

Changzhi Wang et al.

Summary: The paper proposes a novel adaptive incremental global context attention (IGCA) method to address the issue of capturing global information in image captioning tasks, and conducts extensive experiments on multiple public datasets, showing significant improvement and achieving state-of-the-art performance.

APPLIED INTELLIGENCE (2022)

Article Computer Science, Information Systems

Visual relationship detection with region topology structure

Le Zhang et al.

Summary: Visual relationship detection is crucial for scene understanding. The proposed method uses visual, positional, and semantic features, as well as region topology structure, and achieves good performance on visual relationship detection tasks.

INFORMATION SCIENCES (2021)

Article Computer Science, Artificial Intelligence

Image captioning with transformer and knowledge graph

Yu Zhang et al.

Summary: This paper applies the Transformer model to image captioning tasks and improves its performance in two aspects by adding a KL divergence term and leveraging knowledge graphs. Experimental results on benchmark datasets show the effectiveness of the proposed method.

PATTERN RECOGNITION LETTERS (2021)

Article Engineering, Electrical & Electronic

Noise Augmented Double-Stream Graph Convolutional Networks for Image Captioning

Lingxiang Wu et al.

Summary: The proposed NADGCN model utilizes grid-stream GCN as a supplement to the region stream and enhances the generalization of the language model by adding a noise module. Experimental results show that it outperforms the comparative baseline models.

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (2021)

Article Engineering, Electrical & Electronic

Learning Dual Semantic Relations With Graph Attention for Image-Text Matching

Keyu Wen et al.

Summary: In this work, a novel multi-level semantic relations enhancement approach named DSRAN is proposed to address the issue of mismatch between regional features and global features in image-text matching. DSRAN consists of two modules, performing graph attention for region-level relations enhancement and regional-global relations enhancement simultaneously. The experimental results show that DSRAN outperforms previous approaches by a large margin, demonstrating the effectiveness of the dual semantic relations learning scheme.

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (2021)

Article Computer Science, Artificial Intelligence

Deep Relation Embedding for Cross-Modal Retrieval

Yifan Zhang et al.

Summary: A Cross-modal Relation Guided Network (CRGN) is proposed for cross-modal retrieval, measuring the similarity between images and text sentences. The relation between image regions is modeled by learning with global feature guidance and sentence generation, leading to efficient retrieval between images and text.

IEEE TRANSACTIONS ON IMAGE PROCESSING (2021)

Article Computer Science, Information Systems

Fine-Grained Image Captioning With Global-Local Discriminative Objective

Jie Wu et al.

Summary: In the field of image captioning, a novel global-local discriminative objective is proposed to generate fine-grained descriptive captions. The method outperforms baseline methods on the widely used MS-COCO dataset and competes with existing leading approaches.

IEEE TRANSACTIONS ON MULTIMEDIA (2021)

Article Computer Science, Information Systems

Integrating Part of Speech Guidance for Image Captioning

Ji Zhang et al.

Summary: The paper proposes an integrated image captioning method that incorporates part of speech information, using a part of speech prediction network within an encoder-decoder framework, and multi-task learning to generate captions with more accurate visual information and better compliance with language habits and grammar rules.

IEEE TRANSACTIONS ON MULTIMEDIA (2021)

Article Computer Science, Artificial Intelligence

Stimulus-driven and concept-driven analysis for image caption generation

Songtao Ding et al.

NEUROCOMPUTING (2020)

Article Computer Science, Artificial Intelligence

Learning visual relationship and context-aware attention for image captioning

Junbo Wang et al.

PATTERN RECOGNITION (2020)

Article Computer Science, Information Systems

Constrained LSTM and Residual Attention for Image Captioning

Liang Yang et al.

ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS (2020)

Article Computer Science, Artificial Intelligence

Image captioning with semantic-enhanced features and extremely hard negative examples

Wenjie Cai et al.

NEUROCOMPUTING (2020)

Article Computer Science, Artificial Intelligence

Re-Caption: Saliency-Enhanced Image Captioning Through Two-Phase Learning

Lian Zhou et al.

IEEE TRANSACTIONS ON IMAGE PROCESSING (2020)

Article Computer Science, Artificial Intelligence

VD-SAN: Visual-Densely Semantic Attention Network for Image Caption Generation

Xinwei He et al.

NEUROCOMPUTING (2019)

Article Computer Science, Artificial Intelligence

A multimodal fusion approach for image captioning

Dexin Zhao et al.

NEUROCOMPUTING (2019)

Article Automation & Control Systems

Web image annotation based on Tri-relational Graph and semantic context analysis

Jing Zhang et al.

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE (2019)

Article Computer Science, Information Systems

Know More Say Less: Image Captioning Based on Scene Graphs

Xiangyang Li et al.

IEEE TRANSACTIONS ON MULTIMEDIA (2019)

Article Computer Science, Information Systems

Deep Hierarchical Encoder-Decoder Network for Image Captioning

Xinyu Xiao et al.

IEEE TRANSACTIONS ON MULTIMEDIA (2019)

Proceedings Paper Computer Science, Artificial Intelligence

Hierarchy Parsing for Image Captioning

Ting Yao et al.

2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019) (2019)

Proceedings Paper Computer Science, Artificial Intelligence

Look Back and Predict Forward in Image Captioning

Yu Qin et al.

2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) (2019)

Proceedings Paper Computer Science, Artificial Intelligence

Self-critical n-step Training for Image Captioning

Junlong Gao et al.

2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) (2019)

Article Computer Science, Artificial Intelligence

Attentive Linear Transformation for Image Captioning

Senmao Ye et al.

IEEE TRANSACTIONS ON IMAGE PROCESSING (2018)

Article Computer Science, Information Systems

GLA: Global-Local Attention for Image Description

Linghui Li et al.

IEEE TRANSACTIONS ON MULTIMEDIA (2018)

Article Computer Science, Artificial Intelligence

Image captioning with triple-attention and stack parallel LSTM

Xinxin Zhu et al.

NEUROCOMPUTING (2018)

Article Computer Science, Artificial Intelligence

Deep Visual-Semantic Alignments for Generating Image Descriptions

Andrej Karpathy et al.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2017)

Proceedings Paper Computer Science, Artificial Intelligence

SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning

Long Chen et al.

30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017) (2017)

Article Automation & Control Systems

A feature selection method for author identification in interactive communications based on supervised learning and language typicality

Esther Villar-Rodriguez et al.

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE (2016)

Proceedings Paper Computer Science, Artificial Intelligence

VQA: Visual Question Answering

Stanislaw Antol et al.

2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV) (2015)