Related references
Note: Only part of the references are listed.Task-Adaptive Attention for Image Captioning
Chenggang Yan et al.
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (2022)
Deep features for person re-identification on metric learning
Wanyin Wu et al.
PATTERN RECOGNITION (2021)
Enhancing the alignment between target words and corresponding frames for video captioning
Yunbin Tu et al.
PATTERN RECOGNITION (2021)
Improving Video Captioning with Temporal Composition of a Visual-Syntactic Embedding
Jesus Perez-Martin et al.
2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WACV 2021 (2021)
Vocabulary-Wide Credit Assignment for Training Image Captioning Models
Han Liu et al.
IEEE TRANSACTIONS ON IMAGE PROCESSING (2021)
Cross-Domain Image Captioning via Cross-Modal Retrieval and Model Adaptation
Wentian Zhao et al.
IEEE TRANSACTIONS ON IMAGE PROCESSING (2021)
Integrating Part of Speech Guidance for Image Captioning
Ji Zhang et al.
IEEE TRANSACTIONS ON MULTIMEDIA (2021)
STAT: Spatial-Temporal Attention Mechanism for Video Captioning
Chenggang Yan et al.
IEEE TRANSACTIONS ON MULTIMEDIA (2020)
Learning visual relationship and context-aware attention for image captioning
Junbo Wang et al.
PATTERN RECOGNITION (2020)
An Ensemble of Generation- and Retrieval-Based Image Captioning With Dual Generator Generative Adversarial Network
Min Yang et al.
IEEE TRANSACTIONS ON IMAGE PROCESSING (2020)
Video Captioning With Object-Aware Spatio-Temporal Correlation and Aggregation
Junchao Zhang et al.
IEEE TRANSACTIONS ON IMAGE PROCESSING (2020)
Domain-Weighted Majority Voting for Crowdsourcing
Dapeng Tao et al.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2019)
CAM-RNN: Co-Attention Model Based RNN for Video Captioning
Bin Zhao et al.
IEEE TRANSACTIONS ON IMAGE PROCESSING (2019)
Watch, Listen and Tell: Multi-modal Weakly Supervised Dense Event Captioning
Tanzila Rahman et al.
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019) (2019)
Controllable Video Captioning with POS Sequence Guidance Based on Gated Fusion Network
Bairui Wang et al.
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019) (2019)
Video Captioning by Adversarial LSTM
Yang Yang et al.
IEEE TRANSACTIONS ON IMAGE PROCESSING (2018)
Task-Driven Dynamic Fusion: Reducing Ambiguity in Video Description
Xishan Zhang et al.
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017) (2017)
Attention-Based Multimodal Fusion for Video Description
Chiori Hori et al.
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV) (2017)
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
Joao Carreira et al.
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017) (2017)
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky et al.
INTERNATIONAL JOURNAL OF COMPUTER VISION (2015)