Related references
Note: Only part of the references are listed.
Exploiting Subspace Relation in Semantic Labels for Cross-Modal Hashing
Heng Tao Shen et al.
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING (2021)
Cross-Modal Attention With Semantic Consistence for Image-Text Matching
Xing Xu et al.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2020)
More Is Better: Precise and Detailed Image Captioning Using Online Positive Recall and Missing Concepts Mining
Mingxing Zhang et al.
IEEE TRANSACTIONS ON IMAGE PROCESSING (2019)
Describing Video With Attention-Based Bidirectional LSTM
Yi Bin et al.
IEEE TRANSACTIONS ON CYBERNETICS (2019)
Word-to-region attention network for visual question answering
Liang Peng et al.
MULTIMEDIA TOOLS AND APPLICATIONS (2019)
CRA-Net: Composed Relation Attention Network for Visual Question Answering
Liang Peng et al.
PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19) (2019)
Video Captioning by Adversarial LSTM
Yang Yang et al.
IEEE TRANSACTIONS ON IMAGE PROCESSING (2018)
Robust discrete code modeling for supervised hashing
Yadan Luo et al.
PATTERN RECOGNITION (2018)
Beyond Bilinear: Generalized Multimodal Factorized High-Order Pooling for Visual Question Answering
Zhou Yu et al.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2018)
Learning Discriminative Binary Codes for Large-scale Cross-modal Retrieval
Xing Xu et al.
IEEE TRANSACTIONS ON IMAGE PROCESSING (2017)
Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
Ranjay Krishna et al.
INTERNATIONAL JOURNAL OF COMPUTER VISION (2017)
An Analysis of Visual Question Answering Algorithms
Kushal Kafle et al.
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV) (2017)
Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering
Yash Goyal et al.
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017) (2017)
Adaptively Attending to Visual Attributes and Linguistic Knowledge for Captioning
Yi Bin et al.
PROCEEDINGS OF THE 2017 ACM MULTIMEDIA CONFERENCE (MM'17) (2017)
Bidirectional Long-Short Term Memory for Video Description
Yi Bin et al.
MM'16: PROCEEDINGS OF THE 2016 ACM MULTIMEDIA CONFERENCE (2016)
VQA: Visual Question Answering
Stanislaw Antol et al.
2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV) (2015)