Related References
Action-Centric Relation Transformer Network for Video Question Answering
Jipeng Zhang et al.
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (2022)
Learning Video Moment Retrieval Without a Single Annotated Video
Junyu Gao et al.
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (2022)
Human-Centric Spatio-Temporal Video Grounding With Visual Transformers
Zongheng Tang et al.
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (2022)
End-to-End Video Question-Answer Generation With Generator-Pretester Network
Hung-Ting Su et al.
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (2021)
Long-Term Video Question Answering via Multimodal Hierarchical Memory Attentive Networks
Ting Yu et al.
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (2021)
Fine-Grained Instance-Level Sketch-Based Video Retrieval
Peng Xu et al.
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (2021)
Structured Multi-Level Interaction Network for Video Moment Localization via Language Query
Hao Wang et al.
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 (2021)
Context-aware Biaffine Localizing Network for Temporal Sentence Grounding
Daizong Liu et al.
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 (2021)
Self-Guided Body Part Alignment With Relation Transformers for Occluded Person Re-Identification
Guanshuo Wang et al.
IEEE SIGNAL PROCESSING LETTERS (2021)
MABAN: Multi-Agent Boundary-Aware Network for Natural Language Moment Retrieval
Xiaoyang Sun et al.
IEEE TRANSACTIONS ON IMAGE PROCESSING (2021)
Local Correspondence Network for Weakly Supervised Temporal Sentence Grounding
Wenfei Yang et al.
IEEE TRANSACTIONS ON IMAGE PROCESSING (2021)
Contour-Aware Loss: Boundary-Aware Learning for Salient Object Segmentation
Zixuan Chen et al.
IEEE TRANSACTIONS ON IMAGE PROCESSING (2021)
Skeleton-Based Action Recognition With Focusing-Diffusion Graph Convolutional Networks
Jialin Gao et al.
IEEE SIGNAL PROCESSING LETTERS (2021)
Video Storytelling: Textual Summaries for Events
Junnan Li et al.
IEEE TRANSACTIONS ON MULTIMEDIA (2020)
Convolutional neural network with adaptive inferential framework for skeleton-based action recognition
Hong'en Huang et al.
JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION (2020)
Revisiting Anchor Mechanisms for Temporal Action Localization
Le Yang et al.
IEEE TRANSACTIONS ON IMAGE PROCESSING (2020)
Learning Semantics-Preserving Attention and Contextual Interaction for Group Activity Recognition
Yansong Tang et al.
IEEE TRANSACTIONS ON IMAGE PROCESSING (2019)
Breaking Winner-Takes-All: Iterative-Winners-Out Networks for Weakly Supervised Temporal Action Localization
Runhao Zeng et al.
IEEE TRANSACTIONS ON IMAGE PROCESSING (2019)
General Interaction-Aware Neural Network for Action Recognition
Jialin Gao et al.
PRICAI 2019: TRENDS IN ARTIFICIAL INTELLIGENCE, PT III (2019)
Language-driven Temporal Activity Localization: A Semantic Matching Reinforcement Learning Model
Weining Wang et al.
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) (2019)
Cross-Modal Interaction Networks for Query-Based Moment Retrieval in Videos
Zhu Zhang et al.
PROCEEDINGS OF THE 42ND INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '19) (2019)
DREAM: A Challenge Data Set and Models for Dialogue-Based Reading Comprehension
Kai Sun et al.
TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (2019)
Large-Scale Video Retrieval Using Image Queries
Andre Araujo et al.
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (2018)
Nonlinear Structural Hashing for Scalable Video Search
Zhixiang Chen et al.
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (2018)
Attentive Moment Retrieval in Videos
Meng Liu et al.
ACM/SIGIR PROCEEDINGS 2018 (2018)
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Shaoqing Ren et al.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2017)
Dense-Captioning Events in Videos
Ranjay Krishna et al.
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV) (2017)
Localizing Moments in Video with Natural Language
Lisa Anne Hendricks et al.
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV) (2017)
Mask R-CNN
Kaiming He et al.
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV) (2017)
TALL: Temporal Activity Localization via Language Query
Jiyang Gao et al.
2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV) (2017)
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
Joao Carreira et al.
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017) (2017)
Learning Spatiotemporal Features with 3D Convolutional Networks
Du Tran et al.
2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV) (2015)