4.7 Article

Local Correspondence Network for Weakly Supervised Temporal Sentence Grounding

Journal

IEEE TRANSACTIONS ON IMAGE PROCESSING
Volume 30, Issue -, Pages 3252-3262

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TIP.2021.3058614

Keywords

Grounding; Annotations; Two dimensional displays; Training; Feature extraction; Computational modeling; Task analysis; Weakly supervised; temporal sentence grounding

Funding

  1. National Key Research and Development Program [2018YFB0804204]
  2. Strategic Priority Research Program of Chinese Academy of Sciences [XDC02050500]
  3. National Natural Science Foundation of China [62022078, 62021001]
  4. Youth Innovation Promotion Association CAS [2018166]
  5. Open Project Program of the National Laboratory of Pattern Recognition (NLPR) [202000019]

LCNet utilizes hierarchical representation of video and text features and introduces a self-supervised cycle-consistent loss to effectively learn the matching relationships between video and text, achieving superior performance compared to existing weakly supervised methods.
Weakly supervised temporal sentence grounding is more scalable and practical than fully supervised methods in real-world application scenarios. However, most existing methods cannot model fine-grained local video-text correspondences well and lack effective supervision for correspondence learning, and therefore yield unsatisfactory performance. To address these issues, we propose an end-to-end Local Correspondence Network (LCNet) for weakly supervised temporal sentence grounding. The proposed LCNet has two main merits. First, we represent video and text features hierarchically to model fine-grained video-text correspondences. Second, we design a self-supervised cycle-consistent loss as learning guidance for video-text matching. To the best of our knowledge, this is the first work to fully explore the fine-grained correspondences between video and text for temporal sentence grounding by using self-supervised learning. Extensive experimental results on two benchmark datasets demonstrate that the proposed LCNet significantly outperforms existing weakly supervised methods.
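
The abstract does not spell out the cycle-consistent loss, so the following is only a minimal sketch of one common way to write a cycle-consistency objective between two feature sequences, and not necessarily the formulation used in LCNet. Each word feature softly attends over the clip features, the resulting soft clip feature attends back over the words, and the loss penalizes cycles that do not return to the starting word. The function name `cycle_consistency_loss`, the `temperature` parameter, and the use of cosine similarity are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def normalize(vec):
    """Scale a vector to unit L2 norm (assumes it is non-zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cycle_consistency_loss(words, clips, temperature=0.1):
    """Hypothetical word -> clip -> word cycle loss (illustrative only).

    words: list of n word feature vectors, each of length d
    clips: list of m clip feature vectors, each of length d
    Returns the mean negative log-probability of cycling back
    to the word the cycle started from.
    """
    words = [normalize(w) for w in words]
    clips = [normalize(c) for c in clips]
    dim = len(words[0])
    loss = 0.0
    for i, w in enumerate(words):
        # Forward step: word i softly attends over all clips.
        attn = softmax([dot(w, c) / temperature for c in clips])
        soft_clip = [
            sum(a * c[k] for a, c in zip(attn, clips)) for k in range(dim)
        ]
        # Backward step: the soft clip feature attends back over all words.
        back = softmax([dot(soft_clip, u) / temperature for u in words])
        # The cycle is consistent when it lands on the starting word i.
        loss += -math.log(back[i] + 1e-12)
    return loss / len(words)
```

Under these assumptions the loss is close to zero when every word has a distinct, well-matched clip, and it approaches log(n) when the n words are indistinguishable, which is why such an objective can act as supervision without temporal annotations.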
