4.6 Article

VMLH: Efficient Video Moment Location via Hashing

Journal

ELECTRONICS
Volume 12, Issue 2, Pages -

Publisher

MDPI
DOI: 10.3390/electronics12020420

Keywords

moment localization; video understanding; hashing; video grounding


Video moment location by query is an active topic in video understanding. In this study, we propose an efficient video moment location method based on hashing. Query sentences and video clips are encoded into hash codes, and the corresponding timestamp is predicted from the similarity between those codes, which improves location efficiency because video clips do not have to be fed into the network at query time.
Video moment location by query is an active topic in video understanding. However, most existing methods ignore the importance of location efficiency in practical application scenarios: the video and the query sentence have to be fed into the network together during retrieval, which leads to low efficiency. To address this issue, we propose efficient video moment location via hashing (VMLH). In the proposed method, query sentences are converted into hash codes and video clips into hash code sets, in a way that preserves the semantic similarity between query sentences and video clips. A location prediction network then predicts the corresponding timestamp from the similarity between hash codes, so the videos do not need to be fed into the network during retrieval and location. Furthermore, unlike existing methods, which require complex interaction and fusion between video and query sentences, VMLH needs only a simple XOR operation between codes to locate the video moment with high efficiency. This paper lays a foundation for fast video clip positioning and makes large-scale video clip positioning feasible in practice. Experimental results on two public datasets demonstrate the effectiveness of the method.
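
The XOR comparison mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 8-bit codes, the function name `hamming_distances`, and the sample values are assumptions made for illustration, and the paper's location prediction network, which turns code similarity into a timestamp, is not reproduced here. The sketch only shows how a query hash code can be compared with a set of precomputed clip hash codes by XOR and a bit count (Hamming distance), without touching the video itself.

```python
import numpy as np


def hamming_distances(query_code: np.ndarray, clip_codes: np.ndarray) -> np.ndarray:
    """Hamming distance between one query code and each clip code.

    query_code: shape (n_bits,), entries 0 or 1.
    clip_codes: shape (n_clips, n_bits), entries 0 or 1.
    Hypothetical helper for illustration, not taken from the paper.
    """
    # XOR marks the bit positions where the codes differ;
    # summing those marks per clip gives the Hamming distance.
    return np.bitwise_xor(clip_codes, query_code).sum(axis=1)


# Made-up 8-bit codes; a real system would use learned, longer codes.
query = np.array([1, 0, 1, 1, 0, 0, 1, 0], dtype=np.uint8)
clips = np.array(
    [
        [0, 1, 0, 0, 1, 1, 0, 1],  # every bit differs from the query
        [1, 0, 1, 1, 0, 0, 1, 1],  # differs in the last bit only
        [1, 0, 0, 1, 0, 1, 1, 0],  # differs in two bits
    ],
    dtype=np.uint8,
)

distances = hamming_distances(query, clips)
best_clip = int(np.argmin(distances))  # index of the most similar clip
print(distances.tolist(), best_clip)
```

Because the clip codes can be computed once and stored, only the query has to be encoded at search time; the comparison itself reduces to bitwise operations, which is what makes this style of matching cheap at scale.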
