Proceedings Paper

Play and Rewind: Optimizing Binary Representations of Videos by Self-Supervised Temporal Hashing

MM'16: PROCEEDINGS OF THE 2016 ACM MULTIMEDIA CONFERENCE (2016)

Journal

MM'16: PROCEEDINGS OF THE 2016 ACM MULTIMEDIA CONFERENCE

Volume -, Issue -, Pages 781-790

Publisher

ASSOC COMPUTING MACHINERY

DOI: 10.1145/2964284.2964308

Keywords

Temporal Hashing; Binary LSTM; Sequence Learning; Video Retrieval

Categories

Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic

Abstract

We focus on hashing videos into short binary codes for efficient Content-based Video Retrieval (CBVR), which is a fundamental technique that supports access to the ever-growing abundance of videos on the Web. Existing video hash functions are built on three isolated stages: frame pooling, relaxed learning, and binarization. These stages do not adequately exploit the temporal order of video frames in a joint binary optimization model, resulting in severe information loss. In this paper, we propose a novel unsupervised video hashing framework called Self-Supervised Temporal Hashing (SSTH) that is able to capture the temporal nature of videos in an end-to-end learning-to-hash fashion. Specifically, the hash function of SSTH is an encoder RNN equipped with the proposed Binary LSTM (BLSTM) that generates binary codes for videos. The hash function is learned in a self-supervised fashion, where a decoder RNN is proposed to reconstruct the original video frames in both forward and reverse orders. For binary code optimization, we develop a backpropagation rule that tackles the non-differentiability of BLSTM. This rule allows efficient deep network training without suffering from the binarization loss. Through extensive CBVR experiments on two real-world consumer video datasets from YouTube and Flickr, we show that SSTH consistently outperforms state-of-the-art video hashing methods. For example, in terms of mAP@20, SSTH using only 128 bits can still outperform others using 256 bits by at least 9% to 15% on both datasets.
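To make concrete why short binary codes help retrieval, the following is a minimal sketch of Hamming-distance ranking over 128-bit codes. It is not the SSTH model: the codes are produced by thresholding random features, which stands in for the output of a trained encoder, and every function name (`binarize`, `pack_codes`, `hamming_distances`, `retrieve`) is hypothetical and chosen only for illustration.

```python
import numpy as np

def binarize(features):
    """Map real-valued features to {0, 1} bits by thresholding at zero."""
    return (np.asarray(features) > 0).astype(np.uint8)

def pack_codes(bits):
    """Pack an (n, k) array of 0/1 bits into (n, ceil(k / 8)) bytes."""
    return np.packbits(bits, axis=1)

# Lookup table: number of set bits in each possible byte value.
_POPCOUNT = np.array([bin(i).count("1") for i in range(256)], dtype=np.uint8)

def hamming_distances(query, database):
    """Hamming distance from one packed code to every packed database code."""
    xor = np.bitwise_xor(database, query)  # differing bits, byte by byte
    return _POPCOUNT[xor].sum(axis=1, dtype=np.int64)

def retrieve(query, database, top_k=20):
    """Return indices and distances of the top_k codes closest to the query."""
    dists = hamming_distances(query, database)
    order = np.argsort(dists, kind="stable")[:top_k]
    return order, dists[order]

# Assumption: random features stand in for the output of a trained encoder.
rng = np.random.default_rng(0)
db_bits = binarize(rng.standard_normal((1000, 128)))
db_codes = pack_codes(db_bits)  # 128 bits -> 16 bytes per video

# Build a query that differs from database item 42 in exactly three bits.
query_bits = db_bits[42].copy()
query_bits[:3] ^= 1
query_code = pack_codes(query_bits[None, :])[0]

idx, d = retrieve(query_code, db_codes, top_k=20)
print(idx[0], d[0])
```

Each 128-bit code occupies 16 bytes, and a comparison reduces to an XOR followed by a bit count, so ranking a large collection is cheap compared with distance computations over real-valued features. The cutoff of 20 results mirrors the mAP@20 setting mentioned in the abstract.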
