Proceedings Paper

Bidirectional Long-Short Term Memory for Video Description

MM'16: PROCEEDINGS OF THE 2016 ACM MULTIMEDIA CONFERENCE (2016)

Journal

MM'16: PROCEEDINGS OF THE 2016 ACM MULTIMEDIA CONFERENCE

Pages 436-440

Publisher

ASSOC COMPUTING MACHINERY

DOI: 10.1145/2964284.2967258

Keywords

Video captioning; bidirectional long-short term memory

Abstract

Video captioning has attracted broad research attention in the multimedia community. However, most existing approaches either ignore temporal information among video frames or employ only local contextual temporal knowledge. In this work, we propose a novel video captioning framework, termed Bidirectional Long-Short Term Memory (BiLSTM), which deeply captures the bidirectional global temporal structure in video. Specifically, we first devise a joint visual modelling approach that encodes video data by combining a forward LSTM pass and a backward LSTM pass with visual features from Convolutional Neural Networks (CNNs). Then, we inject the derived video representation into the subsequent language model for initialization. The benefits are twofold: 1) comprehensively preserving sequential and visual information; and 2) adaptively learning dense visual features and sparse semantic representations for videos and sentences, respectively. We verify the effectiveness of the proposed framework on a commonly used benchmark, the Microsoft Video Description (MSVD) corpus, and the experimental results demonstrate the superiority of the proposed approach over several state-of-the-art methods.
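
The encoding idea in the abstract (a forward LSTM pass, a backward LSTM pass, and CNN features combined into one video representation that initializes the language model) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the abstract does not say how the three streams are merged or pooled, so this sketch concatenates them per frame and mean-pools over time, uses random untrained weights, treats each frame as an already extracted CNN feature vector, and introduces hypothetical helper names (`make_lstm_params`, `encode_video`, `init_decoder_state`).

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_lstm_params(input_dim, hidden_dim, rng):
    """Random (untrained) weights for the input, forget, output and candidate gates."""
    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {g: (mat(hidden_dim, input_dim + hidden_dim), [0.0] * hidden_dim)
            for g in "ifog"}


def affine(W, b, v):
    """Compute W @ v + b for list-based matrices and vectors."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def lstm_step(params, x, h, c):
    """One standard LSTM step on input x with previous state (h, c)."""
    z = x + h  # list concatenation: [x ; h]
    i = [sigmoid(a) for a in affine(*params["i"], z)]
    f = [sigmoid(a) for a in affine(*params["f"], z)]
    o = [sigmoid(a) for a in affine(*params["o"], z)]
    g = [math.tanh(a) for a in affine(*params["g"], z)]
    c_new = [fi * ci + ii * gi for fi, ci, ii, gi in zip(f, c, i, g)]
    h_new = [oi * math.tanh(ci) for oi, ci in zip(o, c_new)]
    return h_new, c_new


def run_lstm(params, frames, hidden_dim):
    """Run an LSTM over a sequence and return the hidden state at every step."""
    h, c = [0.0] * hidden_dim, [0.0] * hidden_dim
    outputs = []
    for x in frames:
        h, c = lstm_step(params, x, h, c)
        outputs.append(h)
    return outputs


def encode_video(frames, fwd_params, bwd_params, hidden_dim):
    """Combine forward states, backward states and CNN features into one vector."""
    forward = run_lstm(fwd_params, frames, hidden_dim)
    # Backward pass: read frames last-to-first, then re-align outputs to frame order.
    backward = run_lstm(bwd_params, frames[::-1], hidden_dim)[::-1]
    # Assumption: merge by per-frame concatenation [forward ; backward ; CNN feature].
    joint = [f + b + x for f, b, x in zip(forward, backward, frames)]
    # Assumption: mean-pool over time to obtain a single video representation.
    return [sum(col) / len(joint) for col in zip(*joint)]


def init_decoder_state(video_vec):
    """Hypothetical injection: squash the video vector into a decoder's initial state."""
    h0 = [math.tanh(v) for v in video_vec]
    c0 = [0.0] * len(video_vec)
    return h0, c0
```

With five frames of 4-dimensional features and a hidden size of 3, `encode_video` returns a 10-dimensional vector (3 forward + 3 backward + 4 CNN dimensions), which `init_decoder_state` turns into the starting hidden and cell states of a caption decoder. A trained system would learn all weights jointly with the language model rather than sampling them at random.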
