Journal
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
Volume 31, Issue 4, Pages 1283-1295
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TCSVT.2020.2984783
Keywords
Optical distortion; Synthesizers; Adaptive optics; Optical imaging; Predictive models; Optical network units; Optical computing; Video frame prediction; deep learning; multi-branch mask network; multi-frame prediction; video anomaly detection
Funding
- National Natural Science Foundation of China [61751308, 61773311, 61603057]
- Fundamental Research Funds for the Central Universities [300102320202]
Abstract
Future frame prediction in video is one of the most important problems in computer vision and is useful for a range of practical applications, such as intention prediction and video anomaly detection. However, this task is challenging because of the complex and dynamic evolution of scenes. The difficulty of video frame prediction lies in modeling the inherent spatio-temporal correlation between frames and in posing an adaptive and flexible framework for large motion changes or appearance variations. In this paper, we construct a deep multi-branch mask network (DMMNet) that adaptively fuses the advantages of optical flow warping and RGB pixel synthesis, i.e., the two common kinds of approaches to this task. In DMMNet, we add a mask layer in each branch to adaptively adjust the magnitude range of the estimated optical flow and the weights of the frames predicted by optical flow warping and RGB pixel synthesis, respectively. In other words, we provide a more flexible masking network for motion and appearance fusion in video frame prediction. Exhaustive experiments on the Caltech pedestrian and UCF101 datasets show that the proposed model obtains favorable video frame prediction performance compared with state-of-the-art methods. In addition, we apply our model to the video anomaly detection problem, and its superiority is verified by experiments on the UCSD dataset.
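The core fusion idea described in the abstract can be illustrated with a minimal sketch: one branch warps the previous frame by an estimated optical flow, another synthesizes RGB pixels directly, and a per-pixel soft mask blends the two predictions. The function names, nearest-neighbor sampling, and array shapes below are illustrative assumptions, not the authors' actual DMMNet implementation (which would use differentiable bilinear warping inside a deep network).

```python
import numpy as np

def warp_frame(frame, flow):
    """Backward-warp `frame` (H, W, C) by a flow field (H, W, 2).

    Nearest-neighbor sampling is used here for simplicity; a trainable
    warping branch would use differentiable bilinear sampling instead.
    """
    h, w = frame.shape[:2]
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Source coordinates: where each output pixel samples from,
    # clipped to the image border.
    src_y = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, h - 1)
    src_x = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, w - 1)
    return frame[src_y, src_x]

def fuse_predictions(warped, synthesized, mask):
    """Blend the two branch outputs with a per-pixel soft mask in [0, 1].

    mask -> 1 trusts the flow-warped prediction; mask -> 0 trusts the
    directly synthesized RGB prediction.
    """
    return mask * warped + (1.0 - mask) * synthesized
```

In the full model the mask itself is predicted by the network, so the fusion weights adapt per pixel to large motion changes (favoring warping) or appearance variations (favoring synthesis).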