Article

SuperFast: 200x Video Frame Interpolation via Event Camera

Publisher

IEEE Computer Society
DOI: 10.1109/TPAMI.2022.3224051

Keywords

Cameras; Streaming media; Brightness; Interpolation; Task analysis; Lenses; Visualization; Video frame interpolation; event-enhanced; high-speed scenarios; high-speed VFI dataset

Abstract

This paper proposes a Fast-Slow joint synthesis framework, named SuperFast, for event-enhanced high-speed video frame interpolation. It divides the task into two sub-tasks, one for high-speed motion content and the other for relatively slow-motion content, and utilizes a fusion module to generate the final video frame interpolation results. Experimental results show that the proposed framework achieves state-of-the-art 200x video frame interpolation performance in high-speed motion scenarios.
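For concreteness (a back-of-the-envelope check, not code from the paper): 200x interpolation of a 25 FPS input means synthesizing 199 intermediate frames between each pair of boundary frames, yielding 5000 FPS output.

```python
# 200x interpolation arithmetic: timestamps of synthesized frames between
# two boundary frames captured at 25 FPS (an illustrative check only).
input_fps, factor = 25, 200
output_fps = input_fps * factor            # 25 * 200 = 5000 FPS
t0, t1 = 0.0, 1.0 / input_fps              # one input-frame interval (40 ms)
n_intermediate = factor - 1                # 199 new frames per interval
timestamps = [t0 + (t1 - t0) * i / factor for i in range(1, factor)]
assert len(timestamps) == n_intermediate
print(output_fps, n_intermediate, timestamps[:3])
```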
Traditional frame-based video frame interpolation (VFI) methods rely on the linear-motion and brightness-invariance assumptions, which can lead to fatal errors in scenarios with high-speed motion. To tackle this challenge, inspired by the ability of event cameras to asynchronously record brightness changes at each pixel, we propose SuperFast, a Fast-Slow joint synthesis framework for event-enhanced high-speed video frame interpolation, which can generate high frame rate (5000 FPS, 200x faster) video from an input low frame rate (25 FPS) video and the corresponding event stream. In our framework, the task is divided into two sub-tasks, i.e., video frame interpolation for content with and without high-speed motion, which are tackled by two corresponding branches: the fast synthesis pathway and the slow synthesis pathway. The fast synthesis pathway leverages a spiking neural network to encode the input event stream and combines the boundary frames to generate intermediate results through synthesis and refinement, targeting content with high-speed motion. The slow synthesis pathway stacks the two input boundary frames and the event stream to synthesize intermediate results, focusing on relatively slow-motion content. Finally, a fusion module with a comparison loss is utilized to generate the final video frame interpolation results. We also build a hybrid visual acquisition system containing an event camera and a high frame rate camera, and collect the first 5000 FPS High-Speed Event-enhanced Video Frame Interpolation (THUHSEVI) dataset. To evaluate the performance of our proposed framework, we conduct experiments on our THUHSEVI dataset and the existing HS-ERGB dataset. Experimental results demonstrate that our framework achieves state-of-the-art 200x video frame interpolation performance in high-speed motion scenarios.
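The abstract outlines a two-branch structure: a fast pathway for high-speed content, a slow pathway for slow content, and a fusion module that blends their outputs. The following minimal PyTorch sketch illustrates that structure only; the module internals, channel counts, and names (FastPathway, SlowPathway, FusionModule) are assumptions rather than the authors' implementation, and the event stream is represented as a binned tensor standing in for the paper's spiking-neural-network encoding.

```python
# Hypothetical sketch of the Fast-Slow joint synthesis structure described
# in the abstract. All module internals are placeholders, not the paper's design.
import torch
import torch.nn as nn

class FastPathway(nn.Module):
    """Placeholder fast branch: takes the event representation (a binned
    tensor here, in place of the SNN encoding) plus the two boundary
    frames, and synthesizes an intermediate frame."""
    def __init__(self, event_bins=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(event_bins + 6, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1),
        )

    def forward(self, events, frame0, frame1):
        return self.net(torch.cat([events, frame0, frame1], dim=1))

class SlowPathway(nn.Module):
    """Placeholder slow branch: stacks the boundary frames with the event
    representation and synthesizes an intermediate frame."""
    def __init__(self, event_bins=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(event_bins + 6, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1),
        )

    def forward(self, events, frame0, frame1):
        return self.net(torch.cat([events, frame0, frame1], dim=1))

class FusionModule(nn.Module):
    """Predicts a per-pixel blending mask between the two pathway outputs."""
    def __init__(self):
        super().__init__()
        self.mask_net = nn.Sequential(
            nn.Conv2d(6, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, fast_out, slow_out):
        m = self.mask_net(torch.cat([fast_out, slow_out], dim=1))
        return m * fast_out + (1 - m) * slow_out

# One interpolation step: the events between the two boundary frames plus
# the boundary frames go in; one intermediate frame comes out.
fast, slow, fuse = FastPathway(), SlowPathway(), FusionModule()
events = torch.randn(1, 8, 128, 128)   # event stream binned into 8 channels
f0 = torch.randn(1, 3, 128, 128)       # left boundary frame
f1 = torch.randn(1, 3, 128, 128)       # right boundary frame
frame_t = fuse(fast(events, f0, f1), slow(events, f0, f1))
```

In the paper the fusion module is additionally trained with a comparison loss against both pathway outputs; the per-pixel mask above is just one plausible way to realize the blending.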

