Article

SPADE-E2VID: Spatially-Adaptive Denormalization for Event-Based Video Reconstruction

Journal

IEEE TRANSACTIONS ON IMAGE PROCESSING
Volume 30, Pages 2488-2500

Publisher

IEEE (Institute of Electrical and Electronics Engineers)
DOI: 10.1109/TIP.2021.3052070

Keywords

Image reconstruction; Cameras; Training; Image resolution; Task analysis; Optical losses; Brightness; event camera; sparse image

Funding

  1. National Natural Science Foundation of China [U1764264/61873165]
  2. Shanghai Automotive Industry Science and Technology Development Foundation [1733/1807]


Event-based cameras offer advantages over traditional frame-based cameras, but their output is difficult to use directly because of the unique nature of event sensors. Neural networks have enabled significant advances in event-based image reconstruction, and the new SPADE-E2VID model further improves reconstructed video quality. The model also trains faster and can be trained to reconstruct video without a temporal loss function, demonstrating promising results for event camera technology.
Event-based cameras have several advantages over traditional cameras that capture video as frames: high temporal resolution, high dynamic range, and almost no motion blur. An event sensor produces a stream of events, each reported when the brightness at a pixel changes. This asynchronous, sparse format makes it difficult to apply existing algorithms directly and thus to exploit event camera data. Thanks to developments in neural networks, important advances have been made in event-based image reconstruction. Although these networks achieve accurate reconstructions while preserving most of the properties of event cameras, there is still an initialization period during which the reconstructed frames have low quality. In this work, we present SPADE-E2VID, a neural network model that improves the quality of the early frames of an event-based reconstructed video, as well as the overall contrast. SPADE-E2VID improves the quality of the first reconstructed frames by 15.87% in MSE, 4.15% in SSIM, and 2.5% in LPIPS. In addition, the SPADE layer in our model makes it possible to train the network to reconstruct videos without a temporal loss function. Our model also trains faster: in a many-to-one training style, we avoid running the loss function at each step and instead execute it only once, at the end of each loop. We also carried out experiments with event cameras that do not provide polarity data; our model produces high-quality video reconstructions from non-polarity events at HD resolution (1200 x 800). The video, code, and datasets are available at: https://github.com/RodrigoGantier/SPADE_E2VID.
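The SPADE layer the abstract refers to follows the spatially-adaptive denormalization idea of Park et al. (2019): activations are normalized without learned affine parameters, then modulated by per-pixel scale and bias maps predicted from a conditioning input (here, plausibly the event tensor). The sketch below is a minimal PyTorch illustration of that mechanism; the class name, hidden width, and kernel sizes are assumptions for clarity, not the paper's released code.

```python
# Minimal sketch of a SPADE (spatially-adaptive denormalization) layer.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SPADE(nn.Module):
    """Normalize activations, then modulate them with per-pixel scale and
    bias maps predicted from a conditioning input (e.g., an event tensor)."""

    def __init__(self, feature_channels: int, cond_channels: int, hidden: int = 128):
        super().__init__()
        # Parameter-free normalization; the affine part comes from the condition.
        self.norm = nn.BatchNorm2d(feature_channels, affine=False)
        self.shared = nn.Sequential(
            nn.Conv2d(cond_channels, hidden, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        self.to_gamma = nn.Conv2d(hidden, feature_channels, kernel_size=3, padding=1)
        self.to_beta = nn.Conv2d(hidden, feature_channels, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # Resize the conditioning map to the feature resolution.
        cond = F.interpolate(cond, size=x.shape[-2:], mode="nearest")
        h = self.shared(cond)
        gamma, beta = self.to_gamma(h), self.to_beta(h)
        # Spatially-adaptive denormalization: a per-pixel affine transform.
        return self.norm(x) * (1 + gamma) + beta
```

Because gamma and beta vary per pixel, the layer can re-inject the sparse event signal at every decoder scale, which is consistent with the abstract's claim of better early-frame quality and contrast.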
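The many-to-one training style mentioned above can be illustrated as follows: the recurrent model is unrolled over a sequence of event tensors, but the reconstruction loss is evaluated only once, on the final output. This is a hypothetical sketch; the model(events, state) interface and the names train_step and loss_fn are assumptions, not the authors' actual training code.

```python
# Hypothetical many-to-one training step: unroll the recurrent model over a
# sequence of event tensors; run the loss once, at the end of the loop.
def train_step(model, event_seq, target_frame, loss_fn, optimizer):
    """event_seq: list of [B, C, H, W] event tensors; target_frame: [B, 1, H, W]."""
    optimizer.zero_grad()
    state = None                          # recurrent state, initialized empty
    recon = None
    for events in event_seq:              # unroll the recurrence over the sequence
        recon, state = model(events, state)  # assumed E2VID-style interface
    loss = loss_fn(recon, target_frame)   # single loss at the end of the loop
    loss.backward()
    optimizer.step()
    return loss.item()
```

Skipping the per-step loss evaluation shortens each training iteration, which matches the faster training time the abstract reports.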
