4.7 Article

Retrieving Object Motions From Coded Shutter Snapshot in Dark Environment

Journal

IEEE TRANSACTIONS ON IMAGE PROCESSING
Volume 32, Issue -, Pages 3281-3294

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TIP.2023.3280010

Keywords

Object detection; Imaging; Image reconstruction; Image coding; Proposals; Feature extraction; Photography; image capture; deep learning; video surveillance

Video object detection has seen significant progress in recent decades, but existing detectors struggle with low lighting conditions and motion blur. To address this, we propose a method that multiplexes frame sequences into one snapshot and extracts motion cues for trajectory retrieval. By incorporating a programmable shutter and using a deep network called DECENT, we can effectively retrieve bounding boxes from blurred images of dynamic scenes. We generate quasi-real data for network learning, which allows for high generalization on real dark videos. This approach offers advantages of low bandwidth, low cost, compact setup, and high accuracy, and has been experimentally validated for night surveillance.
Video object detection is a widely studied topic and has made significant progress in the past decades. However, feature extraction and computation in existing video object detectors require decent imaging quality and the absence of severe motion blur. In extremely dark scenes, limited sensor sensitivity forces a trade-off between signal-to-noise ratio and motion blur, so performance deteriorates either way. To address this issue, we propose to temporally multiplex a frame sequence into one snapshot and extract the cues characterizing object motion for trajectory retrieval. For effective encoding, we build a prototype for encoded capture by mounting a highly compatible programmable shutter. Correspondingly, for decoding, we design an end-to-end deep network called detection from coded snapshot (DECENT) to retrieve sequential bounding boxes from the coded blurry measurements of dynamic scenes. For effective network learning, we generate quasi-real data by incorporating physically-driven noise into the temporally coded imaging model, which circumvents the unavailability of training data and generalizes well to real dark videos. The approach offers multiple advantages, including low bandwidth, low cost, compact setup, and high accuracy. The effectiveness of the proposed approach is experimentally validated under low illumination, and it provides a feasible way for night surveillance.
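
The temporally coded imaging model mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `coded_snapshot`, the parameters `photons_per_unit` and `read_noise_std`, and the choice of Poisson shot noise plus Gaussian read noise are assumptions made here for illustration, since the abstract does not specify the noise model. The sketch sums only the frames captured while a binary shutter code is open, then adds noise to mimic a low-light measurement.

```python
import numpy as np

def coded_snapshot(frames, code, photons_per_unit=20.0, read_noise_std=2.0, rng=None):
    """Multiplex a frame sequence into one coded snapshot (illustrative only).

    frames: array of shape (T, H, W), scene radiance per time slot, values >= 0.
    code:   array of shape (T,), 1 = shutter open, 0 = shutter closed.
    Returns (clean, noisy), both of shape (H, W).
    """
    rng = np.random.default_rng() if rng is None else rng
    frames = np.asarray(frames, dtype=np.float64)
    code = np.asarray(code, dtype=np.float64)
    if frames.shape[0] != code.shape[0]:
        raise ValueError("code length must match the number of frames")

    # Weighted temporal sum: frames with a closed shutter contribute nothing.
    clean = np.tensordot(code, frames, axes=(0, 0))

    # Assumed noise model: Poisson shot noise on the expected photon count,
    # plus zero-mean Gaussian read noise (in photon units).
    expected_photons = clean * photons_per_unit
    shot = rng.poisson(expected_photons).astype(np.float64)
    read = rng.normal(0.0, read_noise_std, size=clean.shape)
    noisy = (shot + read) / photons_per_unit
    return clean, noisy

# A bright dot moving along one row; the shutter is closed during slot 1.
frames = np.zeros((4, 3, 3))
for t in range(4):
    frames[t, 1, t % 3] = 1.0
clean, noisy = coded_snapshot(frames, [1, 0, 1, 1], rng=np.random.default_rng(0))
```

In the clean snapshot, the position visited while the shutter was closed stays dark, so the blur trace carries a temporal signature of the motion. This is the kind of cue a decoder such as DECENT would have to exploit, though how it does so is not described in the abstract.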
