Article

Context-Unsupervised Adversarial Network for Video Sensors

SENSORS (2022)

Journal

SENSORS

Volume 22, Issue 9

Publisher

MDPI

DOI: 10.3390/s22093171

Keywords

background subtraction; adversarial networks; deep learning; computer vision; video sensors

Categories

Chemistry, Analytical; Engineering, Electrical & Electronic; Instruments & Instrumentation

Funding

Spanish Research Agency (AEI) [PID2020-116907RB-I00]

Summary

Foreground object segmentation is a crucial step for surveillance systems based on video sensor networks. Existing methods use either statistical background modeling or convolutional neural networks (CNNs), but the latter usually require specific training for each scene. This study proposes a method that avoids scene-specific training by using statistical techniques to generate a rough mask and then refining that mask with a network. The results show improved performance compared to non-CNN methods and are among the best for context-unsupervised CNN systems.

Abstract

Foreground object segmentation is a crucial first step for surveillance systems based on networks of video sensors. This problem has been widely explored in the context of dynamic scenes over the last two decades, but open research questions remain due to challenges such as strong shadows, background clutter and illumination changes. After years of solid work based on statistical background pixel modeling, most current proposals use convolutional neural networks (CNNs) either to model the background or to make the foreground/background decision. Although these new techniques achieve outstanding results, they usually require specific training for each scene, which is infeasible if we aim at designing software for embedded video systems and smart cameras. Our approach to the problem does not require specific context or scene training, and thus no manual labeling. We propose a network for a refinement step on top of conventional state-of-the-art background subtraction systems. By using a statistical technique to produce a rough mask, we do not need to train the network for each scene. The proposed method can take advantage of the specificity of the classic techniques, while obtaining the highly accurate segmentation that a deep learning system provides. We also show the advantage of using an adversarial network to improve the generalization ability of the network and produce more consistent results than an equivalent non-adversarial network. The results provided were obtained by training the network on a common database, without fine-tuning for specific scenes. Experiments on the unseen part of the CDNet database yielded an F-score of 0.82, and 0.87 was achieved on the LASIESTA database, which is unrelated to the training one. On this last database, the results outperformed those available in the official table by 8.75%. The results achieved for CDNet are well above those of the methods not based on CNNs and, according to the literature, among the best for context-unsupervised CNN systems.
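To make the idea of a rough statistical mask and of the F-score concrete, here is a minimal sketch in plain Python. It is not the authors' method: the running-average background model, the threshold of 25, the learning rate `alpha`, and the function names `rough_mask`, `update_background` and `f_score` are all assumptions chosen for illustration, and the refinement network described in the abstract is not reproduced here.

```python
from typing import List

Frame = List[List[float]]  # grayscale image as rows of pixel intensities
Mask = List[List[int]]     # 1 = foreground, 0 = background


def rough_mask(background: Frame, frame: Frame, threshold: float = 25.0) -> Mask:
    """Flag pixels that differ from the background model by more than threshold."""
    return [
        [1 if abs(p - b) > threshold else 0 for p, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def update_background(background: Frame, frame: Frame, mask: Mask,
                      alpha: float = 0.05) -> Frame:
    """Running-average update, applied only to pixels judged to be background."""
    return [
        [b if m else (1 - alpha) * b + alpha * p
         for b, p, m in zip(bg_row, frame_row, mask_row)]
        for bg_row, frame_row, mask_row in zip(background, frame, mask)
    ]


def f_score(pred: Mask, truth: Mask) -> float:
    """F-score (harmonic mean of precision and recall) of a binary mask."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy 3x4 scene: a flat background and a frame with a bright object on the right.
background = [[100.0] * 4 for _ in range(3)]
frame = [
    [100.0, 100.0, 200.0, 200.0],
    [100.0, 110.0, 200.0, 100.0],
    [100.0, 100.0, 100.0, 100.0],
]
truth = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]

mask = rough_mask(background, frame)
new_background = update_background(background, frame, mask, alpha=0.5)
score = f_score(mask, truth)
```

In this toy case the rough mask misses one object pixel, so recall is below 1 and the F-score falls short of a perfect 1.0. That kind of error is what a learned refinement step placed on top of the statistical mask is meant to reduce.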
