4.7 Article

Spatial-Temporal Cascade Autoencoder for Video Anomaly Detection in Crowded Scenes

Journal

IEEE TRANSACTIONS ON MULTIMEDIA
Volume 23, Pages 203-215

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TMM.2020.2984093

Keywords

Feature extraction; Anomaly detection; Trajectory; Three-dimensional displays; Hidden Markov models; Image reconstruction; Two-dimensional displays; video surveillance; spatial-temporal cascade autoencoder; 3D gradient; optical flow; two-stream framework

Funding

  1. National Key Research and Development Program of China [2018YFB1305300]
  2. National Natural Science Foundation of China [61673244, 61703240]
  3. Key Research and Development Program of Shandong Province of China [2019JZZY010130, 2018CXGC0907]

This paper proposes a cuboid-patch-based method for time-efficient anomaly detection and localization in video surveillance, using a spatial-temporal cascade autoencoder to make full use of spatial and temporal cues from video data. The method consists of two main stages defined by two neural networks: a spatial-temporal adversarial autoencoder (ST-AAE) and a spatial-temporal convolutional autoencoder (ST-CAE). Experimental results show that the framework outperforms other state-of-the-art works in this field.

Time-efficient anomaly detection and localization in video surveillance remains challenging due to the complexity of anomalies. In this paper, we propose a cuboid-patch-based method characterized by a cascade of classifiers called a spatial-temporal cascade autoencoder (ST-CaAE), which makes full use of both spatial and temporal cues from video data. The ST-CaAE has two main stages, defined by two proposed neural networks: a spatial-temporal adversarial autoencoder (ST-AAE) and a spatial-temporal convolutional autoencoder (ST-CAE). First, the ST-AAE is used to preliminarily identify anomalous video cuboids and exclude normal cuboids. The key idea underlying the ST-AAE is to obtain a Gaussian model that fits the distribution of the regular data. Then, in the second stage, the ST-CAE classifies the specific abnormal patches in each anomalous cuboid with a reconstruction-error-based strategy that takes advantage of the CAE and skip connections. A two-stream framework fuses the appearance and motion cues to achieve more complete detection results, taking the gradient and optical flow cuboids as inputs for the respective streams. The proposed ST-CaAE is evaluated on three public datasets. The experimental results verify that our framework outperforms other state-of-the-art works.
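
The two-stage cascade described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the ST-AAE and ST-CAE networks are replaced by precomputed latent codes and reconstructions, the Mahalanobis distance is one assumed way to score a cuboid against a fitted Gaussian, and the function names, thresholds, and the logical-OR fusion of the two streams are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_gaussian(latents):
    """Fit a Gaussian (mean, inverse covariance) to latent codes of normal cuboids."""
    mean = latents.mean(axis=0)
    # Small ridge term keeps the covariance invertible (an assumption of this sketch).
    cov = np.cov(latents, rowvar=False) + 1e-6 * np.eye(latents.shape[1])
    return mean, np.linalg.inv(cov)

def mahalanobis(latents, mean, cov_inv):
    """Distance of each latent code from the fitted Gaussian."""
    diff = latents - mean
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))

def cascade_detect(latents, originals, reconstructions,
                   mean, cov_inv, tau_cuboid, tau_patch):
    """Stage 1 screens cuboids; stage 2 scores patches in flagged cuboids only.

    latents:          (n_cuboids, latent_dim) codes from a stage-1 encoder
    originals:        (n_cuboids, n_patches, patch_dim) input patches
    reconstructions:  same shape, output of a stage-2 autoencoder
    Returns a boolean mask of shape (n_cuboids, n_patches).
    """
    # Stage 1: cuboids far from the Gaussian of regular data are kept as candidates.
    flagged = mahalanobis(latents, mean, cov_inv) > tau_cuboid
    # Stage 2: per-patch mean squared reconstruction error.
    errors = np.mean((originals - reconstructions) ** 2, axis=-1)
    return flagged[:, None] & (errors > tau_patch)

def fuse_streams(mask_appearance, mask_motion):
    """Assumed fusion rule: a patch is abnormal if either stream marks it."""
    return mask_appearance | mask_motion
```

In this sketch a patch is reported only when its cuboid fails the stage-1 test and its own reconstruction error is high, so the patch-level decision never has to be trusted for cuboids that stage 1 has already excluded as normal.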
