Journal
IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS
Volume -, Issue -, Pages -
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TII.2023.3298476
Keywords
Automatic video anomaly detection system; intelligent video surveillance; normality learning; spatial-temporal prototype features; unsupervised learning
This study proposes an appearance-motion prototype network (AMP-net) for detecting anomalous events in surveillance videos. By using external memories to record prototype features and introducing temporal attention to strengthen the representation of dynamics, the proposed method balances effective representation of normal events against accurate detection of anomalies. Experimental results demonstrate that AMP-net achieves performance comparable to state-of-the-art methods on multiple benchmark datasets.
As essential tools for industrial safety protection, automatic video anomaly detection systems (AVADS) are designed to detect anomalous events of concern in surveillance videos. Existing video anomaly detection (VAD) methods do not effectively exploit prototypical appearance and motion features, which leads to poor performance in realistic scenarios. Specifically, they either misreport regular events as anomalies because of insufficient representation power, or miss anomalies because they generalize too well. To address this, we propose an appearance-motion prototype network (AMP-net) that uses external memories to record prototype features and augments the appearance-motion prototypes with spatial-temporal fusion. AMP-net sequentially fuses appearance features from deep to shallow to exploit multiscale spatial context. We also introduce temporal attention to capture important dynamics and strengthen AMP-net's representation of regular motion. The proposed method balances effective representation of normal events against limited generalization to anomalies. Experiments on three benchmark datasets demonstrate that our method accurately detects anomalous events, achieving performance comparable to state-of-the-art methods with frame-level AUCs of 98.7%, 92.4%, and 78.8% on the UCSD Ped2, CUHK Avenue, and ShanghaiTech datasets, respectively. Moreover, we conducted a case study on a self-collected industrial dataset, and the results indicate that AMP-net can cope with complex industrial scenarios and outperform existing methods.
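The abstract does not specify how the external memories are read, so the following is only a minimal sketch of one common scheme for memory-based anomaly detection, not the authors' implementation. It assumes each feature vector is reconstructed as a softmax-weighted mixture of stored prototypes and that the reconstruction error serves as the anomaly score. The names `read_memory` and `anomaly_score`, the cosine-similarity addressing, and the `temperature` parameter are all hypothetical choices made for illustration.

```python
import numpy as np


def read_memory(queries, memory, temperature=1.0):
    """Reconstruct each query as a softmax-weighted mix of memory prototypes.

    queries: (n, d) feature vectors; memory: (m, d) prototype vectors.
    Returns the reconstructions (n, d) and the addressing weights (n, m).
    This addressing scheme is an assumption, not taken from the paper.
    """
    eps = 1e-12  # guards against division by zero for all-zero vectors
    q = queries / (np.linalg.norm(queries, axis=1, keepdims=True) + eps)
    p = memory / (np.linalg.norm(memory, axis=1, keepdims=True) + eps)
    sim = (q @ p.T) / temperature           # scaled cosine similarity, (n, m)
    sim = sim - sim.max(axis=1, keepdims=True)  # stabilise the softmax
    weights = np.exp(sim)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ memory, weights


def anomaly_score(queries, memory, temperature=0.1):
    """Score each query by its L2 reconstruction error against the memory."""
    recon, _ = read_memory(queries, memory, temperature)
    return np.linalg.norm(queries - recon, axis=1)


# Toy usage: two stored "normal" prototypes and two query features.
memory = np.array([[1.0, 0.0], [0.0, 1.0]])
features = np.array([[1.0, 0.0],      # close to a stored prototype
                     [-1.0, -1.0]])   # unlike anything in the memory
scores = anomaly_score(features, memory)
```

Because the reconstruction is restricted to combinations of stored prototypes, a feature far from every prototype cannot be reproduced well and receives a higher score. This is one way to obtain the limited generalization to anomalies that the abstract describes.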