Article

A Deep Learning-Based Efficient Firearms Monitoring Technique for Building Secure Smart Cities

Journal

IEEE ACCESS
Volume 11, Pages 37515-37524

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2023.3266514

Keywords

Feature extraction; Deep learning; Smart cities; Law enforcement; Videos; Surveillance; Computer vision; Ensemble; Firearms

Violence, especially violence involving firearms, is a shameful part of our society and results in the loss of innocent lives. This paper uses deep learning techniques to detect guns and human faces, giving law enforcement quick intelligence and supporting preventive measures. Among the detection techniques and ensemble schemes evaluated, the Weighted Box Fusion-based ensemble performs best at identifying firearms and human faces, and the approach is also applicable to detecting gun-related content in social media videos. Testing on unseen images and movie clips, together with the comparative results, indicates that the ensembles consistently improve over the primary models.
Violence, in any form, is a disgrace to our civilized world. Nevertheless, even in modern times, violence remains an integral part of our society and costs many innocent lives. One of the conventional means of violence is the use of a firearm. Firearm-related deaths are currently a global phenomenon; they are a threat to society and a challenge to law enforcement agencies. A significant portion of such crimes happens in semi-urban areas or cities. Governments and private organizations today use CCTV-based surveillance extensively for prevention and monitoring. However, human-based monitoring requires a significant number of person-hours and is prone to mistakes. Automated smart surveillance for violent activities, on the other hand, is better suited to scale and reliability. The paper's main focus is to show that deep learning-based techniques can be used in combination to detect firearms (particularly guns). It uses different detection techniques, such as Faster Region-Based Convolutional Neural Networks (Faster R-CNN) and the latest EfficientDet-based architectures, to detect guns and human faces. An ensemble (stacked) scheme improves the detection of human faces and guns at the post-processing level using Non-Maximum Suppression, Non-Maximum Weighted, and Weighted Box Fusion techniques. The paper empirically compares the results of the various detection techniques and their ensembles. Such detection helps the police gather quick intelligence about an incident and take preventive measures at the earliest. The same technique can also be used to detect gun-based content in social media videos. Here, the Weighted Box Fusion-based ensemble detection scheme yields mean average precisions of 77.02%, 16.40%, and 29.73% for mAP0.5, mAP0.75, and mAP[0.50:0.95], respectively, the best performance among all the alternatives tested. The model has been rigorously tested with previously unseen test images and movie clips. The ensemble schemes give satisfactory results and consistently improve over the primary models.
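
The post-processing ensemble described in the abstract combines overlapping boxes from several detectors. The following is a minimal sketch of that idea, not the authors' implementation: it contrasts plain Non-Maximum Suppression with a simplified, single-class form of Weighted Box Fusion. The function names (`iou`, `nms`, `fuse`, `wbf`), the (x1, y1, x2, y2) box format, and the 0.5 IoU threshold are assumptions made for illustration. The sketch also assumes positive confidence scores and omits per-model weights and the rescaling of fused confidence by the number of contributing models.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)
Det = Tuple[Box, float]                  # (box, confidence score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def nms(dets: List[Det], thr: float = 0.5) -> List[Det]:
    """Non-Maximum Suppression: keep the highest-scoring box in each
    overlapping group and discard the others."""
    kept: List[Det] = []
    for det in sorted(dets, key=lambda d: d[1], reverse=True):
        if all(iou(det[0], k[0]) < thr for k in kept):
            kept.append(det)
    return kept


def fuse(cluster: List[Det]) -> Det:
    """Confidence-weighted average of the boxes in one cluster.
    The fused score is the mean confidence of the cluster."""
    total = sum(score for _, score in cluster)
    box = tuple(sum(b[i] * s for b, s in cluster) / total for i in range(4))
    return box, total / len(cluster)


def wbf(dets: List[Det], thr: float = 0.5) -> List[Det]:
    """Simplified Weighted Box Fusion: boxes that overlap a cluster's
    current fused box are merged into it instead of being discarded."""
    clusters: List[List[Det]] = []
    for det in sorted(dets, key=lambda d: d[1], reverse=True):
        for cluster in clusters:
            if iou(det[0], fuse(cluster)[0]) >= thr:
                cluster.append(det)
                break
        else:
            # No existing cluster overlaps enough: start a new one.
            clusters.append([det])
    return [fuse(c) for c in clusters]
```

Given the pooled detections of several models for one class, `nms` returns a subset of the input boxes, whereas `wbf` returns new boxes whose coordinates are averaged over every overlapping prediction. This is one plausible reason a fusion-based ensemble can localize objects more tightly than suppression alone.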
