4.7 Article

A Perspective-Embedded Scale-Selection Network for Crowd Counting in Public Transportation

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TITS.2023.3328000

Keywords

Feature extraction; Convolution; Kernel; Fuses; Estimation; Decoding; Training; Crowd counting; multi-column network; perspective analysis; dilated convolution

This paper proposes a novel perspective-embedded scale-selection multi-column network, PESSNet, for crowd counting and high-quality density map generation in congested urban transport systems. Experimental results indicate that PESSNet achieves reliable recognition performance and high robustness across different crowd counting tasks.
Crowd counting in congested urban transport systems is a highly challenging task for computer vision and deep learning because of mutual occlusion, perspective change, and large scale variations. In this paper, a novel perspective-embedded scale-selection multi-column network named PESSNet is proposed for crowd counting and high-quality density map generation. The proposed method aligns its branches to different scales by giving them different receptive fields, and uses perspective parameters to adjust the sensitivity of each branch to different perspective areas in the scene. Specifically, PESSNet consists of four key components: 1) a feature pyramid network (FPN) that fuses multi-stage features extracted from the backbone network; 2) a scale-selection dilated layer (SSDL) that extracts features using different dilated convolution kernels for each stage; 3) a perspective-embedded fusion layer (PEFL) that encodes the spatial perspective relationships across all feature levels into the network and provides a more effective fine-grained weight map; and 4) a density map generator (DMG) that employs a deconvolution layer as a decoder to generate high-quality density maps. Together, these strategies maximize the ability of the multi-column network to extract features of instances at various scales. Extensive experiments on seven crowd counting benchmark datasets (JHU-CROWD, ShanghaiTech, UCF-QNRF, ShanghaiTechRGBD, WorldEXPO'10, TRANCOS, and NWPU-Crowd) indicate that PESSNet achieves reliable recognition performance and high robustness across different crowd counting tasks.
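
To make the link between dilation and receptive field concrete, the following is a minimal sketch using the standard dilated-convolution relation: a kernel of size k with dilation d spans k + (k - 1)(d - 1) positions along one axis. It is not the PESSNet implementation. The branch names, the dilation rates, and the choice of two stacked 3x3 layers are hypothetical, since the abstract does not state the actual configuration.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered by one dilated kernel along a single axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def stacked_receptive_field(layers):
    """Receptive field of stride-1 convolutions applied in sequence.

    `layers` is a list of (kernel_size, dilation) pairs.
    """
    field = 1
    for kernel_size, dilation in layers:
        # Each stride-1 layer widens the field by (effective size - 1).
        field += effective_kernel_size(kernel_size, dilation) - 1
    return field


# Hypothetical dilation rates for three branches (assumption, not from the paper).
branch_dilations = {"branch_a": 1, "branch_b": 2, "branch_c": 3}

for name, dilation in branch_dilations.items():
    single = effective_kernel_size(3, dilation)
    stacked = stacked_receptive_field([(3, dilation)] * 2)
    print(f"{name}: dilation={dilation}, one layer spans {single}, two layers span {stacked}")
```

Under these assumed rates, the branches cover progressively larger regions with the same number of weights, which is the general property a multi-column design relies on to respond to heads of different apparent sizes.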
