4.7 Article

Crowd Counting via Segmentation Guided Attention Networks and Curriculum Loss

Journal

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS
Volume 23, Issue 9, Pages 15233-15243

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TITS.2021.3138896

Keywords

Training; Convolutional neural networks; Image segmentation; Estimation; Neural networks; Kernel; Task analysis; Crowd counting; curriculum loss; Inception-v3; segmentation guided attention networks


Automatic crowd behaviour analysis is a crucial task for intelligent transportation systems, and crowd counting plays a key role in this analysis. Recent advances in deep convolutional neural networks have led to significant progress in crowd counting. This paper benchmarks the baseline Inception-v3 model on commonly used crowd counting datasets and reports surprisingly good results, comparable to or better than most existing models. It then proposes a Segmentation Guided Attention Network (SGANet), with Inception-v3 as the backbone and a novel curriculum loss, which outperforms prior art and attains state-of-the-art performance on multiple datasets.
Automatic crowd behaviour analysis is an important task for intelligent transportation systems, enabling effective flow control and dynamic route planning for different road participants. Crowd counting is one of the keys to automatic crowd behaviour analysis. Crowd counting using deep convolutional neural networks (CNNs) has achieved encouraging progress in recent years. Researchers have devoted much effort to designing various CNN architectures, most of which are based on the pre-trained VGG16 model. Because its expressive capacity is insufficient, the VGG16 backbone is usually followed by another cumbersome network specially designed for good counting performance. Although VGG models have been outperformed by Inception models in image classification tasks, the existing crowd counting networks built with Inception modules still have only a small number of layers with basic types of Inception modules. To fill this gap, we first benchmark the baseline Inception-v3 model on commonly used crowd counting datasets and achieve surprisingly good performance, comparable with or better than most existing crowd counting models. Subsequently, we push the boundary of this disruptive work further by proposing a Segmentation Guided Attention Network (SGANet) with Inception-v3 as the backbone and a novel curriculum loss for crowd counting. We conduct thorough experiments to compare the performance of our SGANet with prior art, and the proposed model achieves state-of-the-art performance with an MAE of 57.6, 6.3 and 87.6 on ShanghaiTechA, ShanghaiTechB and UCF_QNRF, respectively.
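
The abstract reports its results as MAE (mean absolute error), i.e. the average absolute difference between predicted and ground-truth head counts per image. The following is a minimal sketch of that metric only, not the authors' evaluation code; the function name and the sample counts are hypothetical and are not taken from the paper or its datasets.

```python
from typing import Sequence


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average absolute difference between predicted and true counts."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if len(predicted) == 0:
        raise ValueError("at least one image is required")
    total_error = sum(abs(p - a) for p, a in zip(predicted, actual))
    return total_error / len(predicted)


# Hypothetical per-image counts, for illustration only.
predicted_counts = [102.0, 48.5, 310.0]
actual_counts = [100.0, 50.0, 300.0]

mae = mean_absolute_error(predicted_counts, actual_counts)
print(f"MAE = {mae:.1f}")
```

Under this definition, a lower MAE means the predicted counts are closer to the annotated counts on average, which is how the three dataset figures above should be read.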
