4.7 Article

Fine-Grained Crowd Counting

Journal

IEEE TRANSACTIONS ON IMAGE PROCESSING
Volume 30, Issue -, Pages 2114-2126

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TIP.2021.3049938

Keywords

Task analysis; Estimation; Feature extraction; Image segmentation; Context modeling; Adaptation models; Surveillance; Crowd counting; fine-grained crowd counting

Funding

  1. Research Grants Council of the Hong Kong Special Administrative Region, China [T32-101/15-R, CityU 11212518]

This article introduces fine-grained crowd counting, which divides a crowd into categories and counts the individuals in each one, a formulation aimed at practical applications. The authors construct a new fine-grained counting dataset and propose a two-branch architecture with two refinement strategies that improve prediction accuracy.
Current crowd counting algorithms estimate only the total number of people in an image and therefore lack low-level fine-grained information about the crowd. For many practical applications, the total number of people in an image is not as useful as the number of people in each sub-category. For example, knowing the number of people waiting in line or browsing can help retail stores; knowing the number of people standing or sitting can help restaurants and cafeterias; knowing the number of violent and non-violent people can help police with crowd management. In this article, we propose fine-grained crowd counting, which differentiates a crowd into categories based on the low-level behavior attributes of the individuals (e.g., standing/sitting or violent behavior) and then counts the number of people in each category. To enable research in this area, we construct a new dataset of four real-world fine-grained counting tasks: traveling direction on a sidewalk, standing or sitting, waiting in line or not, and exhibiting violent behavior or not. Since the appearance features of different crowd categories are similar, the challenge of fine-grained crowd counting is to use contextual information effectively to distinguish between categories. We propose a two-branch architecture consisting of a density map estimation branch and a semantic segmentation branch, together with two refinement strategies for improving the predictions of the two branches. First, to encode contextual information, we propose feature propagation guided by the density map prediction, which eliminates the effect of background features during propagation. Second, we propose a complementary attention model to share information between the two branches. Experimental results confirm the effectiveness of our method.
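
The abstract does not say how the outputs of the two branches are combined into per-category counts. One plausible reading, offered here only as a minimal sketch and not as the authors' implementation, is to weight the predicted density map by the per-pixel category probabilities from the segmentation branch and sum each weighted map. The function name `fine_grained_counts`, the array shapes, and the softmax fusion are all assumptions made for illustration.

```python
import numpy as np


def fine_grained_counts(density, seg_logits):
    """Split a crowd density map into per-category counts (illustrative only).

    density:    (H, W) non-negative density map; its sum is the total count.
    seg_logits: (K, H, W) per-pixel scores for K crowd categories.
    Returns a length-K array of estimated counts, one per category.

    Assumption: categories are fused by a softmax over the K axis; the
    paper's actual fusion step may differ.
    """
    density = np.asarray(density, dtype=np.float64)
    seg_logits = np.asarray(seg_logits, dtype=np.float64)

    # Numerically stable softmax over the category axis.
    shifted = seg_logits - seg_logits.max(axis=0, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=0, keepdims=True)

    # Weight the density at every pixel by each category's probability,
    # then integrate each weighted map to get that category's count.
    per_category_density = probs * density[None, :, :]
    return per_category_density.sum(axis=(1, 2))


if __name__ == "__main__":
    # Toy scene: density mass of 2 people on the left, 1 on the right.
    density = np.zeros((4, 4))
    density[0, 0] = 2.0
    density[3, 3] = 1.0

    # Category 0 dominates the left half, category 1 the right half.
    logits = np.zeros((2, 4, 4))
    logits[0, :, :2], logits[1, :, :2] = 50.0, -50.0
    logits[0, :, 2:], logits[1, :, 2:] = -50.0, 50.0

    print(fine_grained_counts(density, logits))
```

Because the softmax probabilities sum to one at every pixel, the per-category counts in this sketch always add up to the total count implied by the density map, which is a convenient sanity check for any fusion scheme of this kind.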
