Article

Pixel-Wise Crowd Understanding via Synthetic Data

INTERNATIONAL JOURNAL OF COMPUTER VISION (2021)

Journal

INTERNATIONAL JOURNAL OF COMPUTER VISION

Volume 129, Issue 1, Pages -

Publisher

SPRINGER

DOI: 10.1007/s11263-020-01365-4

Keywords

Crowd analysis; Pixel-wise understanding; Crowd counting; Crowd segmentation; Synthetic data generation

Category

Computer Science, Artificial Intelligence

Funding

National Key R&D Program of China [2017YFB1002202]
National Natural Science Foundation of China [U1864204, 61773316, 61632018, 61825603]

Abstract

This paper explores crowd analysis using computer vision techniques, focusing on pixel-wise crowd understanding. By developing a synthetic dataset called GCC Dataset and proposing two methods to improve crowd understanding, the study aims to achieve better performance in real-world scenarios.

Crowd analysis via computer vision is an important topic in video surveillance, with widespread applications including crowd monitoring, public safety, and space design. Pixel-wise crowd understanding is the most fundamental task in crowd analysis because it yields finer results for video sequences or still images than other analysis tasks. Unfortunately, pixel-level understanding requires a large amount of labeled training data. Annotation is expensive, so current crowd datasets are small, and as a result most algorithms suffer from over-fitting to varying degrees. In this paper, taking crowd counting and segmentation as examples of pixel-wise crowd understanding, we attempt to remedy these problems from two aspects, namely data and methodology. First, we develop a free data collector and labeler to generate synthetic, labeled crowd scenes in a computer game, Grand Theft Auto V, and use it to construct a large-scale, diverse synthetic crowd dataset named the GCC Dataset. Second, we propose two simple methods that exploit the synthetic data to improve crowd understanding: (1) supervised crowd understanding: pre-train a crowd analysis model on the synthetic data, then fine-tune it using the real data and labels, which makes the model perform better in the real world; (2) crowd understanding via domain adaptation: translate the synthetic data to photo-realistic images, then train the model on the translated data and labels, so that the trained model works well in real crowd scenes. Extensive experiments verify that the supervised method outperforms the state of the art on four real datasets: UCF_CC_50, UCF-QNRF, and Shanghai Tech Part A and Part B. These results show the effectiveness and value of the synthetic GCC data for pixel-wise crowd understanding.
The data collection and labeling tools, the proposed synthetic dataset, and the source code for the counting models are available at.
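
The pre-train then fine-tune strategy described in method (1) can be illustrated with a deliberately tiny sketch. This is not the authors' code: a one-dimensional linear model stands in for a crowd counting network, and the `synthetic` and `real` data, the `train` and `mse` helpers, and all hyperparameters are hypothetical choices made only to show the two-stage training flow in plain Python.

```python
# Toy illustration only: a 1-D linear model y = w * x + b stands in for a
# crowd analysis network. All data and hyperparameters are hypothetical.

def mse(params, data):
    """Mean squared error of the linear model on (x, y) pairs."""
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def train(params, data, lr, epochs):
    """Full-batch gradient descent on the MSE, starting from `params`."""
    w, b = params
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Large "synthetic" set: plentiful, but drawn from a slightly shifted domain.
synthetic = [(i / 100, 2.0 * (i / 100) + 1.0) for i in range(100)]
# Small "real" set: the target domain, with only a few labeled samples.
real = [(x, 2.2 * x + 0.8) for x in (0.1, 0.4, 0.7, 0.9)]

# Stage 1: pre-train on the synthetic data.
pretrained = train((0.0, 0.0), synthetic, lr=0.1, epochs=500)
# Stage 2: fine-tune on the real data, starting from the pre-trained weights.
finetuned = train(pretrained, real, lr=0.1, epochs=50)
# Baseline: same fine-tuning budget on the real data, but from scratch.
scratch = train((0.0, 0.0), real, lr=0.1, epochs=50)

print("pre-trained only, error on real data:", mse(pretrained, real))
print("fine-tuned, error on real data:      ", mse(finetuned, real))
print("from scratch, error on real data:    ", mse(scratch, real))
```

In this toy setting, fine-tuning lowers the error on the real samples relative to the pre-trained model, and it does better than training from scratch with the same small budget. It says nothing about the magnitude of the gains reported in the paper, which depend on the actual networks and datasets.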
