
Embedding Perspective Analysis Into Multi-Column Convolutional Neural Network for Crowd Counting

IEEE TRANSACTIONS ON IMAGE PROCESSING (2021)

Journal

IEEE TRANSACTIONS ON IMAGE PROCESSING

Volume 30, Pages 1395-1407

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TIP.2020.3043122

Keywords

Convolution; Estimation; Transforms; Kernel; Training; Standards; Smoothing methods; Crowd counting; multi-column network; perspective analysis; transform dilated convolution

Categories

Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic

Funding

Italy-China Collaboration Project TALENT [2018YFE0118400]
National Natural Science Foundation of China [61620106009, 61772494, 61931008, U1636214, 61836002, 61976069]
Key Research Program of Frontier Sciences, CAS [QYZDJ-SSW-SYS013]
Youth Innovation Promotion Association CAS
Fundamental Research Funds for Central Universities

AI Summary

The study presents a simple yet effective multi-column network that integrates a perspective analysis method with the counting network, so that crowds can be counted efficiently in scenes with widely varying scales. Sharing parameters across branches and introducing a transform dilated convolution improve accuracy, and the method achieves state-of-the-art performance on multiple datasets.

Abstract

Crowd counting is challenging for deep networks for several reasons. For instance, the networks cannot efficiently analyze the perspective information of arbitrary scenes, and they are inherently inefficient at handling scale variations. In this work, we deliver a simple yet efficient multi-column network that integrates a perspective analysis method with the counting network. The proposed method explicitly extracts the perspective information and uses it to drive the counting network's analysis of the scene. More concretely, we derive the perspective information from the estimated density maps and quantize the perspective space into several separate scenes. We then embed the perspective analysis into the multi-column framework through a recurrent connection. As a result, the proposed network efficiently matches instances at various scales with different receptive fields. Second, we share the parameters of the branches that have different receptive fields. This strategy drives the convolutional kernels to be sensitive to instances at various scales. Furthermore, to improve the accuracy of the column with a large receptive field, we propose a transform dilated convolution. The transform dilated convolution breaks the fixed sampling structure of the deep network. Moreover, it requires no extra parameters or training, and its offsets are constrained to a local region, a design intended for congested scenes. The proposed method achieves state-of-the-art performance on five datasets (ShanghaiTech, UCF_CC_50, WorldEXPO'10, UCSD, and TRANCOS).
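To make the parameter-sharing idea more concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It applies one shared 3x3 kernel at several dilation rates (one "column" per rate) and then selects, per pixel, the column indicated by a quantized perspective map. Everything here is an assumption made for illustration: the function names (`dilated_conv2d`, `multi_column`, `quantize`, `fuse`), the dilation rates 1, 2, 3, the hard per-pixel selection rule, and the hand-made perspective map, which the paper instead derives from estimated density maps through a recurrent connection. The transform dilated convolution is not modeled.

```python
# Illustrative sketch only: names, dilation rates, and the fusion rule are
# assumptions, not the method described in the paper.

def dilated_conv2d(image, kernel, dilation):
    """Same-size 2D correlation with zero padding and a dilated k x k kernel."""
    h, w = len(image), len(image[0])
    k = len(kernel)              # kernel is k x k with k odd
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    yy = y + (i - r) * dilation   # dilated sampling row
                    xx = x + (j - r) * dilation   # dilated sampling column
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[i][j] * image[yy][xx]
            out[y][x] = acc
    return out


def multi_column(image, shared_kernel, dilations=(1, 2, 3)):
    """Apply ONE shared kernel at several dilation rates, one column per rate."""
    return [dilated_conv2d(image, shared_kernel, d) for d in dilations]


def quantize(perspective, num_bins):
    """Map perspective values in [0, 1] to integer bins 0..num_bins-1."""
    return [[min(int(p * num_bins), num_bins - 1) for p in row]
            for row in perspective]


def fuse(columns, bins):
    """Per pixel, take the output of the column whose index equals the bin."""
    h, w = len(bins), len(bins[0])
    return [[columns[bins[y][x]][y][x] for x in range(w)] for y in range(h)]


# Usage on a toy 5x5 input with a single impulse at the center.
image = [[0.0] * 5 for _ in range(5)]
image[2][2] = 1.0
kernel = [[1.0 / 9.0] * 3 for _ in range(3)]        # the shared weights

cols = multi_column(image, kernel)                   # three columns
# Hypothetical perspective map: 0 at the top row, 1 at the bottom row.
perspective = [[y / 4.0 for _ in range(5)] for y in range(5)]
bins = quantize(perspective, num_bins=3)
density = fuse(cols, bins)
count = sum(map(sum, density))                       # count = sum of density
```

Because the three columns reuse the same nine weights, adding a column with a larger receptive field adds no parameters; only the sampling positions change. A learned or soft weighting between columns would be an equally plausible fusion rule under these assumptions.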
