Article

Multi-scale dilated convolution of feature Fusion Network for Crowd counting

MULTIMEDIA TOOLS AND APPLICATIONS (2022)

Journal

MULTIMEDIA TOOLS AND APPLICATIONS

Volume 81, Issue 26, Pages 37939-37952

Publisher

SPRINGER

DOI: 10.1007/s11042-022-13130-5

Keywords

Crowd counting; Convolution neural network; Dilated convolution; Feature fusion

Categories

Computer Science, Information Systems; Computer Science, Software Engineering; Computer Science, Theory & Methods; Engineering, Electrical & Electronic

Funding

Natural Science Foundation of Shandong Province [ZR2019MF050]
Shandong Province colleges and universities youth innovation technology plan innovation team project [2020KJN011]

Summary

This paper proposes a CNN-based crowd counting model that addresses variability in head size. The method fuses multi-scale high-level features to improve counting performance, and experiments on two public datasets demonstrate its effectiveness.

Abstract

Crowd counting has long been a challenging task because of perspective distortion and variability in head size. Previous methods either ignore the multi-scale information in images or simply use convolutions with different kernel sizes to extract multi-scale features, so the multi-scale features they extract are incomplete. In this paper, we propose a crowd counting model called Multi-scale Dilated Convolution of Feature Fusion Network (MsDFNet), based on a convolutional neural network (CNN). MsDFNet follows the density map regression approach: the density map is predicted with parameters learned by the CNN. The proposed network has three main components: a CNN that extracts low-level features, a multi-scale dilated convolution module with multi-column feature fusion blocks, and a density map regression module. Multi-scale dilated convolutions extract multi-scale high-level features, and the features extracted by the different columns are then fused. Combining the multi-scale dilated convolution module with the multi-column feature fusion block extracts more complete multi-scale features and improves counting of small-sized targets. Experiments show that fusing multi-scale context feature information effectively addresses the problem of varying head sizes in images. We demonstrate the effectiveness of our method on two public datasets (the ShanghaiTech dataset and the UCF_CC_50 dataset). Compared with previous state-of-the-art crowd counting algorithms in terms of MAE (Mean Absolute Error) and MSE (Mean Square Error), our method significantly improves performance, especially when head sizes vary. On the UCF_CC_50 dataset, our method reduces MAE by 28.6 compared with the previous state-of-the-art method (lower MAE is better).
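The ingredients named in the abstract can be illustrated with a small pure-Python sketch. It is not the MsDFNet architecture, which the abstract does not specify in detail. The fixed box kernel, the dilation rates (1, 2, 3), the averaging rule in `fuse_columns`, and all function names are assumptions made for illustration; the real model learns its kernels. Reading the count as the sum of the density map is the usual convention in density-map regression. The abstract gives no formula for MSE; many crowd counting papers report the square root of the mean squared error under that name, so the `root` flag below exposes both variants.

```python
import math

def dilated_conv2d(image, kernel, dilation=1):
    """Zero-padded 'same' 2D cross-correlation with a dilated square kernel."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    half = k // 2  # assumes an odd kernel size
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    # Kernel taps are spaced `dilation` pixels apart.
                    yy = y + (i - half) * dilation
                    xx = x + (j - half) * dilation
                    if 0 <= yy < h and 0 <= xx < w:  # zero padding outside
                        acc += kernel[i][j] * image[yy][xx]
            out[y][x] = acc
    return out

def receptive_field(kernel_size, dilation):
    """Side length of the area covered by one dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)

def fuse_columns(columns):
    """Fuse same-sized feature maps by element-wise averaging (assumed rule)."""
    n = len(columns)
    h, w = len(columns[0]), len(columns[0][0])
    return [[sum(c[y][x] for c in columns) / n for x in range(w)]
            for y in range(h)]

def count_from_density(density):
    """Predicted count taken as the sum over the density map."""
    return sum(sum(row) for row in density)

def mae(pred_counts, true_counts):
    """Mean absolute error between predicted and true counts."""
    errors = [abs(p - t) for p, t in zip(pred_counts, true_counts)]
    return sum(errors) / len(errors)

def mse(pred_counts, true_counts, root=True):
    """Mean squared error; root=True returns its square root instead."""
    squared = [(p - t) ** 2 for p, t in zip(pred_counts, true_counts)]
    mean_sq = sum(squared) / len(squared)
    return math.sqrt(mean_sq) if root else mean_sq

# Toy input: a 7x7 map with a single annotated head at the centre.
image = [[0.0] * 7 for _ in range(7)]
image[3][3] = 1.0
box = [[1 / 9] * 3 for _ in range(3)]  # fixed 3x3 averaging kernel

# Three columns with the same kernel size but different dilation rates.
columns = [dilated_conv2d(image, box, d) for d in (1, 2, 3)]
fused = fuse_columns(columns)
print("receptive fields:", [receptive_field(3, d) for d in (1, 2, 3)])
print("count from fused map:", count_from_density(fused))
```

The same 3x3 kernel covers a 3, 5, or 7 pixel window as the dilation rate grows, which is how dilated columns capture several scales without adding parameters.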
