4.7 Article

Hybrid attention network based on progressive embedding scale-context for crowd counting

Journal

INFORMATION SCIENCES
Volume 591, Pages 306-318

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.ins.2022.01.046

Keywords

Crowd counting; Hybrid attention; Progressive embedding scale-context; Density map estimation

Funding

  1. National Natural Science Foundation of China [61971073]

This paper proposes a hybrid attention network (HAN) that addresses background noise and head scale variation by incorporating progressive embedding scale context information. By building parallel spatial attention and channel attention modules and embedding scale-context in the network, the proposed method focuses on the human head area and reduces counting errors caused by perspective and head scale variation. A progressive learning strategy is introduced by cascading multiple hybrid attention modules to gradually integrate different scale-context information.
Existing crowd counting methods usually adopt attention mechanisms to tackle background noise, or apply multilevel features or multiscale context fusion to tackle scale variation. However, these approaches deal with the two problems separately. In this paper, we propose a hybrid attention network (HAN) that employs progressive embedding scale-context (PES) information, which enables the network to simultaneously suppress noise and adapt to head scale variation. We build the hybrid attention mechanism from parallel spatial attention and channel attention modules, which makes the network focus more on the human head area and reduces the interference of background objects. In addition, we embed a specific scale-context into the hybrid attention along the spatial and channel dimensions to alleviate the counting errors caused by variation in perspective and head scale. Finally, we propose a progressive learning strategy that cascades multiple hybrid attention modules, each embedding a different scale-context, which gradually integrates different scale-context information into the current feature map from global to local. Ablation experiments show that the network architecture can gradually learn multiscale features and suppress background noise. Extensive experiments demonstrate that HANet obtains state-of-the-art counting performance on five mainstream datasets. (c) 2022 Elsevier Inc. All rights reserved.
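
The abstract describes three ideas: parallel spatial and channel attention, a scale-context embedded into both branches, and a cascade that moves from global to local context. The following is a minimal NumPy sketch of how such a pipeline could be wired together, not the authors' implementation. Everything specific in it is an assumption made for illustration: the function names (scale_context, hybrid_attention, progressive_cascade), the use of grid average pooling as the scale-context, the averaging of the two branches, the grid sizes (1, 2, 4), and the random matrices that stand in for learned weights.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def scale_context(feat, grid):
    """Average-pool a (C, H, W) map over a grid x grid layout and
    upsample back to (C, H, W) by repetition.

    grid=1 yields global context; larger grids yield more local context.
    Assumption: grid pooling is only a stand-in for the paper's scale-context.
    """
    c, h, w = feat.shape
    assert h % grid == 0 and w % grid == 0, "H and W must be divisible by grid"
    bh, bw = h // grid, w // grid
    pooled = feat.reshape(c, grid, bh, grid, bw).mean(axis=(2, 4))  # (C, grid, grid)
    return np.repeat(np.repeat(pooled, bh, axis=1), bw, axis=2)     # (C, H, W)


def hybrid_attention(feat, grid, w_channel, w_spatial):
    """Apply parallel channel and spatial attention driven by one scale-context.

    w_channel: (C, C) hypothetical channel-mixing weights.
    w_spatial: (C,)   hypothetical weights collapsing channels to one map.
    """
    ctx = scale_context(feat, grid)
    # Channel branch: per-channel weights computed from the context of each region.
    ch_att = sigmoid(np.einsum("dc,chw->dhw", w_channel, ctx))      # (C, H, W)
    # Spatial branch: one weight per pixel, shared across channels.
    sp_att = sigmoid(np.tensordot(w_spatial, ctx, axes=1))          # (H, W)
    # Assumption: the two parallel branches are merged by simple averaging.
    return 0.5 * (feat * ch_att + feat * sp_att[None, :, :])


def progressive_cascade(feat, grids=(1, 2, 4), seed=0):
    """Cascade hybrid attention modules from global (grid=1) to local context."""
    rng = np.random.default_rng(seed)
    c = feat.shape[0]
    for grid in grids:
        # Random weights stand in for parameters that would be learned.
        w_channel = rng.normal(scale=0.1, size=(c, c))
        w_spatial = rng.normal(scale=0.1, size=c)
        feat = hybrid_attention(feat, grid, w_channel, w_spatial)
    return feat


if __name__ == "__main__":
    features = np.random.default_rng(1).random((8, 16, 16))  # toy (C, H, W) feature map
    refined = progressive_cascade(features)
    print(refined.shape)
```

Because both attention maps pass through a sigmoid, each stage in this sketch can only attenuate feature responses, which mirrors the stated goal of suppressing background regions. In a trained network the weights would be learned jointly with the density map regressor rather than sampled at random.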
