☆ 3.8 Proceedings Paper

An End-to-End Transformer Model for Crowd Localization

COMPUTER VISION - ECCV 2022, PT I (2022)

Journal

COMPUTER VISION - ECCV 2022, PT I

Volume 13661, Issue -, Pages 38-54

Publisher

SPRINGER INTERNATIONAL PUBLISHING AG

DOI: 10.1007/978-3-031-19769-7_3

Keywords

Crowd localization; Crowd counting; Transformer

Categories

Computer Science, Artificial Intelligence; Imaging Science & Photographic Technology

Funding

National Key R&D Program of China [2018YFB1004602]

Abstract

In this paper, an elegant, end-to-end Crowd Localization Transformer (CLTR) is proposed for the task of crowd localization. The method treats crowd localization as a direct set prediction problem and introduces a KMO-based Hungarian matcher to reduce ambiguous points and produce more reasonable matching results. Extensive experiments demonstrate the effectiveness of the proposed method.

Crowd localization, i.e., predicting head positions, is a more practical and higher-level task than simply counting. Existing methods employ pseudo-bounding boxes or pre-designed localization maps and rely on complex post-processing to obtain the head positions. In this paper, we propose an elegant, end-to-end Crowd Localization TRansformer named CLTR that solves the task in the regression-based paradigm. The proposed method views crowd localization as a direct set prediction problem, taking extracted features and trainable embeddings as input to the transformer decoder. To reduce ambiguous points and generate more reasonable matching results, we introduce a KMO-based Hungarian matcher, which adopts the nearby context as an auxiliary matching cost. Extensive experiments conducted on five datasets in various data settings show the effectiveness of our method. In particular, the proposed method achieves the best localization performance on the NWPU-Crowd, UCF-QNRF, and ShanghaiTech Part A datasets.
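
The matching step described in the abstract can be illustrated with a minimal sketch. The abstract states only that the matcher is Hungarian-based and uses nearby context as an auxiliary cost; the concrete cost below (L1 point distance, plus the difference in mean k-nearest-neighbour distance as the context term, minus the predicted confidence) is an assumption made for illustration, and the names `knn_mean_distance`, `kmo_hungarian_match`, `k`, and `context_weight` are hypothetical rather than taken from the paper's code.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def knn_mean_distance(points, k):
    """Mean Euclidean distance from each point to its k nearest neighbours
    within the same point set (a simple 'nearby context' descriptor)."""
    points = np.asarray(points, dtype=float)
    k = min(k, len(points) - 1)
    if k <= 0:
        # A single point has no neighbours; use zero context.
        return np.zeros(len(points))
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)  # exclude each point's distance to itself
    nearest = np.sort(dist, axis=1)[:, :k]
    return nearest.mean(axis=1)


def kmo_hungarian_match(pred_points, pred_conf, gt_points, k=4, context_weight=1.0):
    """One-to-one matching between predicted and ground-truth points.

    Assumed cost (illustrative, not confirmed by the abstract):
        L1 point distance
        + context_weight * |kNN context of prediction - kNN context of GT|
        - predicted confidence
    """
    pred_points = np.asarray(pred_points, dtype=float)
    gt_points = np.asarray(gt_points, dtype=float)
    pred_conf = np.asarray(pred_conf, dtype=float)

    # Pairwise L1 distance between every prediction and every ground-truth point.
    point_cost = np.abs(pred_points[:, None, :] - gt_points[None, :, :]).sum(axis=-1)

    # Auxiliary cost: mismatch between the local crowd density around each point.
    pred_ctx = knn_mean_distance(pred_points, k)
    gt_ctx = knn_mean_distance(gt_points, k)
    context_cost = np.abs(pred_ctx[:, None] - gt_ctx[None, :])

    cost = point_cost + context_weight * context_cost - pred_conf[:, None]

    # Hungarian-style optimal assignment; handles rectangular cost matrices.
    pred_idx, gt_idx = linear_sum_assignment(cost)
    return pred_idx, gt_idx


if __name__ == "__main__":
    predictions = np.array([[48.0, 52.0], [11.0, 9.0], [90.0, 90.0]])
    confidences = np.array([0.9, 0.8, 0.1])
    ground_truth = np.array([[10.0, 10.0], [50.0, 50.0]])
    rows, cols = kmo_hungarian_match(predictions, confidences, ground_truth, k=1)
    for r, c in zip(rows, cols):
        print(f"prediction {r} -> ground truth {c}")
```

In a set prediction setup like this, the number of predictions usually exceeds the number of annotated heads, so the predictions left unmatched would typically be supervised as background.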
