☆ 4.7 Article

End-to-End Learning Deep CRF Models for Multi-Object Tracking Deep CRF Models

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (2021)

期刊

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY

卷 31, 期 1, 页码 275-288

出版社

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TCSVT.2020.2975842

关键词

Target tracking; Machine learning; Recurrent neural networks; Optimization; Task analysis; Standards; Inference algorithms; Multi-object tracking; end-to-end deep learning; conditional random field; data association

类别

Engineering, Electrical & Electronic

资金

National Natural Science Foundation of China [61671484, 61701548, 61906119]
Hubei Natural Science Foundation [2018CFB503]
Fundamental Research Funds for the Central Universities
South-Central University for Nationalities [CZT20003]
Shanghai Pujiang Program

向作者/读者索取更多资源

Protocol

社区支持

Reagent

社区支持

智能总结 New
摘要

This paper proposes utilizing deep conditional random field networks to address the assignment problem in multi-object tracking, optimizing unary and pairwise potentials jointly in an end-to-end learning process. Extensive experiments show that this approach outperforms existing methods on MOT datasets.

By bundling multiple complex sub-problems into a unified framework, end-to-end deep learning frameworks reduce the need for hand engineering or tuning of parameters for each component, and optimize different modules jointly to ensure the generalization of the whole deep architecture. Despite tremendous success in numerous computer vision tasks, end-to-end learnings for multi-object tracking (MOT), especially for the assignment problem in data association, have been surprisingly less investigated mainly due to the lack of available training data. Furthermore, it is challenging to discriminate target objects under mutual occlusions or to reduce identity switches in crowded scenes. To tackle these challenges, this paper proposes learning deep conditional random field (CRF) networks, aiming to model the assignment costs as unary potentials and the long-term dependencies among detection results as pairwise potentials. Specifically, we use a bidirectional long short-term memory (LSTM) network to encode the long-term dependencies. We pose the CRF inference as a recurrent neural network learning process using the standard gradient descent algorithm, where unary and pairwise potentials are jointly optimized in an end-to-end manner. Extensive experiments are conducted on the challenging MOT datasets including MOT15, MOT16 and MOT17, and the results show that the proposed algorithm performs favorably against the state-of-the-art methods.

End-to-End Learning Deep CRF Models for Multi-Object Tracking Deep CRF Models

期刊

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY

出版社

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

关键词

类别

资金

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

End-to-End Learning Deep CRF Models for Multi-Object Tracking Deep CRF Models

期刊

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY

出版社

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

关键词

类别

资金

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

导出引文

分享论文