4.7 Article

STGL: Spatial-Temporal Graph Representation and Learning for Visual Tracking

Journal

IEEE TRANSACTIONS ON MULTIMEDIA
Volume 23, Pages 2162-2171

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TMM.2020.3008035

Keywords

Target tracking; Computational modeling; Visualization; Noise measurement; Semisupervised learning; Shape; Feature extraction; Visual tracking; semi-supervised learning; graph representation; graph learning

Funding

  1. National Natural Science Foundation of China [61602001, 61872005, 61671018]
  2. NSFC Key Projects of International (Regional) Cooperation and Exchanges [61860206004]
  3. Open fund for Discipline Construction, Institute of Physical Science and Information Technology, Anhui University
  4. Cooperative Research Project Program of Nanjing Artificial Intelligence Chip Research, Institute of Automation, Chinese Academy of Sciences

The STGL model proposed in this paper aims to exploit both spatial and temporal structures of patches simultaneously in a unified graph representation and semi-supervised learning model. It naturally exploits the learned representation of the object in the previous frame, leading to more accurate and robust representation of the object in the current frame.
The tracking-by-detection framework is commonly adopted in visual tracking methods. It aims to localize the target object with a bounding box. However, a bounding box usually cannot describe the target object accurately and thus easily introduces noisy background information, which degrades the final tracking results. Recently, weighted patch representations of the object have been shown to be very effective at suppressing undesirable background information and can thus clearly improve tracking results. In this paper, we propose a novel Spatial-Temporal Graph representation and Learning (STGL) model to generate a robust target representation for the visual tracking problem. The main aspect of STGL is that it exploits both the spatial (within each frame) and temporal (between consecutive frames) structure of patches simultaneously in a unified graph representation and semi-supervised learning model. Compared with existing works, STGL naturally exploits the learned representation of the object in the previous frame and can thus obtain the representation of the object in the current frame more accurately and robustly. A new ADMM algorithm is derived to solve the proposed STGL model. Based on the proposed object representation, we then adapt the structured SVM by introducing scale estimation to achieve object tracking. Extensive experiments show that our method outperforms state-of-the-art patch-based tracking methods on two standard benchmark datasets.
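To make the idea of combining spatial and temporal structure in one semi-supervised model more concrete, here is a minimal sketch. It is not the STGL formulation or its ADMM solver. It assumes a much simpler quadratic objective: a graph smoothness term over neighbouring patches in the current frame, a fit to seed labels, and a penalty that ties each patch weight to its value in the previous frame. The function names, the 4-neighbour grid graph, and the parameters `mu` and `lam` are all hypothetical choices made for illustration.

```python
import numpy as np


def grid_adjacency(rows, cols):
    """4-neighbour adjacency for a rows x cols grid of patches (spatial edges only)."""
    n = rows * cols
    W = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:  # edge to the right-hand neighbour
                W[i, i + 1] = W[i + 1, i] = 1.0
            if r + 1 < rows:  # edge to the neighbour below
                W[i, i + cols] = W[i + cols, i] = 1.0
    return W


def patch_weights(W, seeds, prev_weights, mu=1.0, lam=0.5):
    """Minimise  w^T L w + mu*||w - seeds||^2 + lam*||w - prev_weights||^2.

    L is the graph Laplacian of W. Setting the gradient to zero gives the
    linear system (L + (mu + lam) I) w = mu*seeds + lam*prev_weights.
    This simplified objective is an assumption, not the model from the paper.
    """
    L = np.diag(W.sum(axis=1)) - W
    A = L + (mu + lam) * np.eye(len(W))
    b = mu * seeds + lam * prev_weights
    return np.linalg.solve(A, b)


if __name__ == "__main__":
    W = grid_adjacency(3, 3)
    seeds = np.zeros(9)
    seeds[4] = 1.0               # centre patch is assumed to lie on the object
    prev = seeds.copy()          # stand-in for weights learned in the previous frame
    print(patch_weights(W, seeds, prev).reshape(3, 3))
```

In this toy setting the temporal term acts as a second set of soft labels, so patches that were confidently on the object in the previous frame keep a high weight unless the spatial graph strongly disagrees. The model described in the abstract is solved with ADMM, which this closed-form sketch does not attempt to reproduce.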
