Article

Image to Video Person Re-Identification by Learning Heterogeneous Dictionary Pair With Feature Projection Matrix

Journal

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TIFS.2017.2765524

Keywords

Person re-identification; image to video person re-identification; heterogeneous dictionary pair learning; feature projection matrix; multi-view learning

Funding

  1. National Key Research and Development Program of China [2017YFB0202001]
  2. National Natural Science Foundation of China [61671182, 61672208, 61772220, 41571417, U1404618, 61272273, 61572375, 61233011, 91418202, 61472178, 61373038, 61672392]
  3. National Basic Research 973 Program of China [2014CB340702]
  4. Ministry of Science and Technology of China [2015BAK36B00, 2015BAK01B06]
  5. Key Science and Technology of Shenzhen [CXZZ20150814155434903]
  6. Key Program for International S&T Cooperation Projects of China [2016YFE0121200]
  7. Natural Science Foundation of Jiangsu Province [BK20170900]
  8. Scientific Research Starting Foundation for Introduced Talents in NJUPT under NUPTSF [NY217009]
  9. Science and Technology Program in Henan Province [1721102410064, 172102210186]
  10. Research Foundation of Henan University [2015YBZR024]


Person re-identification plays an important role in video surveillance and forensics applications. In many cases, person re-identification must be conducted between an image and a video clip, e.g., re-identifying a suspect in a large collection of pedestrian videos given a single image of the suspect. We call re-identification in this scenario image to video person re-identification (IVPR). In practice, images and videos are usually represented with different features, and there are usually large variations between the frames within each video. These factors make matching between image and video a very challenging task. In this paper, we propose a joint feature projection matrix and heterogeneous dictionary pair learning (PHDL) approach for IVPR. Specifically, PHDL jointly learns an intra-video projection matrix and a pair of heterogeneous image and video dictionaries. The learned projection matrix reduces the influence of within-video variations on the matching. The learned dictionary pair transforms the heterogeneous image and video features into coding coefficients of the same dimension, so that matching can be conducted on the coding coefficients. Furthermore, to ensure that the obtained coding coefficients are sufficiently discriminative, PHDL includes a point-to-set coefficient discriminant term. To make better use of the complementary spatial-temporal and visual appearance information contained in pedestrian video data, we further propose a multi-view PHDL approach, which can fuse different video information effectively in the dictionary learning process. Experiments on four publicly available person sequence data sets demonstrate the effectiveness of the proposed approaches.
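The matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the objective or the learning procedure, so the sketch assumes the dictionaries `D_img` and `D_vid` and the projection matrix `W` are already available (random placeholders here), uses ridge-regularized coding as a stand-in for whatever coding the paper uses, and compares the image code with the mean of a video's frame codes as one simple point-to-set distance. All function and parameter names are hypothetical.

```python
import numpy as np

def ridge_code(D, X, lam=0.1):
    """Coefficients A minimizing ||X - D A||_F^2 + lam * ||A||_F^2 (assumed coding rule)."""
    k = D.shape[1]
    return np.linalg.solve(D.T @ D + lam * np.eye(k), D.T @ X)

def match_image_to_videos(x_img, videos, D_img, D_vid, W, lam=0.1):
    """Return (best gallery index, distances) for one probe image.

    x_img:  (d_img,) image feature
    videos: list of (d_vid, n_frames) frame-feature matrices
    D_img:  (d_img, k) image dictionary, D_vid: (p, k) video dictionary
    W:      (p, d_vid) intra-video projection matrix
    """
    a_img = ridge_code(D_img, x_img[:, None], lam)[:, 0]
    dists = []
    for V in videos:
        A = ridge_code(D_vid, W @ V, lam)  # (k, n_frames), same dimension k as a_img
        # Assumption: summarize the set of frame codes by its mean.
        dists.append(float(np.linalg.norm(a_img - A.mean(axis=1))))
    return int(np.argmin(dists)), dists

def synthetic_demo(seed=0, n_people=5, k=8, d_img=20, d_vid=30, n_frames=6):
    """Toy data in which each person's image and video share one latent code."""
    rng = np.random.default_rng(seed)
    D_img = rng.standard_normal((d_img, k))   # placeholder, not learned
    D_vid = rng.standard_normal((d_vid, k))   # placeholder, not learned
    W = np.eye(d_vid)                         # placeholder projection
    codes = rng.standard_normal((k, n_people))
    videos = [
        D_vid @ codes[:, [i]] + 0.01 * rng.standard_normal((d_vid, n_frames))
        for i in range(n_people)
    ]
    images = D_img @ codes
    return [
        match_image_to_videos(images[:, i], videos, D_img, D_vid, W, lam=1e-3)[0]
        for i in range(n_people)
    ]
```

Calling `synthetic_demo()` returns, for each synthetic probe image, the index of the gallery video whose coefficients lie closest. The point of the sketch is only that image features of dimension `d_img` and video features of dimension `d_vid` become comparable once both are expressed as `k`-dimensional coding coefficients.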
