Journal
IEEE TRANSACTIONS ON IMAGE PROCESSING
Volume 24, Issue 11, Pages 3781-3795
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TIP.2015.2456412
Keywords
Human action recognition; trajectory; motion; representation; reference points; camera motion
Funding
- National 863 Program of China [2014AA015101]
- National Science Foundation of China [61201387]
- Science and Technology Commission of Shanghai Municipality, China [13PJ1400400]
- EU FP7 QUICK Project [PIRSES-GA-2013-612652]
Human action recognition in unconstrained videos is a challenging problem with many applications. Most state-of-the-art approaches adopt the well-known bag-of-features representations, generated from isolated local patches or patch trajectories, in which motion patterns such as object-object and object-background relationships are mostly discarded. In this paper, we propose a simple representation aimed at modeling these motion relationships. We adopt global and local reference points to explicitly characterize motion information, so that the final representation is more robust to camera movements, which are common in unconstrained videos. Our approach operates on top of visual codewords generated from dense local patch trajectories and therefore does not require foreground-background separation, which is normally a critical and difficult step in modeling object relationships. Through an extensive set of experimental evaluations, we show that the proposed representation achieves very competitive performance on several challenging benchmark data sets. Further combining it with the standard bag-of-features or Fisher vector representations can lead to substantial improvements.
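The abstract does not say how the reference points are computed, so the following is only a toy sketch of the general idea that measuring trajectory motion relative to a reference makes it less sensitive to camera movement. The function name `relative_displacements` and the choice of a per-axis median as the global reference are assumptions made for illustration, not the paper's method.

```python
from statistics import median


def relative_displacements(displacements):
    """Express each trajectory's (dx, dy) relative to a global reference motion.

    Assumption: the reference is the per-axis median displacement over all
    trajectories. This is an illustrative stand-in, not the paper's definition
    of a reference point.
    """
    ref_dx = median(dx for dx, _ in displacements)
    ref_dy = median(dy for _, dy in displacements)
    return [(dx - ref_dx, dy - ref_dy) for dx, dy in displacements]


# Three static background trajectories and one moving object.
scene = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0), (2.0, 1.0)]

# The same scene under a panning camera: every trajectory gets the same shift.
panned = [(dx + 5.0, dy - 3.0) for dx, dy in scene]

# The relative motion is identical with and without the camera pan.
print(relative_displacements(scene) == relative_displacements(panned))
```

In this toy setting a uniform camera shift cancels out, and the object's motion relative to the background is recovered without separating foreground from background first.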