Article

Holographic Feature Learning of Egocentric-Exocentric Videos for Multi-Domain Action Recognition

Journal

IEEE TRANSACTIONS ON MULTIMEDIA
Volume 24, Issue -, Pages 2273-2286

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TMM.2021.3078882

Keywords

Videos; Feature extraction; Visualization; Task analysis; Computational modeling; Target recognition; Prototypes; Egocentric videos; exocentric videos; holographic feature; multi-domain; action recognition

Funding

  1. National Key Research and Development Program of China [2018AAA0100604]
  2. National Natural Science Foundation of China [61720106006, 61721004, 62072455, U1836220, U1705262, 61872424]
  3. Key Research Program of Frontier Sciences of CAS [QYZDJ-SSW-JSC039]
  4. Beijing Natural Science Foundation [L201001]

Abstract

This paper proposes a method for multi-domain action recognition on egocentric-exocentric videos, learning a single model by transferring knowledge between the two domains. Videos are mapped into a global feature space that combines view-invariant and view-specific visual knowledge.
Although existing cross-domain action recognition methods successfully improve performance on videos of one view (e.g., egocentric videos) by transferring knowledge from videos of another view (e.g., exocentric videos), their generality is limited because the source and target domains must be fixed beforehand. In this paper, we address the more practical task of multi-domain action recognition on egocentric-exocentric videos, which aims to learn a single model that recognizes test videos from either the egocentric or the exocentric perspective by transferring knowledge between the two domains. Although previous cross-domain methods can also transfer knowledge from one domain to another by learning view-invariant representations of the two video domains, they are ill-suited to multi-domain action recognition because they lose view-specific visual information. As a solution, we propose to map a video from either perspective into a global feature space (which we call a holographic feature space) that shares both the view-invariant and the view-specific visual knowledge of the two views. Specifically, we decompose the video feature into a view-invariant component and a view-specific component, where the view-specific component is written into memory networks that preserve view-specific visual knowledge. The final holographic feature combines the view-invariant feature with the view-specific features of both views retrieved from the memory networks. Extensive experiments on two public datasets demonstrate the effectiveness of the proposed method, and strong performance under the semi-supervised setting shows the generality of our model.
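The decompose-store-fuse idea in the abstract can be sketched in code. The following is a minimal, hypothetical PyTorch sketch, not the authors' released implementation: the module structure, the dimensions (feat_dim, mem_slots), and the soft-attention memory read are all assumptions made only to illustrate how a view-invariant component and per-view memory banks could be combined into a single holographic feature.

```python
# Hypothetical sketch (assumed names and design, not the paper's code) of the
# holographic feature construction described in the abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HolographicFeature(nn.Module):
    """Decomposes a video feature into view-invariant and view-specific
    parts, keeps a memory bank per view for view-specific knowledge, and
    fuses everything into one holographic feature."""

    def __init__(self, feat_dim: int = 512, mem_slots: int = 64):
        super().__init__()
        # Separate projections split the backbone feature into components.
        self.invariant_proj = nn.Linear(feat_dim, feat_dim)
        self.specific_proj = nn.Linear(feat_dim, feat_dim)
        # One learnable memory bank per view (0: egocentric, 1: exocentric)
        # intended to accumulate view-specific visual knowledge in training.
        self.memories = nn.ParameterList(
            [nn.Parameter(torch.randn(mem_slots, feat_dim)) for _ in range(2)]
        )

    def read_memory(self, query: torch.Tensor, view: int) -> torch.Tensor:
        # Soft-attention read: weight memory slots by similarity to the query.
        attn = F.softmax(query @ self.memories[view].t(), dim=-1)
        return attn @ self.memories[view]

    def forward(self, video_feat: torch.Tensor) -> torch.Tensor:
        # video_feat: (batch, feat_dim) from either view's backbone.
        invariant = self.invariant_proj(video_feat)
        specific = self.specific_proj(video_feat)
        # Query BOTH views' memories so the fused feature carries the
        # view-specific knowledge of the two domains, per the abstract.
        ego_specific = self.read_memory(specific, view=0)
        exo_specific = self.read_memory(specific, view=1)
        # Holographic feature: view-invariant part plus view-specific
        # parts of both views.
        return torch.cat([invariant, ego_specific, exo_specific], dim=-1)
```

In this reading, a video from either perspective is mapped into the same fused space, so one classifier head can serve both egocentric and exocentric test videos, which is the multi-domain setting the paper targets.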
