☆ 4.7 Article

RESTEP Into the Future: Relational Spatio-Temporal Learning for Multi-Person Action Forecasting

IEEE TRANSACTIONS ON MULTIMEDIA (2023)

Journal

IEEE TRANSACTIONS ON MULTIMEDIA

Volume 25, Issue -, Pages 1954-1963

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TMM.2021.3088303

Keywords

Multi-person action forecasting; spatiotemporal dependencies; graph neural network; weakly-supervised learning

Categories

Computer Science, Information Systems; Computer Science, Software Engineering; Telecommunications

Summary

Multi-person action forecasting is a crucial step in video understanding. This paper proposes RElational Spatio-TEmPoral learning (RESTEP), which combines spatial and temporal information in a single pass through relational reasoning and so predicts the actions of all actors in a scene simultaneously. Experiments show that RESTEP outperforms existing methods on multiple datasets.

Abstract

Multi-person action forecasting is an emerging topic in the computer vision field, and it is a pivotal step toward video understanding at a semantic level. This task is difficult due to the complexity of spatial and temporal dependencies. Yet, the state-of-the-art literature does not seem to be adequately responsive to this challenge. Hence, how to better foresee the forthcoming actions per actor has to be further pursued. Toward this end, we put forth a novel RElational Spatio-TEmPoral learning approach (RESTEP) for multi-person action forecasting. Our RESTEP explores the key that inherently characterizes actions from a perspective of incorporating the spatial and temporal information in a single pass (spatio-temporal dependencies) by extending relational reasoning. As a result, the RESTEP enables simultaneously predicting the actions of all actors in the scene. Our proposal significantly differs from mainstream works that heavily rely on independently processing the spatial and temporal dependencies. The proposed RESTEP first perceives a graph building upon the historical observations, then reasons the relational spatio-temporal context to extrapolate future actions. In order to augment the comprehension of individual actions that might vary over time, we further delve deeper into the essence behind this point - the evolution of spatio-temporal dependencies via optimizing the corresponding mutual information. We assess the RESTEP method on the large-scale Atomic Visual Actions (AVA) dataset, Activities in Extended Videos (ActEV/VIRAT) dataset and Joint-annotated Human Motion Data Base (J-HMDB). The experimental outcomes reveal that RESTEP can introduce considerable improvements with respect to recent leading studies.
