Journal
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
Volume -, Issue -, Pages -
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TNNLS.2022.3175480
Keywords
Task analysis; Activity recognition; Solid modeling; Computational modeling; Computer architecture; Object oriented modeling; Learning systems; Activity recognition and anticipation; multiscale behavior modeling; multitask learning; two-stream network fusion
Funding
- Amazon AWS through the Oxford-Singapore Human-Machine Collaboration Program
This study proposes a two-stream multiscale human activity recognition and anticipation network that is jointly optimized using multitask learning. Its two streams are fused with a temporal-channel attention approach to improve the model's representational ability for both temporal and spatial features.
Deep convolutional neural networks have been leveraged to achieve huge improvements in video understanding and human activity recognition performance in the past decade. However, most existing methods focus on activities that have similar time scales, leaving the task of action recognition on multiscale human behaviors less explored. In this study, a two-stream multiscale human activity recognition and anticipation (MS-HARA) network is proposed, which is jointly optimized using a multitask learning method. The MS-HARA network fuses the two streams of the network using an efficient temporal-channel attention (TCA)-based fusion approach to improve the model's representational ability for both temporal and spatial features. We investigate the multiscale human activities from two basic categories, namely, midterm activities and long-term activities. The network is designed to function as part of a real-time processing framework to support interaction and mutual understanding between humans and intelligent machines. It achieves state-of-the-art results on several datasets for different tasks and different application domains. The midterm and long-term action recognition and anticipation performance, as well as the network fusion, are extensively tested to show the efficiency of the proposed network. The results show that the MS-HARA network can easily be extended to different application domains.