Multi-Scale Structure-Aware Network for Weakly Supervised Temporal Action Detection

IEEE TRANSACTIONS ON IMAGE PROCESSING (2021)

Journal

IEEE TRANSACTIONS ON IMAGE PROCESSING

Volume 30, Issue -, Pages 5848-5861

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TIP.2021.3089361

Keywords

Proposals; Feature extraction; Image segmentation; Scalability; Noise measurement; Graph neural networks; GSM; Weakly supervised; action detection; multi-scale; structure-aware

Funding

National Natural Science Foundation of China [62022078, 62021001]
Open Project Program of the National Laboratory of Pattern Recognition (NLPR) [202000019]
Youth Innovation Promotion Association CAS [2018166]

Automated Summary

This paper proposes an end-to-end Multi-Scale Structure-Aware Network (MSA-Net) for weakly supervised temporal action detection, which exploits both global and local structure information to learn discriminative structure-aware representations for robust and complete action detection. Extensive experimental results on two benchmark datasets demonstrate that MSA-Net outperforms state-of-the-art methods.

Abstract

Weakly supervised temporal action detection has better scalability and practicability than fully supervised action detection in real-world deployment. However, it is difficult to learn a robust model without temporal action boundary annotations. In this paper, we propose an end-to-end Multi-Scale Structure-Aware Network (MSA-Net) for weakly supervised temporal action detection by exploring both the global structure information of a video and the local structure information of actions. The proposed MSA-Net enjoys several merits. First, to localize actions with different durations, each video is encoded into feature representations with different temporal scales. Second, based on the multi-scale feature representation, the proposed model employs two structure modeling mechanisms, global structure modeling and local structure modeling, which can effectively learn discriminative structure-aware representations for robust and complete action detection. To the best of our knowledge, this is the first work to fully explore the global and local structure information in a unified deep model for weakly supervised action detection. Extensive experimental results on two benchmark datasets demonstrate that the proposed MSA-Net performs favorably against state-of-the-art methods.
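The idea of encoding a video at several temporal scales can be illustrated with a small sketch. The abstract does not say how MSA-Net builds its multi-scale features, so the code below is only an assumption for illustration: it uses non-overlapping temporal average pooling over per-snippet feature vectors, and the names `temporal_average_pool`, `multi_scale_encode`, and the window sizes `(1, 2, 4)` are hypothetical, not taken from the paper.

```python
from typing import Dict, List, Sequence

Feature = List[float]


def temporal_average_pool(features: Sequence[Feature], window: int) -> List[Feature]:
    """Average consecutive, non-overlapping groups of `window` snippet features.

    Assumption for illustration only: the paper may use a different
    (e.g. learned) operator to obtain coarser temporal scales.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    pooled: List[Feature] = []
    for start in range(0, len(features), window):
        chunk = features[start:start + window]  # last chunk may be shorter
        dim = len(chunk[0])
        pooled.append([sum(f[d] for f in chunk) / len(chunk) for d in range(dim)])
    return pooled


def multi_scale_encode(
    features: Sequence[Feature], windows: Sequence[int] = (1, 2, 4)
) -> Dict[int, List[Feature]]:
    """Return one feature sequence per temporal scale, keyed by window size."""
    return {w: temporal_average_pool(features, w) for w in windows}


# Toy video: 8 snippets with 2-dimensional features.
snippets = [[float(t), float(2 * t)] for t in range(8)]
pyramid = multi_scale_encode(snippets)

for window, feats in pyramid.items():
    print(f"window={window}: {len(feats)} temporal positions")
```

Coarser scales cover longer time spans per position, which is one plausible way to let a detector match both short and long actions; a structure modeling step would then operate on each scale's sequence.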
