☆ 4.7 Article

AdapNet: Adaptability Decomposing Encoder-Decoder Network for Weakly Supervised Action Recognition and Localization

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2023)

Journal

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS

Volume 34, Issue 4, Pages 1852-1863

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TNNLS.2019.2962815

Keywords

Videos; Feature extraction; Knowledge transfer; Task analysis; Adaptation models; Data models; Neural networks; Action recognition; encoder-decoder; knowledge transfer; point process; temporal action localization

Ask authors/readers for more resources

Protocol

Community support

Reagent

Community support

Automated Summary New
Abstract

This article proposes a novel adaptability decomposing encoder-decoder network to transfer reliable knowledge between trimmed and untrimmed videos for action recognition and localization. By decomposing the original features into domain-adaptable and domain-specific ones, trim- untrimmed knowledge transfer can be safely confined within a more coherent subspace.

The point process is a solid framework to model sequential data, such as videos, by exploring the underlying relevance. As a challenging problem for high-level video understanding, weakly supervised action recognition and localization in untrimmed videos have attracted intensive research attention. Knowledge transfer by leveraging the publicly available trimmed videos as external guidance is a promising attempt to make up for the coarse-grained video-level annotation and improve the generalization performance. However, unconstrained knowledge transfer may bring about irrelevant noise and jeopardize the learning model. This article proposes a novel adaptability decomposing encoder-decoder network to transfer reliable knowledge between the trimmed and untrimmed videos for action recognition and localization by bidirectional point process modeling, given only video-level annotations. By decomposing the original features into the domain-adaptable and domain-specific ones based on their adaptability, trimmed-untrimmed knowledge transfer can be safely confined within a more coherent subspace. An encoder-decoder-based structure is carefully designed and jointly optimized to facilitate effective action classification and temporal localization. Extensive experiments are conducted on two benchmark data sets (i.e., THUMOS14 and ActivityNet1.3), and the experimental results clearly corroborate the efficacy of our method.

AdapNet: Adaptability Decomposing Encoder-Decoder Network for Weakly Supervised Action Recognition and Localization

Journal

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

Keywords

Categories

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

AdapNet: Adaptability Decomposing Encoder-Decoder Network for Weakly Supervised Action Recognition and Localization

Journal

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

Keywords

Categories

Ask authors/readers for more resources

Protocol

Reagent

Authors

I am an author on this paper

Reviews

Primary Rating

Secondary Ratings

Novelty

Significance

Scientific rigor

Rate this paper

Recommended

Export Citation

Share Paper