Article

Active learning with effective scoring functions for semi-supervised temporal action localization

Related references

Note: Only some of the references are listed.
Article Computer Science, Artificial Intelligence

Equivalent Classification Mapping for Weakly Supervised Temporal Action Localization

Tao Zhao et al.

Summary: Weakly supervised temporal action localization has been widely studied in recent years, with existing methods falling into pre-classification and post-classification pipelines. This study proposes a unified framework that learns both pipelines simultaneously using two parallel network streams and a shared classifier. The Equivalent Classification Mapping (ECM) mechanism is introduced to achieve accurate action localization results. The framework also incorporates a weight-transition module and an equivalent training strategy to thoroughly mine the equivalence mechanism. Comprehensive experiments on three benchmarks demonstrate the effectiveness of ECM.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2023)

Article Computer Science, Artificial Intelligence

Human Action Recognition From Various Data Modalities: A Review

Zehua Sun et al.

Summary: Human Action Recognition (HAR) aims to understand human behavior and assign labels to actions. Various data modalities, such as RGB, skeleton, depth, infrared, point cloud, event stream, audio, acceleration, radar, and WiFi signal, can be used to represent human actions. Many studies have investigated different approaches for HAR using these modalities. This article presents a comprehensive survey of recent progress in deep learning methods for HAR based on input data modality, covering single and multiple modalities and fusion-based and co-learning-based frameworks.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2023)

Article Engineering, Electrical & Electronic

Exploring Sub-Action Granularity for Weakly Supervised Temporal Action Localization

Binglu Wang et al.

Summary: This work proposes modeling cross-video relationships at sub-action granularity for weakly supervised temporal action localization. Representing video features through a group of sub-actions (a sub-action family) shared among all videos in the dataset eliminates the need for complicated sampling strategies. An additional top-down branch and a consistency loss are introduced to learn feature vectors and guide the learning process. Experimental results on benchmark datasets demonstrate the high performance of the proposed method.

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (2022)

Article Engineering, Electrical & Electronic

Learning Video Moment Retrieval Without a Single Annotated Video

Junyu Gao et al.

Summary: This paper proposes an alternative approach to video moment retrieval that does not require textual annotations of videos. Instead, it leverages existing visual concept detectors and a pre-trained image-sentence embedding space. By combining these with a video-conditioned sentence generator and a GNN-based relation-aware moment localizer, the proposed method achieves effective video moment retrieval.

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (2022)

Article Computer Science, Artificial Intelligence

Background-Click Supervision for Temporal Action Localization

Le Yang et al.

Summary: This study proposes a novel method called BackTAL, which converts action-click supervision to background-click supervision, and trains a stronger action localizer on background video frames. BackTAL implements two-fold modeling on the background frames and dynamically attends to informative neighbors during temporal convolution. Extensive experiments demonstrate the high performance of BackTAL and the rationality of the proposed background-click supervision.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2022)

Article Computer Science, Theory & Methods

A Survey of Deep Active Learning

Pengzhen Ren et al.

Summary: Researchers have shown relatively lower interest in active learning compared to deep learning, but with the increasing demand for large-scale high-quality annotated datasets, active learning is receiving more attention. This article provides a comprehensive survey on deep active learning, including a formal classification method, an overview of existing work, and an analysis of developments from an application perspective.

ACM COMPUTING SURVEYS (2022)

Proceedings Paper Computer Science, Artificial Intelligence

Semi-supervised Temporal Action Detection with Proposal-Free Masking

Sauradip Nag et al.

Summary: Existing temporal action detection methods rely heavily on annotated training data, which is expensive to collect. Semi-supervised temporal action detection (SS-TAD) addresses this issue by utilizing unlabeled videos. However, SS-TAD is a challenging and under-studied problem. In this work, we propose a novel SS-TAD model called SPOT, which eliminates the dependence between localization and classification through a parallel architecture and introduces an interaction mechanism and self-supervised pre-training. Our experiments demonstrate that SPOT outperforms state-of-the-art alternatives by a large margin.

COMPUTER VISION - ECCV 2022, PT III (2022)

Proceedings Paper Computer Science, Artificial Intelligence

Weakly Supervised Temporal Action Localization with Segment-Level Labels

Xinpeng Ding et al.

Summary: The paper introduces a new segment-level supervision setting and proposes a local segment loss and propagation loss to address the temporal action localization problem.

PATTERN RECOGNITION AND COMPUTER VISION, PT I (2021)

Article Computer Science, Artificial Intelligence

Multi-Hierarchical Category Supervision for Weakly-Supervised Temporal Action Localization

Guozhang Li et al.

Summary: A novel supervision method named multi-hierarchical category supervision (MHCS) is introduced in this paper to encourage the model to pay more attention to common sub-actions rather than only discriminative ones in Weakly Supervised Temporal Action Localization (WTAL). By constructing super-classes through hierarchical clustering, the MHCS method improves the performance of WTAL models by finding more sub-actions shared among action categories.

IEEE TRANSACTIONS ON IMAGE PROCESSING (2021)

Article Computer Science, Artificial Intelligence

KFC: An Efficient Framework for Semi-Supervised Temporal Action Localization

Xinpeng Ding et al.

Summary: This paper proposes a method named K-farthest crossover for semi-supervised learning in temporal action localization (TAL), which constructs perturbations based on video features. The method adds perturbations to features along the temporal axis and adopts CR to encourage the model to retain the observed similarities and differences between features.

IEEE TRANSACTIONS ON IMAGE PROCESSING (2021)

Proceedings Paper Acoustics

Regression Before Classification for Temporal Action Detection

Cece Jin et al.

2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (2020)

Article Computer Science, Information Systems

Temporal Action Localization in Untrimmed Videos Using Action Pattern Trees

Hao Song et al.

IEEE TRANSACTIONS ON MULTIMEDIA (2019)

Proceedings Paper Computer Science, Artificial Intelligence

Active Learning for Deep Detection Neural Networks

Hamed H. Aghdam et al.

2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019) (2019)

Proceedings Paper Computer Science, Artificial Intelligence

Learning Temporal Action Proposals With Fewer Labels

Jingwei Ji et al.

2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019) (2019)

Proceedings Paper Computer Science, Software Engineering

Decoupling Localization and Classification in Single Shot Temporal Action Detection

Yupan Huang et al.

2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME) (2019)

Article Engineering, Electrical & Electronic

Cost-Effective Active Learning for Deep Image Classification

Keze Wang et al.

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (2017)

Proceedings Paper Computer Science, Artificial Intelligence

TURN TAP: Temporal Unit Regression Network for Temporal Action Proposals

Jiyang Gao et al.

2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV) (2017)

Proceedings Paper Computer Science, Artificial Intelligence

Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset

Joao Carreira et al.

30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017) (2017)

Proceedings Paper Computer Science, Artificial Intelligence

Soft-NMS - Improving Object Detection With One Line of Code

Navaneeth Bodla et al.

2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV) (2017)

Article Education & Educational Research

Does Active Learning Work? A Review of the Research

M Prince

JOURNAL OF ENGINEERING EDUCATION (2004)