Self-supervised temporal autoencoder for egocentric action segmentation

Related references

Note: Only part of the references are listed.
Article Computer Science, Artificial Intelligence

Egocentric Action Recognition by Automatic Relation Modeling

Haoxin Li et al.

Summary: This study proposes a weakly supervised model for egocentric action recognition, which automatically localizes interactors and establishes explicit relation models for recognition without using annotations or prior knowledge. Extensive experiments on egocentric video datasets demonstrate the effectiveness of the proposed method.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2023)

Article Computer Science, Artificial Intelligence

Multi-Dataset, Multitask Learning of Egocentric Vision Tasks

Georgios Kapidis et al.

Summary: In this paper, a multitask learning scheme is proposed to address the scarcity of labeled data in egocentric vision tasks like action recognition. Related tasks and datasets are incorporated into the training process, resulting in improved action recognition performance. To overcome the issue of different action labels across datasets, the multitask paradigm is extended to include datasets with different label sets. Experiments on multiple datasets demonstrate the effectiveness of the proposed approach and its ability to automatically discover cross-dataset task correlations.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2023)

Article Computer Science, Artificial Intelligence

Symbiotic Attention for Egocentric Action Recognition With Object-Centric Alignment

Xiaohan Wang et al.

Summary: In this paper, a framework called SAOA is proposed to tackle egocentric action recognition by suppressing background distractors and enhancing action-relevant interactions. The framework introduces two extra sources of information, spatial location and discriminative features of candidate objects, to enable concentration on the occurring interactions. It includes an object-centric feature alignment method and a symbiotic attention mechanism to provide meticulous reasoning between the actor and the environment, achieving state-of-the-art performance on the largest egocentric video dataset.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2023)

Article Computer Science, Artificial Intelligence

View-Wise Versus Cluster-Wise Weight: Which Is Better for Multi-View Clustering?

Shizhe Hu et al.

Summary: In this paper, a novel clustering algorithm, CURE, is proposed to automatically learn cluster weights and effectively utilize the complementary information of multi-view data, enhancing clustering performance.

IEEE TRANSACTIONS ON IMAGE PROCESSING (2022)

Article Computer Science, Artificial Intelligence

Egocentric Vision-based Action Recognition: A survey

Adrian Nunez-Marcos et al.

Summary: This article surveys the development and current state of the egocentric action recognition (EAR) field, including the growth of egocentric video data and the challenges of action recognition. A taxonomy is proposed to classify methods more accurately, a review of zero-shot approaches is provided, and datasets used by researchers are summarized.

NEUROCOMPUTING (2022)

Article Computer Science, Artificial Intelligence

Contrastive predictive coding with transformer for video representation learning

Yue Liu et al.

Summary: This paper presents a novel framework of self-supervised learning for video representation. The framework combines contrastive predictive coding and self-attention, and introduces the Transformer architecture to capture long-range spatio-temporal dependencies. The model achieves state-of-the-art self-supervised performance on UCF101 and HMDB51 datasets.

NEUROCOMPUTING (2022)

Article Automation & Control Systems

DMIB: Dual-Correlated Multivariate Information Bottleneck for Multiview Clustering

Shizhe Hu et al.

Summary: In this study, a novel method for multiview clustering is proposed, which can explore both interfeature correlations and intercluster correlations to improve clustering performance.

IEEE TRANSACTIONS ON CYBERNETICS (2022)

Article Automation & Control Systems

A survey of visual navigation: From geometry to embodied AI

Tianyao Zhang et al.

Summary: This paper emphasizes the importance of information extraction and understanding of unknown environments for mobile robots in navigation. It addresses the problem of traditional visual navigation methods and surveys more than 100 recent papers. Based on a thorough comparison, it categorizes visual navigation methods into two styles and provides mathematical formulations. The paper also discusses various issues and summarizes challenges and future trends.

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE (2022)

Article Automation & Control Systems

Efficient gesture recognition for the assistance of visually impaired people using multi-head neural networks

Samer Alashhab et al.

Summary: This research proposes an interactive system for visually impaired individuals to control mobile devices using hand gestures, allowing them to perform multiple tasks without switching applications. The system utilizes a multi-head neural network and a dataset of images to detect and classify hand gestures, achieving competitive results compared to state-of-the-art methods.

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE (2022)

Proceedings Paper Computer Science, Artificial Intelligence

How Severe Is Benchmark-Sensitivity in Video Self-supervised Learning?

Fida Mohammad Thoker et al.

Summary: This paper investigates the generalization capability of video self-supervised learning models and finds that the current benchmarks are not good indicators of generalization. The study also reveals that self-supervised methods lag behind supervised pre-training when domain shift is large and the amount of available samples is low.

COMPUTER VISION, ECCV 2022, PT XXXIV (2022)

Proceedings Paper Computer Science, Artificial Intelligence

Fast and Unsupervised Action Boundary Detection for Action Segmentation

Zexing Du et al.

Summary: This article proposes an efficient unsupervised action segmentation method by detecting boundaries, named action boundary detection (ABD), to handle the large number of untrimmed videos produced daily. The method has the advantages of no training stage and low-latency inference. By estimating similarities across smoothed frames, the boundary detection task is successfully transformed into change point detection based on similarity. Non-maximum suppression and a clustering algorithm are used to refine the initial proposals. The method achieves state-of-the-art performance and the best trade-off between accuracy and inference time compared to existing unsupervised approaches.

2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022) (2022)

Article Computer Science, Artificial Intelligence

Fast Weakly Supervised Action Segmentation Using Mutual Consistency

Yaser Souri et al.

Summary: This paper proposes a novel approach for weakly supervised action segmentation based on a two-branch neural network. The method predicts two redundant but different representations for action segmentation and introduces a novel mutual consistency loss to enforce the consistency between the two representations. The proposed approach achieves state-of-the-art accuracy while being significantly faster in both training and inference.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2022)

Proceedings Paper Computer Science, Artificial Intelligence

SSCAP: Self-supervised Co-occurrence Action Parsing for Unsupervised Temporal Action Segmentation

Zhe Wang et al.

Summary: This article introduces an unsupervised method called SSCAP for temporal action segmentation and classification in videos. SSCAP utilizes self-supervised learning to extract features and applies a co-occurrence action parsing algorithm to accurately identify and estimate the temporal paths of sub-actions in videos. It achieves state-of-the-art performance on multiple datasets.

2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022) (2022)

Article Computer Science, Artificial Intelligence

Predicting the future from first person (egocentric) vision: A survey

Ivan Rodin et al.

Summary: Methods for predicting the future from egocentric vision can have a significant impact on a range of applications. The survey points to the need for standardized tasks and for datasets that reflect real-world scenarios.

COMPUTER VISION AND IMAGE UNDERSTANDING (2021)

Article Computer Science, Information Systems

Multiple knowledge representation for big data artificial intelligence: framework, applications, and case studies

Yi Yang et al.

Summary: The paper introduces a multiple knowledge representation (MKR) framework and discusses its potential in developing big data artificial intelligence (AI) techniques. MKR makes current AI techniques more explainable and generalizable, while also expanding current AI techniques to facilitate the mutual benefits of different representations.

FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING (2021)

Proceedings Paper Computer Science, Artificial Intelligence

Temporal Action Segmentation from Timestamp Supervision

Zhe Li et al.

Summary: Recent advances in temporal action segmentation have been successful, but annotating videos with frame-wise labels is costly. This paper proposes using timestamp supervision, which requires comparable annotation effort to weakly supervised methods but provides a stronger signal. By training a segmentation model using only timestamp annotations, the approach effectively captures action changes and achieves performance comparable to fully supervised approaches.

2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 (2021)

Proceedings Paper Computer Science, Artificial Intelligence

Temporally-Weighted Hierarchical Clustering for Unsupervised Action Segmentation

M. Saquib Sarfraz et al.

Summary: The proposed method is a fully automatic, unsupervised approach to action segmentation in videos that uses a temporally-weighted hierarchical clustering algorithm to group semantically consistent frames. It does not require training and has achieved significant performance improvements on challenging action segmentation datasets.

2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 (2021)

Proceedings Paper Computer Science, Artificial Intelligence

Action Shuffle Alternating Learning for Unsupervised Action Segmentation

Jun Li et al.

Summary: This paper presents a method for unsupervised action segmentation using self-supervised learning and a Hidden Markov Model (HMM) to model action lengths. The proposed approach achieves superior results in action segmentation compared to state-of-the-art methods on various datasets.

2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 (2021)

Proceedings Paper Computer Science, Artificial Intelligence

Joint Visual-Temporal Embedding for Unsupervised Learning of Actions in Untrimmed Sequences

Rosaura G. VidalMata et al.

Summary: This study proposes an approach for the unsupervised learning of actions in untrimmed video sequences based on a joint visual-temporal embedding space, which allows relevant action clusters to be detected by combining a visual embedding with a continuous temporal function.

2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2021) (2021)

Proceedings Paper Computer Science, Artificial Intelligence

N2D: (Not Too) Deep Clustering via Clustering the Local Manifold of an Autoencoded Embedding

Ryan McConville et al.

Summary: Deep clustering algorithms, which combine representation learning with deep neural networks, have shown superior performance compared to conventional shallow clustering algorithms. By jointly optimizing a clustering and non-clustering loss, these algorithms achieve high-quality clustering results.

2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR) (2021)

Article Computer Science, Information Systems

Learning Representations for High-Dynamic-Range Image Color Transfer in a Self-Supervised Way

Yifei Huang et al.

Summary: This paper proposes an innovative high-dynamic-range image color transfer generative adversarial network (HDRCTGAN) that learns fine image representations through self-supervised learning for transferring colors from reference images to target images. The method requires only unlabeled HDR images for training, instead of supervised learning with many ground truth pairs, resulting in pleasing visual results.

IEEE TRANSACTIONS ON MULTIMEDIA (2021)

Article Automation & Control Systems

A comprehensive overview of smart wearables: The state of the art literature, recent advances, and future challenges

Naghmeh Niknejad et al.

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE (2020)

Article Computer Science, Artificial Intelligence

An Information Maximization Multi-task Clustering Method for egocentric temporal segmentation

Mingming Zhang et al.

APPLIED SOFT COMPUTING (2020)

Article Engineering, Electrical & Electronic

MUGGLE: MUlti-Stream Group Gaze Learning and Estimation

Ning Zhuang et al.

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (2020)

Article Computer Science, Artificial Intelligence

Temporal Segment Networks for Action Recognition in Videos

Limin Wang et al.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2019)

Article Computer Science, Artificial Intelligence

Deep Attention Network for Egocentric Action Recognition

Minlong Lu et al.

IEEE TRANSACTIONS ON IMAGE PROCESSING (2019)

Proceedings Paper Computer Science, Artificial Intelligence

Efficient Parameter-free Clustering Using First Neighbor Relations

M. Saquib Sarfraz et al.

2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) (2019)

Proceedings Paper Computer Science, Artificial Intelligence

Unsupervised learning of action classes with continuous temporal embedding

Anna Kukleva et al.

2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) (2019)

Proceedings Paper Computer Science, Artificial Intelligence

A Perceptual Prediction Framework for Self Supervised Event Segmentation

Sathyanarayanan N. Aakur et al.

2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019) (2019)

Article Computer Science, Artificial Intelligence

Egocentric Temporal Action Proposals

Shao Huang et al.

IEEE TRANSACTIONS ON IMAGE PROCESSING (2018)

Proceedings Paper Computer Science, Artificial Intelligence

Unsupervised Learning and Segmentation of Complex Activities from Video

Fadime Sener et al.

2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) (2018)

Article Computer Science, Information Systems

Efficient Unsupervised Temporal Segmentation of Motion Data

Bjoern Krueger et al.

IEEE TRANSACTIONS ON MULTIMEDIA (2017)

Article Computer Science, Artificial Intelligence

Organizing egocentric videos of daily living activities

Alessandro Ortis et al.

PATTERN RECOGNITION (2017)

Proceedings Paper Computer Science, Artificial Intelligence

Multi-Task Clustering of Human Actions by Sharing Information

Xiaoqiang Yan et al.

30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017) (2017)

Article Computer Science, Artificial Intelligence

Egocentric Daily Activity Recognition via Multitask Clustering

Yan Yan et al.

IEEE TRANSACTIONS ON IMAGE PROCESSING (2015)

Proceedings Paper Computer Science, Artificial Intelligence

Unsupervised Visual Representation Learning by Context Prediction

Carl Doersch et al.

2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV) (2015)

Review Computer Science, Artificial Intelligence

A Survey on Visual Content-Based Video Indexing and Retrieval

Weiming Hu et al.

IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART C-APPLICATIONS AND REVIEWS (2011)

Article Computer Science, Artificial Intelligence

Normalized cuts and image segmentation

JB Shi et al.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2000)