
Social Context-aware Person Search in Videos via Multi-modal Cues

ACM TRANSACTIONS ON INFORMATION SYSTEMS (2022)

Journal

ACM TRANSACTIONS ON INFORMATION SYSTEMS

Volume 40, Issue 3, Pages -

Publisher

ASSOC COMPUTING MACHINERY

DOI: 10.1145/3480967

Keywords

Person search; graph modeling; user profile; label propagation; social relation; neural network

Category

Computer Science, Information Systems

Funding

National Key Research and Development Program of China [2018YFB1402600]
National Natural Science Foundation of China [62072423]

Summary

This article proposes a social context-aware framework that fuses visual and social contexts to profile persons from more semantic perspectives and to improve person search in complex scenarios. The framework segments videos into independent scene units, abstracts the social contexts within them, constructs inner-personal links through a graph formulation operation, and performs relation-aware label propagation to identify characters' occurrences. Experiments on real-world datasets show that the proposed solution outperforms several competitive baselines.

Abstract

Person search has long been treated as a crucial and challenging task that supports deeper insight into personalized summarization and personality discovery. Traditional methods, e.g., person re-identification and face recognition, profile video characters from visual information alone. They are often limited to relatively fixed poses or small viewpoint variations, and they struggle in more realistic scenes with high motion complexity (e.g., movies). At the same time, long videos such as movies often have logical story lines and are composed of continuously developing plots. In this situation, different persons usually meet on specific occasions, in which informative social cues are exhibited. We notice that these social cues can semantically profile their personalities and benefit the person search task in two respects. First, persons with certain relationships usually co-occur within short intervals; if one of them is easier to identify, the social relation cues extracted from their co-occurrences can help identify the harder ones. Second, social relations can reveal the association between certain scenes and characters (e.g., a classmate relationship may exist only among students), which narrows the candidates down to persons with a specific relationship. In this way, high-level social relation cues can improve the effectiveness of person search.

Along this line, in this article we propose a social context-aware framework that fuses visual and social contexts to profile persons from more semantic perspectives and to better handle the person search task in complex scenarios. Specifically, we first segment videos into several independent scene units and abstract the social contexts within these scene units. Then, we construct inner-personal links through a graph formulation operation for each scene unit, in which both visual cues and relation cues are considered. Finally, we perform relation-aware label propagation to identify characters' occurrences, combining low-level semantic cues (i.e., visual cues) and high-level semantic cues (i.e., relation cues) to further enhance accuracy. Experiments on real-world datasets validate that our solution outperforms several competitive baselines.
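
The abstract does not spell out how the graph is built or how labels are propagated, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes that edge weights within one scene unit are a linear blend of a visual-similarity matrix and a relation-affinity matrix, and it uses standard clamped label propagation. The names `combine_affinities`, `propagate_labels`, and `beta`, as well as the toy numbers, are hypothetical.

```python
def combine_affinities(visual, relation, beta=0.5):
    """Blend two symmetric affinity matrices into edge weights.

    Assumption: a simple convex combination; beta=0 uses visual cues only.
    Self-loops are dropped so a node never votes for itself.
    """
    n = len(visual)
    return [
        [0.0 if i == j else (1 - beta) * visual[i][j] + beta * relation[i][j]
         for j in range(n)]
        for i in range(n)
    ]


def propagate_labels(weights, seeds, num_classes, iterations=50):
    """Clamped label propagation; seeds maps node index -> known identity."""
    n = len(weights)
    # Unlabeled nodes start from uniform scores, seeded nodes from one-hot.
    scores = [[1.0 / num_classes] * num_classes for _ in range(n)]
    for node, label in seeds.items():
        scores[node] = [1.0 if c == label else 0.0 for c in range(num_classes)]

    for _ in range(iterations):
        updated = []
        for i in range(n):
            total = sum(weights[i])
            if i in seeds or total == 0:
                updated.append(scores[i])  # keep seeds and isolated nodes fixed
                continue
            # New score = weighted average of the neighbours' current scores.
            updated.append([
                sum(weights[i][j] * scores[j][c] for j in range(n)) / total
                for c in range(num_classes)
            ])
        scores = updated
    return [max(range(num_classes), key=lambda c: row[c]) for row in scores]


# Toy scene unit with four person occurrences (hypothetical numbers).
# Nodes 0 and 3 are easy to identify; node 1 is visually ambiguous.
visual = [
    [0.0, 0.5, 0.1, 0.0],
    [0.5, 0.0, 0.1, 0.5],
    [0.1, 0.1, 0.0, 0.9],
    [0.0, 0.5, 0.9, 0.0],
]
# A strong relation cue links node 1 to node 0 (they co-occur).
relation = [
    [0.0, 0.9, 0.0, 0.0],
    [0.9, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
seeds = {0: 0, 3: 1}  # node index -> character identity

visual_only = propagate_labels(combine_affinities(visual, relation, beta=0.0), seeds, 2)
with_relation = propagate_labels(combine_affinities(visual, relation, beta=0.5), seeds, 2)
print(visual_only, with_relation)
```

In this toy example the visually ambiguous node 1 is assigned to character 1 when only visual cues are used, and to character 0 once the relation cue is blended in, which mirrors the abstract's first argument that an easily identified person can help identify a harder one through their co-occurrence.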
