Article

Paying Attention to Video Object Pattern Understanding

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TPAMI.2020.2966453

Keywords

Visualization; Object segmentation; Motion segmentation; Task analysis; Annotations; Biological system modeling; Image segmentation; Video object pattern understanding; unsupervised video object segmentation; top-down visual attention; video salient object detection

Funding

  1. Beijing Natural Science Foundation [4182056]
  2. CCF-Tencent Open Fund
  3. Tencent AI Lab Rhino-Bird Focused Research Program
  4. Zhijiang Lab's International Talent Fund for Young Professionals
  5. Joint Building Program of Beijing Municipal Education Commission
  6. Yahoo Faculty Research and Engagement Program Award
  7. Amazon AWS Machine Learning Research Award

Abstract

This paper conducts a systematic study on the role of visual attention in video object pattern understanding. By elaborately annotating three popular video segmentation datasets (including DAVIS) with dynamic eye-tracking data in the unsupervised video object segmentation (UVOS) setting, we quantitatively verified, for the first time, the high consistency of visual attention behavior among human observers, and found a strong correlation between human attention and explicit primary object judgments during dynamic, task-driven viewing. These novel observations provide in-depth insight into the underlying rationale behind video object patterns. Inspired by these findings, we decouple UVOS into two sub-tasks: UVOS-driven Dynamic Visual Attention Prediction (DVAP) in the spatiotemporal domain, and Attention-Guided Object Segmentation (AGOS) in the spatial domain. Our UVOS solution enjoys three major advantages: 1) modular training without expensive video segmentation annotations; instead, more affordable dynamic fixation data are used to train the initial video attention module, and existing fixation-segmentation paired static/image data are used to train the subsequent segmentation module; 2) comprehensive foreground understanding through multi-source learning; and 3) additional interpretability from the biologically inspired and assessable attention. Experiments on four popular benchmarks show that, even without expensive video object mask annotations, our model achieves compelling performance compared with state-of-the-art methods and enjoys a fast processing speed (10 fps on a single GPU). Our collected eye-tracking data and algorithm implementations have been made publicly available at https://github.com/wenguanwang/AGS.
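The abstract's decoupling of UVOS into an attention-prediction stage (DVAP) followed by an attention-guided segmentation stage (AGOS) can be sketched in miniature. The following is a hedged, dependency-free illustration of that two-stage pipeline only; the function names, the intensity-based attention proxy, and the thresholding rule are all toy assumptions for exposition, not the paper's actual networks.

```python
# Minimal sketch of the two-stage decomposition described in the abstract.
# Frames are represented as small 2-D lists of pixel intensities; the real
# DVAP/AGOS modules are deep networks trained on fixation and
# fixation-segmentation data, respectively (hypothetical stand-ins below).

def dvap(frames):
    """Stand-in for the video attention module: one attention map per frame.

    Toy proxy: attention = per-frame max-normalized intensity. The actual
    model predicts spatiotemporal attention from dynamic fixation data.
    """
    maps = []
    for frame in frames:
        peak = max(max(row) for row in frame) or 1
        maps.append([[v / peak for v in row] for row in frame])
    return maps

def agos(frame, attention, thresh=0.5):
    """Stand-in for attention-guided segmentation: threshold the attention map
    into a binary object mask (the real module is a trained segmenter)."""
    return [[1 if a >= thresh else 0 for a in row] for row in attention]

def uvos(frames):
    """Decoupled pipeline: predict attention first, then segment under its
    guidance -- mirroring the DVAP -> AGOS split."""
    return [agos(f, a) for f, a in zip(frames, dvap(frames))]

video = [[[0, 8], [2, 1]],   # two tiny 2x2 "frames"
         [[1, 9], [0, 0]]]
masks = uvos(video)          # one binary mask per frame
```

The point of the sketch is the modularity the abstract claims: `dvap` and `agos` can be trained (or here, swapped out) independently, so no per-frame video object masks are needed to supervise the full pipeline.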

