期刊
出版社
IEEE
DOI: 10.1109/SIBGRAPI.2018.00020
关键词
-
资金
- Samsung Eletronica da Amazonia Ltda. [8.248/91]
Video understanding is the next frontier of computer vision, in which activity recognition plays a major role. Despite the recent improvements in holistic activity recognition, further researching part-based models such as context may allow us to better understand what is important for activities and thus improve our current activity recognition models. This work tackles contextual cues obtained from object detections, in which we posit that objects relevant to an action are related to its spatial arrangement regarding an agent. Based on that, we propose Egocentric Pyramid to encode such spatial relationships. We further extend it by proposing a data-centric approach named Temporal Segment Relational Network (TSRN). Our experiments give support to the hypothesis that object spatiality provides an important clue to activity recognition. In addition, our data-centric approach shows that besides such spatial features, there may be other important information that further enhances the object-based activity recognition, such as co-occurrence, relative size, and temporal information.
作者
我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。
推荐
暂无数据