4.7 Article

The Right Spin: Learning Object Motion from Rotation-Compensated Flow Fields

Publisher

SPRINGER
DOI: 10.1007/s11263-023-01859-x

Keywords

Motion segmentation; Video segmentation; Optical flow; Camera motion estimation

A good understanding of geometry and familiarity with objects contribute to the reliable perception of moving objects. Human vision tightly couples cognitive processes and body design to solve this problem, whereas a common computer vision approach asks a single deep network to model every effect. The coupling of camera rotation and translation creates complex motion fields that are difficult for a deep network to untangle directly. This study presents a probabilistic model that estimates camera rotation from the motion field; the flow field is then rectified to remove the rotational component before segmentation, which improves results on the DAVIS and MoCA benchmarks.
A good understanding of geometrical concepts as well as a broad familiarity with objects lead to excellent human perception of moving objects. The human ability to detect and segment moving objects works in the presence of multiple objects, complex background geometry, motion of the observer and even camouflage. How we perceive moving objects so reliably is a longstanding research question in computer vision and borrows findings from related areas such as psychology, cognitive science and physics. One approach to the problem is to teach a deep network to model all of these effects. This is in contrast with the strategy used by human vision, where cognitive processes and body design are tightly coupled and each is responsible for certain aspects of correctly identifying moving objects. Similarly, from the computer vision perspective there is evidence that classical, geometry-based techniques are better suited to the motion-based parts of the problem, while deep networks are more suitable for modeling appearance. In this work, we argue that the coupling of camera rotation and camera translation can create complex motion fields that are difficult for a deep network to untangle directly. We present a novel probabilistic model to estimate the camera's rotation given the motion field. We then rectify the flow field to obtain a rotation-compensated motion field for subsequent segmentation. This strategy of first estimating camera motion, and then allowing a network to learn the remaining parts of the problem, yields improved results on the widely used DAVIS benchmark as well as the more recent motion segmentation data set MoCA (Moving Camouflaged Animals).
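
The rectification step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the camera rotation `omega` is already known (the paper's probabilistic estimator is not reproduced here), uses the classical small-rotation flow equations for a pinhole camera, and assumes the principal point sits at the image centre. The function names, the `focal` parameter, and the sign convention are all assumptions made for illustration.

```python
import numpy as np


def rotational_flow(height, width, focal, omega):
    """Flow (in pixels) induced by a small camera rotation omega = (wx, wy, wz).

    Assumes a pinhole camera with focal length `focal` in pixels and the
    principal point at the image centre. Rotational flow does not depend
    on scene depth, which is why it can be removed without a depth map.
    """
    wx, wy, wz = omega
    ys, xs = np.mgrid[0:height, 0:width].astype(np.float64)
    x = xs - (width - 1) / 2.0   # horizontal offset from the image centre
    y = ys - (height - 1) / 2.0  # vertical offset from the image centre
    u = wx * x * y / focal - wy * (focal + x**2 / focal) + wz * y
    v = wx * (focal + y**2 / focal) - wy * x * y / focal - wz * x
    return np.stack([u, v], axis=-1)


def compensate_rotation(flow, focal, omega):
    """Subtract the rotational component from an (H, W, 2) flow field."""
    height, width = flow.shape[:2]
    return flow - rotational_flow(height, width, focal, omega)


# Synthetic check: a constant residual flow plus a known rotation.
omega = (0.01, -0.02, 0.005)
residual = np.zeros((4, 6, 2))
residual[..., 0] = 1.5
observed = residual + rotational_flow(4, 6, 500.0, omega)
rectified = compensate_rotation(observed, 500.0, omega)
```

In this synthetic case `rectified` recovers `residual`, the part of the flow that is left for a segmentation network to interpret once the rotation has been removed.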
