Journal
2021 IEEE/CVF International Conference on Computer Vision (ICCV 2021)
Pages 7157-7168
Publisher
IEEE
DOI: 10.1109/ICCV48922.2021.00709
Funding
- Google-DeepMind Studentship
- Royal Society Research Professorship
- UK EPSRC Programme Grant Visual AI [EP/T028572/1]
- UK EPSRC CDT in AIMS
- Schlumberger Studentship
This paper introduces a simple variant of the Transformer to segment optical flow frames into primary objects and background in a self-supervised manner. Despite using only optical flow as input, the approach achieves superior results compared to previous state-of-the-art self-supervised methods, and is significantly faster.
Animals have evolved highly functional visual systems to understand motion, which assists perception even in complex environments. In this paper, we work towards a computer vision system able to segment objects by exploiting motion cues, i.e. motion segmentation. To achieve this, we introduce a simple variant of the Transformer that segments optical flow frames into primary objects and the background, and can be trained in a self-supervised manner, i.e. without any manual annotations. Despite using only optical flow as input, with no appearance information, our approach achieves superior results compared to previous state-of-the-art self-supervised methods on public benchmarks (DAVIS2016, SegTrackv2, FBMS59), while being an order of magnitude faster. On a challenging camouflage dataset (MoCA), we significantly outperform other self-supervised approaches and are competitive with the top supervised approach, highlighting the importance of motion cues and a potential bias towards appearance in existing video segmentation models.
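The core idea of grouping pixels by motion can be illustrated without the learned Transformer: pixels whose flow vectors are similar are softly assigned to a small number of "slots" (here, object vs. background). The sketch below is a hand-rolled soft-clustering analogue of that grouping, not the paper's architecture; all function names and parameters (`motion_grouping`, `num_slots`, `tau`) are illustrative assumptions.

```python
import numpy as np

def motion_grouping(flow, num_slots=2, iters=5, tau=0.5):
    """Soft-clustering sketch of motion grouping.

    flow: (N, 2) array of per-pixel optical-flow vectors.
    Returns (N, num_slots) soft assignments of pixels to slots.
    This is a soft k-means stand-in for the paper's learned
    attention mechanism, for illustration only.
    """
    # Initialise slot centroids at spread-out quantiles of the flow field.
    slots = np.quantile(flow, np.linspace(0.0, 1.0, num_slots), axis=0)
    for _ in range(iters):
        # Attention: each pixel attends to slots whose motion it matches.
        d2 = ((flow[:, None, :] - slots[None, :, :]) ** 2).sum(-1)
        attn = np.exp(-d2 / tau)
        attn /= attn.sum(axis=1, keepdims=True)            # (N, K)
        # Update each slot as the attention-weighted mean flow.
        w = attn / (attn.sum(axis=0, keepdims=True) + 1e-8)
        slots = w.T @ flow
    return attn

# Toy frame: background moves left, a foreground object moves right.
bg = np.tile([-1.0, 0.0], (50, 1))
fg = np.tile([+1.0, 0.0], (10, 1))
masks = motion_grouping(np.vstack([bg, fg]))
labels = masks.argmax(axis=1)  # per-pixel slot index
```

On this toy input the two motion patterns separate cleanly into distinct slots, mirroring the object/background split the paper obtains from flow alone.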