3.8 Proceedings Paper

Fusing RGB and depth with Self-attention for Unseen Object Segmentation

Publisher

IEEE
DOI: 10.23919/ICCAS52745.2021.9649991

Keywords

Unseen Object Instance Segmentation; Self-attention; Synthetic Dataset; RGB-D Fusion

Funding

  1. Korea Institute for Advancement of Technology (KIAT) - Korea Government (MOTIE) [20008613]
  2. Korea Evaluation Institute of Industrial Technology (KEIT) [20008613]

Abstract

The study introduces a Synthetic RGB-D Fusion Mask R-CNN model for unseen object instance segmentation, built around a learnable spatial attention estimator, and reports state-of-the-art performance in unseen object segmentation.
We present a Synthetic RGB-D Fusion Mask R-CNN (SF Mask R-CNN) for unseen object instance segmentation. Our key idea is to fuse RGB and depth with a learnable spatial attention estimator, named Self-Attention-based Confidence map Estimator (SACE), at four scales on top of a category-agnostic instance segmentation model. We pre-trained SF Mask R-CNN on a large synthetic dataset and evaluated it on a public dataset, WISDOM, after fine-tuning on only a small amount of real-world data. Our experiments showed the state-of-the-art performance of SACE in unseen object segmentation. We also compared feature maps while varying the input modality and fusion method, and showed that SACE helps the network learn distinctive object-related features. The code, dataset, and models are available at https://github.com/gist-ailab/SF-Mask-RCNN.
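
The fusion idea can be made concrete with a short sketch. The following is a minimal, hypothetical PyTorch rendering of confidence-map-based RGB-D feature fusion at a single scale: an estimator predicts a two-channel, per-pixel softmax weighting over the RGB and depth feature maps, and the fused map is the weighted sum. A plain convolutional estimator stands in here for the paper's self-attention-based SACE; all module and parameter names are illustrative, and the authors' actual implementation lives in the linked repository.

```python
# Hypothetical sketch of confidence-map-based RGB-D feature fusion.
# Names are illustrative; see https://github.com/gist-ailab/SF-Mask-RCNN
# for the authors' actual SACE implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConfidenceFusion(nn.Module):
    """Fuse same-scale RGB and depth feature maps with a learned,
    per-pixel confidence map (softmax over the two modalities)."""

    def __init__(self, channels: int):
        super().__init__()
        # A small conv estimator stands in for the paper's self-attention
        # module; it maps concatenated features to a 2-channel confidence map.
        self.confidence = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 2, kernel_size=1),
        )

    def forward(self, f_rgb: torch.Tensor, f_depth: torch.Tensor) -> torch.Tensor:
        # w has shape (B, 2, H, W); the softmax makes the RGB and depth
        # weights sum to 1 at every pixel.
        w = F.softmax(self.confidence(torch.cat([f_rgb, f_depth], dim=1)), dim=1)
        return w[:, 0:1] * f_rgb + w[:, 1:2] * f_depth


# One such module would be applied independently at each of the four
# feature scales, e.g. on 256-channel FPN maps:
fuse = ConfidenceFusion(channels=256)
fused = fuse(torch.randn(1, 256, 64, 64), torch.randn(1, 256, 64, 64))
print(fused.shape)  # torch.Size([1, 256, 64, 64])
```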
