4.7 Article

DRNet: Double Recalibration Network for Few-Shot Semantic Segmentation

期刊

IEEE TRANSACTIONS ON IMAGE PROCESSING
卷 31, 期 -, 页码 6733-6746

出版社

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TIP.2022.3215905

关键词

Few-shot learning; semantic segmentation; recalibration; weakly-supervised learning

资金

  1. National Natural Science Foundation of China [61972036]
  2. Beijing Academy of Artificial Intelligence (BAAI)

向作者/读者索取更多资源

DRNet proposes a Double Recalibration Network with two recalibration modules to enhance model robustness against intra-class variance. The method can more accurately mine target regions for query images.
Few-shot segmentation aims at learning to segment query images guided by only a few annotated images from the support set. Previous methods rely on mining the feature embedding similarity across the query and the support images to achieve successful segmentation. However, these models tend to perform badly in cases where the query instances have a large variance from the support ones. To enhance model robustness against such intra-class variance, we propose a Double Recalibration Network (DRNet) with two recalibration modules, i.e., the Self-adapted Recalibration (SR) module and the Cross-attended Recalibration (CR) module. In particular, beyond learning robust feature embedding for pixel-wise comparison between support and query as in conventional methods, the DRNet further exploits semantic-aware knowledge embedded in the query image to help segment itself, which we call 'self-adapted recalibration'. More specifically, DRNet first employs guidance from the support set to roughly predict an incomplete but correct initial object region for the query image, and then reversely uses the feature embedding extracted from the incomplete object region to segment the query image. Also, we devise a CR module to refine the feature representation of the query image by propagating the underlying knowledge embedded in the support image's foreground to the query. Instead of foreground global pooling, we refine the response at each pixel in the query feature map by attending to all foreground pixels in the support feature map and taking the weighted average by their similarity; meanwhile, feature maps of the query image are also added back to weighted feature maps as a residual connection. Our DRNet can effectively address the intra-class variance under the few-shot setting with such two recalibration modules, and mine more accurate target regions for query images. We conduct extensive experiments on the popular benchmarks PASCAL-5(i) and COCO-20(i). The DRNet with the best configuration achieves the mIoU of 63.6% and 64.9% on PASCAL-5(i) and 44.7% and 49.6% on COCO-20(i) for 1-shot and 5-shot settings respectively, significantly outperforming the state-of-the-arts without any bells and whistles. Code is available at: https://github.com/fangzy97/drnet.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.7
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据