Article

Mask encoding: A general instance mask representation for object segmentation

PATTERN RECOGNITION (2022)

Journal

PATTERN RECOGNITION

Volume 124

Publisher

ELSEVIER SCI LTD

DOI: 10.1016/j.patcog.2021.108505

Keywords

Mask encoding; Instance segmentation; Video instance segmentation

Categories

Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic

Funding

National Natural Science Foundation of China [62073244]
Shanghai Innovation Action Plan [20511100500, 20511105802]
Innovation Program of Shanghai Municipal Education Commission [202101070007E00098]

Summary

Instance segmentation is a challenging computer vision task that requires separating each instance at the pixel level. The currently dominant representation for instance masks is a low-resolution binary mask. This work proposes an effective approach that encodes high-resolution structured masks into a compact representation combining high quality with low complexity. The proposed method can be easily integrated into existing pipelines and improves mask average precision (AP) on several datasets.

Abstract

Instance segmentation is one of the most challenging tasks in computer vision, as it requires separating each instance at the pixel level. To date, a low-resolution binary mask is the dominant paradigm for representing instance masks. For example, the predicted mask in Mask R-CNN is usually 28 × 28. Generally, a low-resolution mask cannot capture object details well, while a high-resolution mask dramatically increases the training complexity. In this work, we propose a flexible and effective approach to encode the high-resolution structured mask into a compact representation that offers both high quality and low complexity. The proposed mask representation can be easily integrated into two-stage pipelines such as Mask R-CNN, improving mask AP by 0.9% on the COCO dataset, 1.4% on the LVIS dataset, and 2.1% on the Cityscapes dataset. Moreover, a novel single-shot instance segmentation framework can be constructed by extending an existing one-stage detector with a mask branch for this instance representation. Our model outperforms explicit contour-based pipelines in accuracy at similar computational complexity. We also evaluate our method on video instance segmentation, achieving promising results on the YouTube-VIS dataset. Code is available at: https://git.io/AdelaiDet © 2021 Elsevier Ltd. All rights reserved.
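The abstract does not say how the compact representation is computed. As a rough illustration of the general idea only, the sketch below assumes a linear, PCA-style encoder and decoder fitted on flattened binary masks. The synthetic disk masks, the 64 × 64 resolution, the code length of 60, and all function names are hypothetical choices made for this example, not details taken from the paper.

```python
import numpy as np

def make_disk_masks(n, size, seed):
    """Generate n synthetic binary masks (filled disks); stand-in data only."""
    rng = np.random.default_rng(seed)
    ys, xs = np.mgrid[0:size, 0:size]
    masks = []
    for _ in range(n):
        cy, cx = rng.uniform(size * 0.3, size * 0.7, 2)
        r = rng.uniform(size * 0.15, size * 0.3)
        masks.append(((ys - cy) ** 2 + (xs - cx) ** 2 <= r ** 2).astype(np.float64))
    return np.stack(masks)

def fit_encoder(masks, dim):
    """Fit a linear basis on flattened masks (assumed PCA-style encoding)."""
    flat = masks.reshape(len(masks), -1)
    mean = flat.mean(axis=0)
    # Rows of vt are orthonormal directions ordered by decreasing variance.
    _, _, vt = np.linalg.svd(flat - mean, full_matrices=False)
    return mean, vt[:dim]

def encode(masks, mean, basis):
    """Project each flattened mask onto the basis to get a compact code."""
    return (masks.reshape(len(masks), -1) - mean) @ basis.T

def decode(codes, mean, basis, size):
    """Map codes back to mask space and threshold to a binary mask."""
    recon = codes @ basis + mean
    return (recon.reshape(-1, size, size) > 0.5).astype(np.float64)

def mean_iou(a, b):
    """Average intersection-over-union between two stacks of binary masks."""
    inter = np.logical_and(a > 0, b > 0).sum(axis=(1, 2))
    union = np.logical_or(a > 0, b > 0).sum(axis=(1, 2))
    return float((inter / np.maximum(union, 1)).mean())

size, dim = 64, 60  # hypothetical resolution and code length
train = make_disk_masks(500, size, seed=0)
held_out = make_disk_masks(100, size, seed=1)

mean, basis = fit_encoder(train, dim)
codes = encode(held_out, mean, basis)      # 4096 pixels -> 60 numbers per mask
recon = decode(codes, mean, basis, size)   # back to 64 x 64 binary masks
print(codes.shape, round(mean_iou(held_out, recon), 3))
```

Under these assumptions, a network head would regress the short code vector instead of a dense pixel grid, and the fixed decoder would recover the full-resolution mask at inference time, which is one way a representation could keep high-resolution detail while limiting the size of the prediction target.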
