☆ 3.8 Proceedings Paper

DCT-Mask: Discrete Cosine Transform Mask Representation for Instance Segmentation

2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 (2021)

Journal

2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021

Volume -, Issue -, Pages 8716-8725

Publisher

IEEE COMPUTER SOC

DOI: 10.1109/CVPR46437.2021.00861

Categories

Computer Science, Artificial Intelligence; Imaging Science & Photographic Technology

Funding

Alibaba Innovative Research (AIR) Program
Major Scientific Research Project of Zhejiang Lab [2019DB0ZX01]

AI Summary

The paper introduces DCT-Mask, a method for encoding high-resolution binary grid masks with the discrete cosine transform (DCT). The method yields significant gains across frameworks, backbones, datasets, and training schedules, with minimal impact on running speed. Its success comes from achieving a high-quality mask representation at low complexity.

Abstract

Binary grid mask representation is broadly used in instance segmentation. A representative instantiation is Mask R-CNN, which predicts masks on a 28 × 28 binary grid. Generally, a low-resolution grid is not sufficient to capture the details, while a high-resolution grid dramatically increases the training complexity. In this paper, we propose a new mask representation by applying the discrete cosine transform (DCT) to encode the high-resolution binary grid mask into a compact vector. Our method, termed DCT-Mask, can be easily integrated into most pixel-based instance segmentation methods. Without any bells and whistles, DCT-Mask yields significant gains on different frameworks, backbones, datasets, and training schedules. It does not require any pre-processing or pre-training, and has almost no impact on running speed. The improvement is especially large for higher-quality annotations and more complex backbones. Moreover, we analyze the performance of our method from the perspective of the quality of mask representation. The main reason why DCT-Mask works well is that it obtains a high-quality mask representation with low complexity.
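The encoding idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the 128 × 128 grid size, the 300-element vector length, the diagonal low-frequency-first ordering, the 0.5 decoding threshold, and all function names (`dct_matrix`, `low_freq_order`, `encode_mask`, `decode_mask`) are assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis as an (n, n) matrix (rows are frequencies)."""
    k = np.arange(n)[:, None]  # frequency index
    x = np.arange(n)[None, :]  # sample index
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c


def low_freq_order(n: int) -> np.ndarray:
    """Flat indices of an (n, n) grid sorted by diagonal i + j, lowest first.

    Assumption: a simple diagonal ordering stands in for a true zigzag scan.
    """
    i, j = np.indices((n, n))
    # lexsort uses the LAST key as the primary key: sort by i + j, then by i.
    return np.lexsort((i.ravel(), (i + j).ravel()))


def encode_mask(mask: np.ndarray, dim: int) -> np.ndarray:
    """Encode a square binary mask as its first `dim` low-frequency DCT coefficients."""
    n = mask.shape[0]
    c = dct_matrix(n)
    coeffs = c @ mask.astype(np.float64) @ c.T  # separable 2D DCT-II
    return coeffs.ravel()[low_freq_order(n)[:dim]]


def decode_mask(vec: np.ndarray, n: int) -> np.ndarray:
    """Zero-fill the missing coefficients, invert the DCT, and threshold at 0.5."""
    coeffs = np.zeros(n * n)
    coeffs[low_freq_order(n)[: vec.size]] = vec
    c = dct_matrix(n)
    recon = c.T @ coeffs.reshape(n, n) @ c  # inverse of the orthonormal 2D DCT
    return recon > 0.5


# Hypothetical example: a filled disk on an assumed 128 x 128 grid.
n, dim = 128, 300
yy, xx = np.indices((n, n))
mask = (yy - 64) ** 2 + (xx - 64) ** 2 <= 40 ** 2

vec = encode_mask(mask, dim)   # compact vector of length `dim`
recon = decode_mask(vec, n)    # binary mask recovered from the vector
iou = (mask & recon).sum() / (mask | recon).sum()
print(f"vector length: {vec.size}, reconstruction IoU: {iou:.3f}")
```

In this sketch the compact vector replaces the 16,384 grid values with 300 numbers, which is the sense in which a high-resolution mask can be represented with low complexity. A network would regress such a vector instead of classifying each grid cell; that part is outside the sketch.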
