☆ 4.7 Article

Category-Level 6-D Object Pose Estimation With Shape Deformation for Robotic Grasp Detection

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2023)

期刊

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS

卷 -, 期 -, 页码 -

出版社

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TNNLS.2023.3330011

关键词

Pose estimation; Feature extraction; Point cloud compression; Deformation; Shape; Transformers; Solid modeling; Category-level object pose estimation; robotic grasp; shape deformation; transformer

类别

Computer Science, Artificial Intelligence Computer Science, Hardware & Architecture Computer Science, Theory & Methods Engineering, Electrical & Electronic

向作者/读者索取更多资源

Protocol

社区支持

Reagent

社区支持

智能总结 New
摘要

This paper proposes a category-level object pose estimation network trained on synthetic data but capable of delivering good performance on real datasets. By introducing fusion and attention modules, the network enhances prediction accuracy. Through self-supervised learning and a small amount of real data supplementation, the method achieves high-precision pose estimation in real-world scenes.

Category-level 6-D object pose estimation plays a crucial role in achieving reliable robotic grasp detection. However, the disparity between synthetic and real datasets hinders the direct transfer of models trained on synthetic data to real-world scenarios, leading to ineffective results. Additionally, creating large-scale real datasets is a time-consuming and labor-intensive task. To overcome these challenges, we propose CatDeform, a novel category-level object pose estimation network trained on synthetic data but capable of delivering good performance on real datasets. In our approach, we introduce a transformer-based fusion module that enables the network to leverage multiple sources of information and enhance prediction accuracy through feature fusion. To ensure proper deformation of the prior point cloud to align with scene objects, we propose a transformer-based attention module that deforms the prior point cloud from both geometric and feature perspectives. Building upon CatDeform, we design a two-branch network for supervised learning, bridging the gap between synthetic and real datasets and achieving high-precision pose estimation in real-world scenes using predominantly synthetic data supplemented with a small amount of real data. To minimize reliance on large-scale real datasets, we train the network in a self-supervised manner by estimating object poses in real scenes based on the synthetic dataset without manual annotation. We conduct training and testing on CAMERA25 and REAL275 datasets, and our experimental results demonstrate that the proposed method outperforms state-of-the-art (SOTA) techniques in both self-supervised and supervised training paradigms. Finally, we apply CatDeform to object pose estimation and robotic grasp experiments in real-world scenarios, showcasing a higher grasp success rate.

Category-Level 6-D Object Pose Estimation With Shape Deformation for Robotic Grasp Detection

期刊

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS

出版社

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

关键词

类别

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

Category-Level 6-D Object Pose Estimation With Shape Deformation for Robotic Grasp Detection

期刊

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS

出版社

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

关键词

类别

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

导出引文

分享论文