Article

CR-Net: Robot grasping detection method integrating convolutional block attention module and residual module

Journal

IET COMPUTER VISION
Volume -, Issue -, Pages -

Publisher

WILEY
DOI: 10.1049/cvi2.12252

Keywords

computer vision; convolution; convolutional neural nets

In this paper, a novel lightweight grasping detection model is proposed that addresses the challenges of grasping detection by incorporating attention mechanisms and residual modules. The model achieves detection accuracy of 98.44% on the image-wise split and 96.88% on the object-wise split of the Cornell Grasp dataset, demonstrating strong performance on this task.
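The attention mechanism referred to here is the Convolutional Block Attention Module (CBAM). As a point of reference, the following is a minimal PyTorch sketch of a standard CBAM, with channel attention followed by spatial attention as in Woo et al. (2018); the reduction ratio of 16 and the 7x7 spatial kernel are common defaults, not settings reported by this paper.

import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Shared MLP applied to both pooled descriptors.
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))   # global average pooling
        mx = self.mlp(x.amax(dim=(2, 3)))    # global max pooling
        scale = torch.sigmoid(avg + mx).view(b, c, 1, 1)
        return x * scale

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)    # channel-wise average map
        mx = x.amax(dim=1, keepdim=True)     # channel-wise max map
        scale = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * scale

class CBAM(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.channel = ChannelAttention(channels)
        self.spatial = SpatialAttention()

    def forward(self, x):
        # Refine features along the channel axis first, then spatially.
        return self.spatial(self.channel(x))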
Grasping detection, which involves identifying and assessing the graspability of objects by robotic systems, has garnered significant attention in recent years owing to its pivotal role in robotic systems and automated assembly. Despite notable advances, current methods still face practical and theoretical challenges that limit their real-world applicability: low detection accuracy, the burden of oversized model parameters, and the inherent complexity of real-world scenarios, in which object shape, size, material, and environmental conditions vary widely. In response to these multifaceted challenges, a novel lightweight grasping detection model is introduced. The proposed model employs a Convolutional Block Attention Module (CBAM) to extract features from the RGB and depth channels separately, recognising the multifaceted nature of object properties, and a feature fusion module then combines these complementary features, addressing the challenge of information integration. The fused features pass through five residual blocks followed by a second CBAM, and three transposed convolution layers then generate three output images representing grasp quality, grasp angle, and grasp width; together, these images yield the final grasp detection result. Rigorous training and evaluation on the Cornell Grasp dataset demonstrate detection accuracy of 98.44% on the image-wise split and 96.88% on the object-wise split, corroborating the model's performance. The residual modules enable rapid training, while the attention modules facilitate precise feature extraction, ultimately striking an effective balance between detection time and accuracy.
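To make the pipeline described above concrete, here is a hedged sketch of the overall architecture, reusing the CBAM class from the sketch above. The class name CRNetSketch, the channel width of 64, the concatenate-then-1x1-convolution fusion rule, and all stride and padding choices are illustrative assumptions for a self-contained example, not the paper's reported configuration.

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Standard two-convolution residual block with an identity skip connection."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return torch.relu(x + self.body(x))

class CRNetSketch(nn.Module):
    def __init__(self, width: int = 64):
        super().__init__()
        # Per-modality stems: CBAM-augmented feature extraction for RGB and depth.
        self.rgb_stem = nn.Sequential(
            nn.Conv2d(3, width, 3, padding=1), nn.ReLU(inplace=True), CBAM(width))
        self.depth_stem = nn.Sequential(
            nn.Conv2d(1, width, 3, padding=1), nn.ReLU(inplace=True), CBAM(width))
        # Feature fusion: concatenate both modalities, then mix with a 1x1 convolution.
        self.fuse = nn.Conv2d(2 * width, width, 1)
        # Five residual blocks followed by a second CBAM, as in the abstract.
        self.trunk = nn.Sequential(
            *[ResidualBlock(width) for _ in range(5)], CBAM(width))
        # Three transposed-convolution heads: grasp quality, angle, and width maps.
        self.quality_head = nn.ConvTranspose2d(width, 1, 3, padding=1)
        self.angle_head = nn.ConvTranspose2d(width, 1, 3, padding=1)
        self.width_head = nn.ConvTranspose2d(width, 1, 3, padding=1)

    def forward(self, rgb, depth):
        f = self.fuse(torch.cat([self.rgb_stem(rgb), self.depth_stem(depth)], dim=1))
        f = self.trunk(f)
        return self.quality_head(f), self.angle_head(f), self.width_head(f)

# Usage: an RGB-D sample split into its RGB and depth parts.
model = CRNetSketch()
rgb, depth = torch.randn(1, 3, 224, 224), torch.randn(1, 1, 224, 224)
quality, angle, width = model(rgb, depth)   # each map: (1, 1, 224, 224)

One design note: grasp-detection networks in this family often encode the angle as a sine/cosine pair to avoid the discontinuity at the wrap-around point; the single-channel angle head here is a simplification for brevity.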
