4.6 Article

Robust Data Association Against Detection Deficiency for Semantic SLAM

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TASE.2022.3233662

Keywords

Cameras; Simultaneous localization and mapping; Semantics; Trajectory; Feature extraction; Three-dimensional displays; Target tracking; Data association; multiple object tracking; semantic SLAM; local homography

This paper proposes a 2D motion inference method based on local projective warping consistency, along with an object association method, HOA, that integrates deep appearance features and semantic information. The proposed methods improve the accuracy and robustness of object association under detection deficiency.

Robust and accurate object association is essential for precise 3D object landmark inference in semantic Simultaneous Localization and Mapping (SLAM), yet it remains challenging because of detection deficiency caused by high missed-detection rates, false alarms, occlusion, and a limited field of view. The 2D location of an object is a crucial complementary cue to the appearance feature, especially when associating objects across frames under large viewpoint changes. However, methods based on motion models or trajectory patterns struggle to infer object motion reliably with a moving camera. In this paper, by exploiting local projective warping consistency, a local homography based 2D motion inference method is proposed to sequentially estimate the object location along with its uncertainty. By integrating the deep appearance feature and semantic information, an object association method named HOA, which is robust to detection deficiency, is proposed. Experimental evaluations suggest that the proposed motion prediction method is capable of maintaining a low cumulative error over a long duration, which enhances object association performance in both accuracy and robustness.

Note to Practitioners: This work aims to consistently associate 2D detection boxes corresponding to the same 3D object across images. In landmark-based navigation, collision avoidance, grasping, and manipulation tasks, objects in the task space are commonly simplified into 3D enveloping surfaces (e.g., cuboids or ellipsoids) by using 2D object detection boxes from multiple image views, and accurate data association is a prerequisite for precise enveloping surface reconstruction. This problem remains challenging given imperfect object detections, the appearance similarity of objects, and the unpredictable trajectory of the moving camera. This work proposes a long-term reliable 2D location prediction algorithm that is capable of handling the complex motion of the target. Along with the appearance feature extracted by a retrain-free deep learning based model, this work proposes an object association method that can simultaneously deal with multiple objects of unknown categories in the moving-camera scenario.
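
The abstract does not say how the local homography is estimated or how the location uncertainty is propagated. As a rough illustration of the general idea only, and not the authors' algorithm, the sketch below warps a 2D object location through a given 3x3 homography and propagates a 2x2 location covariance with a first-order (Jacobian) approximation. The function name `predict_location` and the example matrices are hypothetical, chosen for illustration.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Point = Tuple[float, float]


def predict_location(H: Matrix, point: Point, cov: Matrix) -> Tuple[Point, Matrix]:
    """Warp a 2D point through homography H and propagate its covariance.

    Generic first-order sketch (assumption): cov' = J * cov * J^T, where J is
    the Jacobian of the projective warp evaluated at the input point.
    """
    x, y = point
    # Homogeneous coordinates of the warped point.
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")

    warped = (u / w, v / w)

    # Jacobian of (u/w, v/w) with respect to (x, y), via the quotient rule.
    w2 = w * w
    J = [
        [(H[0][0] * w - u * H[2][0]) / w2, (H[0][1] * w - u * H[2][1]) / w2],
        [(H[1][0] * w - v * H[2][0]) / w2, (H[1][1] * w - v * H[2][1]) / w2],
    ]

    # JP = J * cov, then new_cov = JP * J^T (all 2x2).
    JP = [[sum(J[i][k] * cov[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    new_cov = [[sum(JP[i][k] * J[j][k] for k in range(2)) for j in range(2)]
               for i in range(2)]
    return warped, new_cov


# Hypothetical example: a homography that is a pure image-plane translation.
H_shift = [[1.0, 0.0, 5.0], [0.0, 1.0, -3.0], [0.0, 0.0, 1.0]]
location, uncertainty = predict_location(H_shift, (10.0, 20.0),
                                         [[4.0, 0.0], [0.0, 9.0]])
```

Applying such a step frame after frame gives a sequential location estimate whose covariance can be used to gate candidate detections before comparing appearance features. How the paper actually combines these cues is not described in the abstract.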
