Article

A Visual and VAE Based Hierarchical Indoor Localization Method

Journal

SENSORS
Volume 21, Issue 10, Pages -

Publisher

MDPI
DOI: 10.3390/s21103406

Keywords

indoor localization; computer vision (CV); variational autoencoder (VAE)

Funding

  1. National Natural Science Foundation of China [61873274]

Abstract

In this paper, an unsupervised hierarchical indoor localization framework is proposed that combines an unsupervised variational autoencoder (VAE) network with a visual Structure-from-Motion (SfM) approach to extract global and local features for precise image localization and pose estimation. By using global features for image retrieval at the scene-map level and subsequently estimating the pose from 2D-3D matches, the proposed method achieves promising results in both accuracy and efficiency.
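The paper's exact network design is not reproduced here; as a rough illustrative sketch of the idea of using an unsupervised VAE latent code as a global image descriptor for retrieval, the following PyTorch snippet is one possible realization. The layer sizes, 128-dimensional latent space, and the choice of the latent mean as the descriptor are assumptions, not the authors' architecture.

import torch
import torch.nn as nn

class VAEEncoder(nn.Module):
    """Convolutional VAE encoder; the latent mean serves as a global descriptor."""
    def __init__(self, latent_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),   # resolution-independent 128x4x4 feature map
            nn.Flatten(),
        )
        self.fc_mu = nn.Linear(128 * 4 * 4, latent_dim)
        self.fc_logvar = nn.Linear(128 * 4 * 4, latent_dim)

    def forward(self, x):
        h = self.conv(x)
        return self.fc_mu(h), self.fc_logvar(h)

def global_descriptor(encoder, image):
    """Return an L2-normalized latent mean to use as a retrieval descriptor."""
    with torch.no_grad():
        mu, _ = encoder(image.unsqueeze(0))   # add batch dimension
    return torch.nn.functional.normalize(mu, dim=1).squeeze(0)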
Precise localization and pose estimation in indoor environments are required by a wide range of applications, including robotics, augmented reality, and navigation and positioning services. Such tasks can be addressed by visual localization against a pre-built 3D model. The enlarged search space associated with large scenes can be handled by first retrieving candidate images and subsequently estimating the pose. The majority of current deep learning-based image retrieval methods require labeled data, which increases annotation costs and complicates data acquisition. In this paper, we propose an unsupervised hierarchical indoor localization framework that integrates an unsupervised variational autoencoder (VAE) network with a visual Structure-from-Motion (SfM) approach to extract global and local features. During localization, global features are used for image retrieval at the scene-map level to obtain candidate images, and the pose is then estimated from 2D-3D matches between the query and candidate images. Only RGB images are used as input to the proposed localization system, which is both convenient and challenging. Experimental results reveal that the proposed method localizes images within 0.16 m and 4 degrees on the 7-Scenes datasets, and localizes 32.8% of images within 5 m and 20 degrees on the Baidu dataset. Furthermore, the proposed method achieves higher precision than advanced methods.
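To make the two-stage pipeline described above concrete, the sketch below shows candidate retrieval by cosine similarity over global descriptors, followed by 6-DoF pose estimation from 2D-3D correspondences with PnP and RANSAC in OpenCV. The matching of local features against the SfM model is abstracted away, and all function names, thresholds, and the choice of solver are illustrative assumptions rather than the paper's implementation.

import numpy as np
import cv2

def retrieve_candidates(query_desc, db_descs, k=5):
    """Indices of the k most similar database images (descriptors L2-normalized)."""
    sims = db_descs @ query_desc          # cosine similarity for unit-norm vectors
    return np.argsort(-sims)[:k]

def estimate_pose(points_2d, points_3d, K):
    """Solve PnP with RANSAC from query 2D keypoints and their 3D SfM points."""
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        np.asarray(points_3d, dtype=np.float64),
        np.asarray(points_2d, dtype=np.float64),
        K, None, reprojectionError=8.0, iterationsCount=1000)
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)            # rotation vector -> rotation matrix
    return R, tvec                        # camera pose relative to the scene map

In this sketch, only the query's global descriptor and the precomputed descriptors of the mapped images are needed for retrieval, so the expensive 2D-3D matching and PnP step is run only against the few retrieved candidates, which is the efficiency argument made in the abstract.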
