4.7 Article

Domain-Invariant Similarity Activation Map Contrastive Learning for Retrieval-Based Long-Term Visual Localization

Journal

IEEE-CAA JOURNAL OF AUTOMATICA SINICA
Volume 9, Issue 2, Pages 313-328

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/JAS.2021.1003907

Keywords

Deep representation learning; place recognition; visual localization

Ask authors/readers for more resources

This paper proposes a coarse-to-fine localization method based on image retrieval, which uses multi-domain image translation and gradient-weighted similarity activation mapping loss to extract domain-invariant features and improve localization accuracy. Experiments demonstrate the effectiveness and strong generalization ability of the proposed method in challenging environments.
Visual localization is a crucial component in the application of mobile robot and autonomous driving. Image retrieval is an efficient and effective technique in image-based localization methods. Due to the drastic variability of environmental conditions, e.g., illumination changes, retrieval-based visual localization is severely affected and becomes a challenging problem. In this work, a general architecture is first formulated probabilistically to extract domain-invariant features through multi-domain image translation. Then, a novel gradient-weighted similarity activation mapping loss (Grad-SAM) is incorporated for finer localization with high accuracy. We also propose a new adaptive triplet loss to boost the contrastive learning of the embedding in a self-supervised manner. The final coarse-to-fine image retrieval pipeline is implemented as the sequential combination of models with and without Grad-SAM loss. Extensive experiments have been conducted to validate the effectiveness of the proposed approach on the CMU-Seasons dataset. The strong generalization ability of our approach is verified with the RobotCar dataset using models pre-trained on urban parts of the CMU-Seasons dataset. Our performance is on par with or even outperforms the state-of-the-art image-based localization baselines in medium or high precision, especially under challenging environments with illumination variance, vegetation, and night-time images. Moreover, real-site experiments have been conducted to validate the efficiency and effectiveness of the coarse-to-fine strategy for localization.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.7
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available