Article

iLabel: Revealing Objects in Neural Fields

Journal

IEEE Robotics and Automation Letters
Volume 8, Issue 2, Pages 832-839

Publisher

IEEE (Institute of Electrical and Electronics Engineers)
DOI: 10.1109/LRA.2022.3231498

Keywords

Deep learning for visual perception; representation learning; semantic scene understanding; SLAM

Abstract

A neural field trained with self-supervision to efficiently represent the geometry and colour of a 3D scene tends to automatically decompose it into coherent and accurate object-like regions, which can be revealed with sparse labelling interactions to produce a 3D semantic scene segmentation. Our real-time iLabel system takes input from a hand-held RGB-D camera, requires zero prior training data, and works in an 'open set' manner, with semantic classes defined on the fly by the user. iLabel's underlying model is a simple multilayer perceptron (MLP), trained from scratch to learn a neural representation of a single 3D scene. The model is updated continually and visualised in real time, allowing the user to focus interactions to achieve extremely efficient semantic segmentation. A room-scale scene can be accurately labelled into 10+ semantic categories with around 100 clicks, taking less than 5 minutes. Quantitative labelling accuracy scales powerfully with the number of clicks and rapidly surpasses standard pre-trained semantic segmentation methods. We also demonstrate a hierarchical labelling variant of iLabel and a 'hands-free' mode in which the user only needs to supply label names for automatically generated locations.
