Article

Reinforcement Learning for Outdoor Balloon Navigation: A Successful Controller for an Autonomous Balloon

Journal

IEEE ROBOTICS & AUTOMATION MAGAZINE

Publisher

IEEE (Institute of Electrical and Electronics Engineers)
DOI: 10.1109/MRA.2023.3271203

Keywords

Wind; Navigation; Buoyancy; Aerospace electronics; Encoding; Atmospheric modeling; Training


Autonomous ballooning presents challenges for planning and control algorithms due to limited control authority, the stochastic nature of balloon flight caused by wind, and the difficulty of sensing wind remotely. This study uses reinforcement learning to develop a control policy for autonomous balloon navigation in a varying wind field. The approach is evaluated in simulation and in indoor and outdoor experiments, demonstrating successful navigation toward target positions with small final distance errors.
Autonomous ballooning allows for energy-efficient long-range missions but introduces significant challenges for planning and control algorithms due to the balloon's single degree of actuation: vertical rate control through either buoyancy or vertical thrust. Lateral motion is driven almost entirely by the wind; balloon flight is therefore both nonholonomic and often stochastic. Finally, wind is very challenging to sense remotely, and estimates are typically available only from low-temporal- and low-spatial-frequency predictions of large-scale weather models and from direct in situ measurements. In this work, reinforcement learning (RL) is used to generate a control policy for an autonomous balloon navigating between 3D positions in a temporally and spatially varying wind field. The agent uses its position and velocity, the relative position of the target, and an estimate of the surrounding wind field to command a target altitude. The wind information combines local measurements with an encoding of global wind predictions from a large-scale numerical weather prediction (NWP) model around the current balloon location. The RL algorithm used in this work, soft actor-critic (SAC), is trained with a reward favoring paths that come as close as possible to the target while minimizing time and actuation costs. We evaluate our approach first in simulation and then in a controlled indoor experiment, where we generate an artificial wind field and reach a median distance of 23.4 cm from the target within a 3.5 x 3.5 x 3.5 m volume over 30 trials. Finally, using a fully autonomous, custom-designed outdoor prototype with altitude control, long-range communication, redundant localization, and onboard computation, we validate our approach in a real-world setting. Over six flights, the agent navigates to predefined target positions with an average target distance error of 360 m after traveling approximately 10 km within a volume of 22 x 22 x 3.2 km.
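
Since the abstract only outlines the observation encoding and reward structure, the following minimal Python sketch illustrates one plausible arrangement of the agent's input (balloon state, relative target, local wind, and an NWP encoding) and a shaped reward balancing target distance, time, and actuation cost. All function names, dimensions, and weighting constants are illustrative assumptions, not the authors' implementation.

    import numpy as np

    # Minimal sketch, assuming a flat observation vector and a shaped reward;
    # names, dimensions, and weights are illustrative, not taken from the paper.

    def build_observation(pos, vel, target_pos, local_wind, nwp_encoding):
        """Assemble the agent's input: balloon state, relative target, and wind info.

        pos, vel     : 3D balloon position and velocity.
        target_pos   : 3D target position (encoded relative to the balloon).
        local_wind   : in situ wind measurement at the balloon.
        nwp_encoding : fixed-length encoding of the NWP wind prediction
                       around the current balloon location.
        """
        rel_target = np.asarray(target_pos) - np.asarray(pos)
        return np.concatenate([pos, vel, rel_target, local_wind, nwp_encoding])

    def step_reward(pos, target_pos, actuation, w_dist=1.0, w_time=0.01, w_act=0.1):
        """Reward favoring proximity to the target, with time and actuation penalties."""
        dist = np.linalg.norm(np.asarray(target_pos) - np.asarray(pos))
        return -(w_dist * dist + w_time + w_act * abs(actuation))

Wrapped in a Gym-style environment, such an observation/reward pair could be trained with an off-the-shelf SAC implementation, for example Stable-Baselines3's SAC("MlpPolicy", env).learn(total_timesteps=...); the paper does not state which SAC implementation was used, so this pairing is only a suggestion.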
