4.6 Article

Reinforcement Learning for Outdoor Balloon Navigation: A Successful Controller for an Autonomous Balloon

Journal

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/MRA.2023.3271203

Keywords

Wind; Navigation; Buoyancy; Aerospace electronics; Encoding; Atmospheric modeling; Training

Autonomous ballooning challenges planning and control algorithms because of limited control authority, the stochastic nature of wind-driven flight, and the difficulty of sensing wind remotely. This study uses reinforcement learning to develop a control policy for autonomous balloon navigation in a varying wind field. The approach is evaluated in simulation and in indoor and outdoor experiments, reaching a median distance of 23.4 cm from the target indoors and an average target distance error of 360 m over six outdoor flights.
Autonomous ballooning allows for energy-efficient long-range missions but introduces significant challenges for planning and control algorithms, because balloons have a single degree of actuation: vertical rate control through either buoyancy or vertical thrust. Lateral motion is typically due to the wind; thus, balloon flight is both nonholonomic and often stochastic. Finally, wind is very challenging to sense remotely, and estimates are often available only via predictions of low temporal and spatial frequency from large-scale weather models and via direct in situ measurements. In this work, reinforcement learning (RL) is used to generate a control policy for an autonomous balloon navigating between 3D positions in a temporally and spatially varying wind field. The agent uses its position and velocity, the relative position of the target, and an estimate of the surrounding wind field to command a target altitude. The wind information contains local measurements and an encoding of global wind predictions from a large-scale numerical weather prediction (NWP) model around the current balloon location. The RL algorithm used in this work, soft actor-critic (SAC), is trained with a reward favoring paths that reach as close as possible to the target, with minimum time and actuation costs. We evaluate our approach first in simulation and then in a controlled indoor experiment, where we generate an artificial wind field and reach a median distance of 23.4 cm from the target within a volume of 3.5 x 3.5 x 3.5 m over 30 trials. Finally, we validate our approach in a real-world setting, using a fully autonomous, custom-designed outdoor prototype capable of altitude control, long-range communication, redundant localization, and onboard computation. Over six flights, the agent navigates to predefined target positions, with an average target distance error of 360 m after traveling approximately 10 km within a volume of 22 x 22 x 3.2 km.
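
The abstract describes the reward only qualitatively: it favors ending close to the target while keeping time and actuation costs low. As an illustration only, the following minimal sketch shows one way such a per-step reward could be shaped. The function name `step_reward`, the `RewardWeights` fields, the weight values, and the choice to measure actuation as the change in commanded altitude are all assumptions made for this example and are not taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    """Hypothetical weights; the paper's actual values are not given here."""
    distance: float = 1.0    # penalty per unit of distance to the target
    time: float = 0.01       # constant penalty per step (favors short flights)
    actuation: float = 0.1   # penalty per unit change in commanded altitude


def step_reward(balloon_pos, target_pos, prev_alt_cmd, alt_cmd,
                weights=RewardWeights()):
    """Return an illustrative per-step reward (higher is better).

    balloon_pos, target_pos: (x, y, z) positions in the same units.
    prev_alt_cmd, alt_cmd: previous and current commanded altitudes.
    """
    # Euclidean distance between the balloon and the target in 3D.
    distance = math.dist(balloon_pos, target_pos)
    # Assumed proxy for actuation cost: how far the altitude command moved.
    actuation = abs(alt_cmd - prev_alt_cmd)
    cost = (weights.distance * distance
            + weights.time
            + weights.actuation * actuation)
    return -cost


if __name__ == "__main__":
    # A balloon 3 units from the target that changes its command by 2 units.
    print(step_reward((3.0, 0.0, 0.0), (0.0, 0.0, 0.0), 1.0, 3.0))
```

Under these assumed weights, a step taken closer to the target always scores higher than the same step taken farther away, and holding the altitude command steady scores higher than changing it, which is the trade-off the abstract describes.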

