Article

Reinforcement Learning for Outdoor Balloon Navigation: A Successful Controller for an Autonomous Balloon

IEEE ROBOTICS & AUTOMATION MAGAZINE (2023)

Journal

IEEE ROBOTICS & AUTOMATION MAGAZINE

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/MRA.2023.3271203

Keywords

Wind; Navigation; Buoyancy; Aerospace electronics; Encoding; Atmospheric modeling; Training

Categories

Automation & Control Systems; Robotics

AI Summary

Autonomous ballooning is challenging for planning and control algorithms because control authority is limited, wind makes balloon flight stochastic, and wind is difficult to sense remotely. This study uses reinforcement learning to develop a control policy for autonomous balloon navigation in a varying wind field. The approach is evaluated in simulation and in indoor and outdoor experiments, in which the balloon navigates towards target positions with small distance errors.

Abstract

Autonomous ballooning allows for energy-efficient long-range missions but introduces significant challenges for planning and control algorithms, because a balloon has a single degree of actuation: vertical rate control through either buoyancy or vertical thrust. Lateral motion is typically due to the wind; thus, balloon flight is both nonholonomic and often stochastic. Moreover, wind is very challenging to sense remotely, and estimates are often available only from direct in situ measurements and from the predictions of large-scale weather models, which have low temporal and spatial resolution. In this work, reinforcement learning (RL) is used to generate a control policy for an autonomous balloon navigating between 3D positions in a time- and spatially varying wind field. The agent uses its position and velocity, the relative position of the target, and an estimate of the surrounding wind field to command a target altitude. The wind information contains local measurements and an encoding of global wind predictions from a large-scale numerical weather prediction (NWP) model around the current balloon location. The RL algorithm used in this work, soft actor-critic (SAC), is trained with a reward favoring paths that reach as close as possible to the target, with minimum time and actuation costs. We evaluate our approach first in simulation and then with a controlled indoor experiment, where we generate an artificial wind field and reach a median distance of 23.4 cm from the target within a volume of 3.5 × 3.5 × 3.5 m over 30 trials. Finally, using a fully autonomous, custom-designed outdoor prototype capable of altitude control, long-range communication, redundant localization, and onboard computation, we validate our approach in a real-world setting. Over six flights, the agent navigates to predefined target positions, with an average target distance error of 360 m after traveling approximately 10 km within a volume of 22 × 22 × 3.2 km.
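
To make the problem setup in the abstract concrete, the following Python sketch shows one way the pieces could fit together: a toy balloon model whose lateral motion is set entirely by the wind, an observation vector built from the quantities the abstract lists, and a per-step reward that penalizes target distance, time, and actuation. It is an illustration only, not the authors' implementation. The wind profile `toy_wind`, the climb-rate limit, the layout of the observation vector, and the reward weights are all assumptions introduced here.

```python
import math
from dataclasses import dataclass


@dataclass
class BalloonState:
    pos: tuple  # (x, y, z) in metres
    vel: tuple  # (vx, vy, vz) in metres per second


def toy_wind(z):
    """Hypothetical wind profile: 2 m/s wind whose direction rotates with altitude."""
    angle = math.pi * z / 1000.0
    return (2.0 * math.cos(angle), 2.0 * math.sin(angle))


def step(state, target_alt, dt=1.0, max_climb=1.0):
    """Advance the toy model by dt seconds.

    The only control input is the commanded altitude; the lateral
    velocity is taken to equal the local wind (an assumption).
    """
    x, y, z = state.pos
    wx, wy = toy_wind(z)
    # Vertical rate tracks the commanded altitude, clipped to an assumed limit.
    vz = max(-max_climb, min(max_climb, (target_alt - z) / dt))
    return BalloonState(pos=(x + wx * dt, y + wy * dt, z + vz * dt),
                        vel=(wx, wy, vz))


def observation(state, target, wind_encoding):
    """Concatenate position, velocity, relative target position, the local
    wind measurement, and an encoding of forecast wind (assumed layout)."""
    rel = tuple(t - p for t, p in zip(target, state.pos))
    local_wind = toy_wind(state.pos[2])
    return [*state.pos, *state.vel, *rel, *local_wind, *wind_encoding]


def reward(state, target, action_change, dt=1.0,
           w_dist=1e-3, w_time=1e-2, w_act=1e-3):
    """Per-step reward penalizing distance, elapsed time, and actuation.

    The weights are placeholders, not values from the paper.
    """
    dist = math.dist(state.pos, target)
    return -(w_dist * dist + w_time * dt + w_act * abs(action_change))


if __name__ == "__main__":
    s = BalloonState(pos=(0.0, 0.0, 0.0), vel=(0.0, 0.0, 0.0))
    target = (100.0, 50.0, 500.0)
    s = step(s, target_alt=500.0)
    obs = observation(s, target, wind_encoding=[0.0] * 4)
    print(s.pos, len(obs), reward(s, target, action_change=500.0))
```

In a full training setup, functions like these would sit inside an environment that an off-the-shelf SAC implementation interacts with, with the policy's action being the commanded altitude passed to `step`.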
