Article

Improvements in learning to control perched landings

Journal

AERONAUTICAL JOURNAL
Volume 126, Issue 1301, Pages 1101-1123

Publisher

CAMBRIDGE UNIV PRESS
DOI: 10.1017/aer.2022.48

Keywords

UAV; Machine learning; Reinforcement learning; Perching; Agile manoeuvres

Funding

  1. Defence Science and Technology Laboratory (DSTL), Ministry of Defence
  2. EPSRC Centre for Doctoral Training in Future Autonomous and Robotic Systems (FARSCOPE) at the Bristol Robotics Laboratory


This paper investigates the application of reinforcement learning to the problem of controlling a custom sweep-wing aircraft's perched landing manoeuvre. It builds upon previous work by introducing enhancements and modifications to improve performance and reduce error. The study finds that hyperparameter optimization has the most significant impact on increasing reward performance.
Reinforcement learning has previously been applied to the problem of controlling a perched landing manoeuvre for a custom sweep-wing aircraft. Previous work showed that the use of domain randomisation to train with atmospheric disturbances improved the real-world performance of the controllers, leading to increased reward. This paper builds on the previous project, investigating enhancements and modifications to the learning process to further improve performance, and reduce final state error. These changes include modifying the observation by adding information about the airspeed to the standard aircraft state vector, employing further domain randomisation of the simulator, optimising the underlying RL algorithm and network structure, and changing to a continuous action space. Simulated investigations identified hyperparameter optimisation as achieving the most significant increase in reward performance. Several test cases were explored to identify the best combination of enhancements. Flight testing was performed, comparing a baseline model against some of the best performing test cases from simulation. Generally, test cases that performed better than the baseline in simulation also performed better in the real world. However, flight tests also identified limitations with the current numerical model. For some models, the chosen policy performs well in simulation yet stalls prematurely in reality, a problem known as the reality gap.
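Two of the enhancements the abstract describes, per-episode domain randomisation of atmospheric disturbances and augmenting the observation with airspeed information, can be sketched in a toy environment. This is purely illustrative: the paper's simulator, state vector, and randomisation ranges are not given here, so every name and number below (the wind range, the brake dynamics, the reward shaping) is a hypothetical stand-in, not the authors' implementation.

```python
import random

class RandomisedPerchingEnv:
    """Toy 1-D point-mass 'aircraft' illustrating two ideas from the
    abstract: disturbance parameters resampled each episode (domain
    randomisation), and an observation that appends airspeed to the
    basic state. All dynamics and ranges are hypothetical."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def reset(self):
        # Domain randomisation: draw a fresh steady-wind value per episode
        # (range in m/s is an arbitrary illustrative choice).
        self.wind = self.rng.uniform(-3.0, 3.0)
        self.pos, self.vel = 0.0, 15.0  # start at 15 m/s ground speed
        return self._observe()

    def _observe(self):
        # Augmented observation: state (pos, vel) plus derived airspeed,
        # so the policy can sense the randomised wind indirectly.
        airspeed = self.vel - self.wind
        return (self.pos, self.vel, airspeed)

    def step(self, brake):
        # Continuous action: brake in [0, 1] decelerates the aircraft.
        dt = 0.1
        self.vel = max(0.0, self.vel - 8.0 * brake * dt)
        self.pos += self.vel * dt
        done = self.vel == 0.0
        # Illustrative shaping: penalise remaining speed, nudging the
        # policy toward a slow, perching-style touchdown.
        reward = -self.vel * dt
        return self._observe(), reward, done
```

A policy trained across many such randomised episodes cannot overfit to one wind value, which is the mechanism the paper credits for improved real-world robustness; the airspeed channel gives it the cue needed to adapt within an episode.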

