Article

Data-Efficient Deep Reinforcement Learning for Attitude Control of Fixed-Wing UAVs: Field Experiments

Publisher

IEEE (Institute of Electrical and Electronics Engineers)
DOI: 10.1109/TNNLS.2023.3263430

Keywords

Attitude control; Reinforcement learning; Data models; Computational modeling; Autonomous aerial vehicles; Aircraft; Vehicle dynamics; deep reinforcement learning (DRL); sim-to-real; soft actor-critic (SAC)

Abstract

This article addresses the difficult problem of attitude control for fixed-wing UAVs and proposes deep reinforcement learning (DRL) to handle the nonlinear dynamics. The results show that DRL can learn to perform attitude control from as little as 3 minutes of flight data, and the learned controller performs comparably to a state-of-the-art PID controller without further online learning.
Attitude control of fixed-wing unmanned aerial vehicles (UAVs) is a difficult control problem in part due to uncertain nonlinear dynamics, actuator constraints, and coupled longitudinal and lateral motions. Current state-of-the-art autopilots are based on linear control and are thus limited in their effectiveness and performance. Deep reinforcement learning (DRL) is a machine learning method that automatically discovers optimal control laws through interaction with the controlled system and that can handle complex nonlinear dynamics. We show in this article that DRL can successfully learn to perform attitude control of a fixed-wing UAV operating directly on the original nonlinear dynamics, requiring as little as 3 min of flight data. We initially train our model in a simulation environment and then deploy the learned controller on the UAV in flight tests, demonstrating comparable performance to the state-of-the-art ArduPlane proportional-integral-derivative (PID) attitude controller with no further online learning required. Learning with a significant actuation delay and diversified simulated dynamics was found to be crucial for successful transfer to control of the real UAV. In addition to a qualitative comparison with the ArduPlane autopilot, we present a quantitative assessment based on linear analysis to better understand the learned controller's behavior.
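
To make the recipe concrete, below is a minimal, self-contained sketch of the training setup the abstract describes: a soft actor-critic (SAC) agent trained purely in simulation on per-episode randomized attitude dynamics with an explicit actuation delay, after which the frozen policy would be flown with no further online learning. The toy second-order dynamics, the parameter ranges, the reward shaping, and names such as `ToyAttitudeEnv` are illustrative assumptions for this sketch and are not the authors' simulator or code; SAC itself is the algorithm named in the paper's keywords.

```python
# Minimal sketch (not the authors' implementation): SAC trained in simulation
# on randomized attitude dynamics with an actuation delay, as described above.
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import SAC  # any SAC implementation would do


class ToyAttitudeEnv(gym.Env):
    """Tracks commanded roll/pitch angles with a crude second-order model."""

    def __init__(self, delay_steps=3, dt=0.02):
        self.dt, self.delay_steps = dt, delay_steps
        # obs: [roll, pitch, roll_rate, pitch_rate, roll_err, pitch_err]
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(6,), dtype=np.float32)
        # actions: aileron and elevator deflections, normalized to [-1, 1]
        self.action_space = spaces.Box(-1.0, 1.0, shape=(2,), dtype=np.float32)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        # Domain randomization: resample control effectiveness and damping every
        # episode so the policy must cope with "diversified simulated dynamics".
        self.gain = self.np_random.uniform(2.0, 6.0, size=2)
        self.damping = self.np_random.uniform(0.5, 2.0, size=2)
        self.ref = self.np_random.uniform(-0.5, 0.5, size=2)   # target angles [rad]
        self.angle = self.np_random.uniform(-0.3, 0.3, size=2)
        self.rate = np.zeros(2)
        # Actuation delay: commands only take effect delay_steps steps later.
        self.action_queue = [np.zeros(2)] * self.delay_steps
        self.t = 0
        return self._obs(), {}

    def step(self, action):
        self.action_queue.append(np.clip(action, -1.0, 1.0))
        delayed = self.action_queue.pop(0)
        # crude angular dynamics: accel = gain * deflection - damping * rate
        accel = self.gain * delayed - self.damping * self.rate
        self.rate += accel * self.dt
        self.angle += self.rate * self.dt
        self.t += 1
        err = self.ref - self.angle
        # penalize tracking error plus a small control-effort term
        reward = -float(np.sum(np.abs(err))) - 0.01 * float(np.sum(np.square(action)))
        truncated = self.t >= 500
        return self._obs(), reward, False, truncated, {}

    def _obs(self):
        err = self.ref - self.angle
        return np.concatenate([self.angle, self.rate, err]).astype(np.float32)


if __name__ == "__main__":
    env = ToyAttitudeEnv(delay_steps=3)
    model = SAC("MlpPolicy", env, verbose=1)
    model.learn(total_timesteps=50_000)   # training happens in simulation only
    model.save("sac_attitude_sketch")     # the frozen policy is then deployed as-is
```

Per the abstract, the per-episode randomization of the dynamics and the modeled actuation delay are the ingredients that made sim-to-real transfer work; the rest of the sketch is standard SAC training against a simulated environment.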
