Article

Autonomous obstacle avoidance of UAV based on deep reinforcement learning

Journal

JOURNAL OF INTELLIGENT & FUZZY SYSTEMS
Volume 42, Issue 4, Pages 3323-3335

Publisher

IOS PRESS
DOI: 10.3233/JIFS-211192

Keywords

UAV; obstacle avoidance; DQN; overestimation; convergence rate

Funding

  1. BeiHang University in Beijing, China
  2. National Natural Science Foundation (NSF) of China [61976014]

Abstract

In intelligent unmanned systems, obstacle avoidance is a core capability and a prerequisite for unmanned aerial vehicle (UAV) operation. Traditional algorithms are not suited to obstacle avoidance in complex, changeable environments given the limited sensors on UAVs. In this article, we use an end-to-end deep reinforcement learning (DRL) algorithm to enable the UAV to avoid obstacles autonomously. To address slow convergence in DRL, a Multi-Branch (MB) network structure is proposed so that the algorithm performs well in the early stage of training; to address non-optimal decision-making caused by overestimation, the Revise Q-value (RQ) algorithm is proposed so that the agent can choose the optimal obstacle avoidance strategy. Based on the flight characteristics of the rotor UAV, we build a V-Rep 3D physical simulation environment to test obstacle avoidance performance. Experiments show that the improved algorithm accelerates the agent's convergence and increases the average episode return by 25%.
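The overestimation problem that the RQ algorithm targets is the well-known positive bias of the max operator in Q-learning targets: taking the maximum over noisy value estimates systematically returns values that are too high. The paper's RQ details are not reproduced here; purely as an illustration of the phenomenon, the toy sketch below uses a Double-DQN-style decoupling (a hypothetical stand-in, not the authors' method) in which one set of estimates selects the action and an independent set evaluates it, which removes the bias:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setting: a single state with 4 actions whose true Q-values are all
# zero, so any nonzero target is pure estimation error. Two independent
# noisy estimate vectors stand in for the online and target networks.
n_actions, n_trials = 4, 10000
max_q, decoupled_q = [], []
for _ in range(n_trials):
    q_online = rng.normal(0.0, 1.0, n_actions)  # noisy online estimates
    q_target = rng.normal(0.0, 1.0, n_actions)  # noisy target estimates
    # Standard DQN-style target: max over one noisy vector -> biased high.
    max_q.append(q_target.max())
    # Decoupled target: select the action with the online estimates,
    # evaluate it with the independent target estimates -> unbiased here.
    a = q_online.argmax()
    decoupled_q.append(q_target[a])

print(f"max-operator target mean: {np.mean(max_q):+.3f}")      # clearly > 0
print(f"decoupled target mean:    {np.mean(decoupled_q):+.3f}")  # near 0
```

Since the true values are all zero, the gap between the two printed means is exactly the overestimation bias that an RQ-style correction aims to suppress.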

