Article

Efficient Deep Reinforcement Learning for Optimal Path Planning

Journal

ELECTRONICS
Volume 11, Issue 21

Publisher

MDPI
DOI: 10.3390/electronics11213628

Keywords

deep reinforcement learning; global optimal path planning; dynamic programming; mobile robots; shortest path; continuous state space; collision avoidance

Funding

  1. Natural Sciences and Engineering Research Council of Canada (NSERC) [210471]

This paper proposes a novel deep reinforcement learning method for optimal path planning for mobile robots. By using dynamic programming to collect training data for deep reinforcement learning, the method addresses the slow learning and poor training data quality typical of deep reinforcement learning, and the reported experiments show easier and faster convergence than other methods.
In this paper, we propose a novel deep reinforcement learning (DRL) method for optimal path planning for mobile robots using dynamic programming (DP)-based data collection. The proposed method overcomes the slow learning process and poor training data quality inherent in DRL algorithms. The main idea of our approach is as follows. First, we mapped the dynamic programming method to typical optimal path planning problems for mobile robots and created a new, efficient DP-based method that finds an exact, analytical, optimal solution to the path planning problem. Then, we used the high-quality training data gathered with the DP method to train the DRL model, which greatly improves training data quality and learning efficiency. Next, we established a two-stage reinforcement learning method in which, prior to the DRL stage, we employed extreme learning machines (ELM) to initialize the parameters of the actor and critic neural networks to a near-optimal solution, significantly improving learning performance. Finally, we illustrated our method on some typical path planning tasks. The experimental results show that our DRL method converges much more easily and faster than other methods. The resulting action neural network successfully guides robots from any start position in the environment to the goal position while following the optimal path and avoiding collisions with obstacles.
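
The idea of DP-based data collection can be illustrated with a minimal sketch. It is not the authors' algorithm: the paper targets a continuous state space and the abstract does not spell out its DP formulation, so this example assumes a small discrete occupancy grid, unit step costs, and four-connected moves. The names `ACTIONS`, `cost_to_go`, and `training_pairs` are hypothetical. The sketch computes the optimal cost-to-go to the goal by repeated Bellman backups, then reads off the greedy action in every free cell, which yields (state, action, value) samples of the kind that could be used to pre-train an actor and a critic.

```python
import math

# Assumed action set for illustration: up, down, left, right as (d_row, d_col).
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def cost_to_go(grid, goal):
    """Repeat Bellman backups V(s) = min_a [1 + V(s')] until nothing changes.

    grid[r][c] == 1 marks an obstacle, 0 marks free space; V(goal) = 0.
    """
    rows, cols = len(grid), len(grid[0])
    V = [[math.inf] * cols for _ in range(rows)]
    V[goal[0]][goal[1]] = 0.0
    changed = True
    while changed:
        changed = False
        for r in range(rows):
            for c in range(cols):
                if grid[r][c] == 1 or (r, c) == goal:
                    continue
                best = V[r][c]
                for dr, dc in ACTIONS:
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                        best = min(best, 1.0 + V[nr][nc])
                if best < V[r][c]:
                    V[r][c] = best
                    changed = True
    return V


def training_pairs(grid, goal):
    """Return (state, optimal action index, cost-to-go) for each reachable free cell."""
    V = cost_to_go(grid, goal)
    rows, cols = len(grid), len(grid[0])
    data = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 1 or (r, c) == goal or math.isinf(V[r][c]):
                continue
            # Greedy action: move to the free neighbour with the lowest cost-to-go.
            best_a, best_v = None, math.inf
            for a, (dr, dc) in enumerate(ACTIONS):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and grid[nr][nc] == 0 and V[nr][nc] < best_v):
                    best_a, best_v = a, V[nr][nc]
            data.append(((r, c), best_a, V[r][c]))
    return data


# Toy map: a 3x4 grid with a two-cell wall in the middle row.
GRID = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
GOAL = (2, 3)
V = cost_to_go(GRID, GOAL)
DATA = training_pairs(GRID, GOAL)
```

In this toy setting every sample is labelled with an exactly optimal action, in contrast to the noisy samples gathered by random exploration; that is the sense in which DP-generated data can be called high quality.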
