Article

Cooperative behavior of a heterogeneous robot team for planetary exploration using deep reinforcement learning

ACTA ASTRONAUTICA (2024)

Journal

ACTA ASTRONAUTICA

Volume 214, Issue -, Pages 689-700

Publisher

PERGAMON-ELSEVIER SCIENCE LTD

DOI: 10.1016/j.actaastro.2023.11.014

Keywords

Robot swarm; Heterogeneous robot team; Multi-robot cooperation; Planetary exploration; Reinforcement learning; Team behavior; Decentralized multi-agent control

Category

Engineering, Aerospace

Abstract

As humans continue to explore the surfaces of the Moon and Mars, distributed heterogeneous robot teams can increase the chances of success by exploiting the complementary capabilities and synergy of the team members. Effective cooperation and collaboration between the members of a robot team is crucial, but defining a metric for effective cooperation is challenging. This paper presents a method for determining reward criteria that can be used to train a robot swarm through reinforcement learning. The trained robot team exhibits high success rates and cooperative behavior in test environments, demonstrating the robustness and scalability of the training strategies.

As we continue to explore the Lunar and Martian surfaces and push farther into unknown and unstructured environments, employing a team of distributed heterogeneous robots will increase the odds of success by enabling more complex task planning that utilizes the complementary capabilities and synergy of the team members. Reaping these potential benefits requires effective cooperation and collaboration between the members of a robot team. Defining a metric for effective cooperation is difficult because it depends on the composition of the team, the task to be performed, and the working environment. This paper establishes a method for determining the reward criteria (figures of merit) that can be used to train the robot swarm through reinforcement learning techniques. A hierarchical framework of rewards is used which, at the lowest level, measures the success of an individual robot in performing its task. The success of all robots performing different subtasks is then measured using the Quantified Cooperation Assessment (QCA) metric, which was introduced in our previous research on multi-robot collaboration. Finally, the mission-level success and overall reward are determined by weighting each task by its priority within the overall mission context. A common reward is then applied to each of the robotic teammates within the learning process, which emphasizes group performance over that of an individual and encourages cooperative behavior. This cooperation framework is trained in a grid-based environment representing an exploration mission on a planetary surface by a heterogeneous team of robots consisting of a landing craft, a traditional rover, and several small agile robots. The robots are trained concurrently, but an individual policy is developed for each agent, resulting in a decentralized control scheme. Once trained, the control policies were evaluated in several test environments consisting of novel terrain maps and regions of interest.
An average success rate of 90% was observed in the test environments, demonstrating the robustness of the trained policies. The robots learned not only to complete the mission but also to do so cooperatively, while exhibiting scalable and resilient behavior.
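
The three-level reward hierarchy described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the abstract does not give the QCA formula, so the task-level score below is a plain average of per-robot scores standing in for it, and the `Task` class, the function names, the priority values, and the normalization by total priority are all hypothetical rather than taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Task:
    """Hypothetical container for one mission task."""
    name: str
    priority: float                  # assumed mission-level weight
    robot_scores: Dict[str, float]   # level 1: per-robot success in [0, 1]


def task_cooperation_score(task: Task) -> float:
    """Level 2 placeholder: mean of the per-robot scores.

    This is NOT the QCA metric, whose definition is not given in the
    abstract; it only marks where such a metric would plug in.
    """
    scores = list(task.robot_scores.values())
    return sum(scores) / len(scores) if scores else 0.0


def mission_reward(tasks: List[Task]) -> float:
    """Level 3: priority-weighted average of the task-level scores.

    Dividing by the total priority is an assumption of this sketch.
    """
    total_priority = sum(t.priority for t in tasks)
    if total_priority == 0:
        return 0.0
    weighted = sum(t.priority * task_cooperation_score(t) for t in tasks)
    return weighted / total_priority


def common_rewards(tasks: List[Task], robots: List[str]) -> Dict[str, float]:
    """Give every robot the same mission-level reward."""
    reward = mission_reward(tasks)
    return {robot: reward for robot in robots}


# Example with made-up numbers for a lander, a rover, and one small robot.
tasks = [
    Task("survey", priority=2.0, robot_scores={"rover": 1.0, "scout_1": 0.5}),
    Task("relay", priority=1.0, robot_scores={"lander": 1.0}),
]
rewards = common_rewards(tasks, ["lander", "rover", "scout_1"])
```

Because every agent receives the same scalar, an individual policy can only raise its own reward by improving the team's outcome, which is the effect the abstract attributes to the common reward.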
