Article

Maneuver Decision-Making for Autonomous Air Combat Based on FRE-PPO

APPLIED SCIENCES-BASEL (2022)

Journal

APPLIED SCIENCES-BASEL

Volume 12, Issue 20, Pages -

Publisher

MDPI

DOI: 10.3390/app122010230

Keywords

autonomous air combat; maneuver decision-making; reinforcement learning; final reward estimation; proximal policy optimization

Categories

Chemistry, Multidisciplinary; Engineering, Multidisciplinary; Materials Science, Multidisciplinary; Physics, Applied

Funding

National Natural Science Foundation of China [62101590]
Natural Science Foundation of Shaanxi Province [2020JQ-481]

Summary

This paper proposes an air combat maneuver decision-making method based on final reward estimation and proximal policy optimization. By constructing an air combat environment, designing reward mechanisms, and improving the training performance and efficiency of reinforcement learning, it offers a solution to maneuver decision-making problems in autonomous air combat.

Abstract

Maneuver decision-making is the core of autonomous air combat, and reinforcement learning is a promising approach to such decision-making problems. However, when reinforcement learning is applied to maneuver decision-making for autonomous air combat, it often suffers from low training efficiency and poor decision-making performance. This paper proposes an air combat maneuver decision-making method based on final reward estimation (FRE) and proximal policy optimization (PPO) to address these problems. First, an air combat environment based on aircraft and missile models is constructed, and an intermediate reward and a final reward are designed. Second, final reward estimation is proposed as a replacement for the original advantage estimation function in the surrogate objective of proximal policy optimization, in order to improve the training performance of reinforcement learning. Third, sampling according to the final reward estimation is proposed to improve training efficiency. Finally, the proposed method is used in a self-play framework to train agents for maneuver decision-making. Simulations show that both final reward estimation and sampling according to final reward estimation are effective and efficient.
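To make the second and third steps concrete, the following is a minimal Python sketch of the standard PPO clipped surrogate objective in which the term normally filled by an advantage estimate is supplied by an arbitrary per-sample estimate. The abstract does not say how the final reward estimate is computed, so it is passed in here as a plain number. The function names, the clipping value of 0.2, and the choice to sample in proportion to the absolute value of the estimate are all assumptions made for illustration, not the paper's implementation.

```python
import math
import random


def clipped_surrogate(new_logp, old_logp, estimate, clip_eps=0.2):
    """PPO clipped surrogate term for a single sample.

    `estimate` takes the place of the usual advantage estimate. Here it is
    a placeholder for a final reward estimate whose exact form is not given
    in the abstract.
    """
    ratio = math.exp(new_logp - old_logp)  # pi_new(a|s) / pi_old(a|s)
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    return min(ratio * estimate, clipped_ratio * estimate)


def surrogate_loss(new_logps, old_logps, estimates, clip_eps=0.2):
    """Negative mean clipped surrogate over a batch (a loss to minimize)."""
    terms = [
        clipped_surrogate(n, o, e, clip_eps)
        for n, o, e in zip(new_logps, old_logps, estimates)
    ]
    return -sum(terms) / len(terms)


def sample_by_estimate(estimates, k, rng):
    """Draw k sample indices with probability proportional to |estimate|.

    Assumption: weighting by absolute value is one plausible reading of
    "sampling according to the final reward estimation"; the paper may use
    a different rule. The small constant keeps every weight positive.
    """
    weights = [abs(e) + 1e-8 for e in estimates]
    return rng.choices(range(len(estimates)), weights=weights, k=k)


if __name__ == "__main__":
    rng = random.Random(0)
    old_logps = [-1.0, -0.5, -2.0, -1.5]
    new_logps = [-0.9, -0.7, -1.0, -1.5]
    estimates = [1.0, -1.0, 0.5, 0.0]  # hypothetical final reward estimates

    batch = sample_by_estimate(estimates, k=3, rng=rng)
    loss = surrogate_loss(
        [new_logps[i] for i in batch],
        [old_logps[i] for i in batch],
        [estimates[i] for i in batch],
    )
    print("sampled indices:", batch)
    print("surrogate loss:", loss)
```

The clipping keeps the policy update from profiting when the probability ratio moves outside the interval [1 - clip_eps, 1 + clip_eps], regardless of which signal is used in place of the advantage. In this sketch, samples whose estimate is near zero are drawn rarely, which is one way sampling by the estimate could concentrate updates on informative transitions.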
