Article

Intelligent Decision Making Based on the Combination of Deep Reinforcement Learning and an Influence Map

Journal

APPLIED SCIENCES-BASEL
Volume 12, Issue 22, Pages: -

Publisher

MDPI
DOI: 10.3390/app122211458

Keywords

reinforcement learning; deep reinforcement learning; influence map; sparse reward

Funding

  1. National Key R&D Program of China [2020YFB2104700]
  2. National Natural Science Foundation of China [62136006]

Abstract

This paper proposes a method that combines a dynamic influence map with deep reinforcement learning to address the sparse-reward problem in intelligent decision making. The experimental results demonstrate that the proposed method improves scores and training speed while reducing video memory and overall memory consumption.
Almost all recent deep reinforcement learning algorithms stack four consecutive frames as the state space to retain dynamic information; when the state is an image, it is fed directly into the neural network for training. As an AI-assisted decision-making technique, a dynamic influence map can also describe dynamic information. In this paper, we propose using a single frame superimposed with an influence map as the state space to express dynamic information, and we optimize Ape-x, a distributed reinforcement learning algorithm, as the training framework. Sparse rewards are an issue that must be solved in fine-grained intelligent decision making, so we further propose using the influence map to generate an intrinsic reward when no external reward is available. The experiments conducted in this study show that the combination of a dynamic influence map and deep reinforcement learning is effective: compared with the traditional method that uses four consecutive frames to represent dynamic information, the score of the proposed method is increased by 11-13%, the training speed is increased by 59%, video memory consumption is reduced by 30%, and overall memory consumption is reduced by 50%. The proposed method is also compared with the Ape-x algorithm without an influence map, DQN, N-Step DQN, QR-DQN, Dueling DQN, and C51; the final score of the proposed method is higher than that of all of these baselines. In addition, the intrinsic reward generated from the influence map effectively resolves the sparse-reward problem.
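The abstract describes two mechanisms: replacing the conventional four-frame state stack with a single frame plus an influence-map channel, and deriving an intrinsic reward from the influence map when the external reward is zero. The Python sketch below illustrates one plausible reading of that design; it is not the authors' implementation, and the propagation rule in compute_influence_map, the two-channel state layout, and the scale constant ALPHA are all assumptions for illustration.

    # Illustrative sketch only (not the paper's code): a toy influence map,
    # a 2-channel state (frame + influence map), and an intrinsic reward
    # used as a fallback when the environment's reward is sparse.
    import numpy as np

    def compute_influence_map(entity_positions, shape, decay=0.5, radius=3):
        # Toy influence map: each entity spreads influence that decays
        # geometrically with Chebyshev distance (a common formulation).
        imap = np.zeros(shape, dtype=np.float32)
        for (y, x), strength in entity_positions:
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < shape[0] and 0 <= xx < shape[1]:
                        dist = max(abs(dy), abs(dx))
                        imap[yy, xx] += strength * (decay ** dist)
        return imap

    def build_state(frame, imap):
        # Stack one grayscale frame with its influence map as a 2-channel
        # state, instead of stacking four consecutive frames.
        frame = frame.astype(np.float32) / 255.0
        imap = imap / (np.abs(imap).max() + 1e-8)  # normalize roughly to [-1, 1]
        return np.stack([frame, imap], axis=0)     # shape: (2, H, W)

    ALPHA = 0.01  # assumed scale of the intrinsic reward

    def shaped_reward(external_reward, imap, agent_pos, prev_imap_value):
        # When the external reward is zero (sparse), fall back to an
        # intrinsic reward: the change of influence at the agent's position.
        if external_reward != 0.0:
            return external_reward
        return ALPHA * (imap[agent_pos] - prev_imap_value)

In a training loop, build_state would replace the usual four-frame stacking step before the state is fed to the network, and shaped_reward would be applied to each transition before it is written to the replay buffer.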
