Journal
JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY
Volume 36, Issue 5, Pages 1002-1021
Publisher
SPRINGER SINGAPORE PTE LTD
DOI: 10.1007/s11390-021-1217-z
Keywords
robustness assessment; skewness; sparseness; asynchronous advantage actor-critic; reinforcement learning
Funding
- National Natural Science Foundation of China [61972025, 61802389, 61672092, U1811264, 61966009]
- National Key Research and Development Program of China [2020YFB1005604, 2020YFB2103802]
- Guangxi Key Laboratory of Trusted Software [KX201902]
This paper conducts the first robustness assessment of A3C based on parallel computing, proposing static and dynamic methods to measure robustness. Experimental results demonstrate that the proposed assessment can effectively gauge the robustness of A3C, achieving an accuracy of 83.3%.
Reinforcement learning, as a form of autonomous learning, is driving artificial intelligence (AI) toward practical applications. Having demonstrated the potential to significantly improve on synchronous parallel learning, the parallel-computing-based asynchronous advantage actor-critic (A3C) opens a new door for reinforcement learning. Unfortunately, the acceleration's influence on A3C's robustness has been largely overlooked. In this paper, we perform the first robustness assessment of A3C based on parallel computing. By perceiving the policy's actions, we construct a global matrix of action probability deviation and define two novel measures, skewness and sparseness, which together form an integral robustness measure. Based on this static assessment, we then develop a dynamic robustness-assessing algorithm through situational whole-space state sampling over changing episodes. Extensive experiments with different combinations of agent number and learning rate were conducted on an A3C-based pathfinding application, demonstrating that the proposed assessment can effectively measure the robustness of A3C, achieving an accuracy of 83.3%.
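The abstract does not give the paper's exact formulas for skewness and sparseness over the action-probability-deviation matrix, so the following is only an illustrative sketch: it uses the standard third-standardized-moment skewness and treats sparseness as the fraction of near-zero entries. The matrix layout (rows as sampled states, columns as actions), the `eps` threshold, and the synthetic data are all assumptions for demonstration.

```python
import numpy as np

def skewness(m):
    """Sample skewness (third standardized moment) of all entries
    in the deviation matrix. The paper's exact definition may differ."""
    x = np.asarray(m, dtype=float).ravel()
    mu, sigma = x.mean(), x.std()
    return float(((x - mu) ** 3).mean() / sigma ** 3)

def sparseness(m, eps=1e-3):
    """Fraction of near-zero entries; eps is an assumed threshold."""
    return float((np.abs(np.asarray(m, dtype=float)) < eps).mean())

# Hypothetical deviation matrix: rows = sampled states, cols = actions;
# entries are deviations of action probabilities between two policies.
rng = np.random.default_rng(0)
dev = rng.normal(0.0, 0.1, size=(100, 4))
dev[rng.random(dev.shape) < 0.5] = 0.0  # many states are unaffected

print("skewness:", skewness(dev))
print("sparseness:", sparseness(dev))
```

A heavily skewed deviation distribution would indicate that a few states suffer large policy shifts, while high sparseness indicates most states are unaffected; combining the two in this way mirrors, but does not reproduce, the paper's integral robustness measure.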