Article

Deep reinforcement learning-based safe interaction for industrial human-robot collaboration using intrinsic reward function

Journal

Advanced Engineering Informatics
Volume 49

Publisher

Elsevier Sci Ltd
DOI: 10.1016/j.aei.2021.101360

Keywords

Industrial human-robot collaboration; Collision avoidance; Deep reinforcement learning; Intrinsic reward function

Funding

  1. National Natural Science Foundation of China [51775399, 51675389]
  2. Fundamental Research Funds for the Central Universities [WUT: 2020III047]
  3. International Science and Technology Innovation Cooperation Project of Sichuan Province [20GJHZ0039]


This paper introduces a deep reinforcement learning approach for real-time collision-free motion planning of an industrial robot, aiming to ensure operator safety during human-robot collaboration in manufacturing. By optimizing the reward function and combining it with the DDPG algorithm, the proposed IRDDPG algorithm enables the robot to effectively learn an expected collision avoidance policy in a simulation environment.
In human-robot collaboration in manufacturing, operator safety is the primary concern during manufacturing operations. This paper presents a deep reinforcement learning approach to real-time collision-free motion planning of an industrial robot for human-robot collaboration. First, the safe human-robot collaborative manufacturing problem is formulated as a Markov decision process, and a mathematical expression of the reward function design problem is given. The goal is for the robot to autonomously learn a policy that reduces the accumulated risk while preserving the task completion time during human-robot collaboration. To transform this optimization objective into a reward function that guides the robot toward the expected behaviour, a reward function optimization approach based on the deterministic policy gradient is proposed to learn a parameterized intrinsic reward function. The reward function used by the agent to learn the policy is the sum of the intrinsic and extrinsic reward functions. A deep reinforcement learning algorithm, intrinsic reward-deep deterministic policy gradient (IRDDPG), which combines the DDPG algorithm with the reward function optimization approach, is then proposed to learn the expected collision avoidance policy. Finally, the proposed algorithm is tested in a simulation environment, and the results show that the industrial robot can learn the expected policy, ensuring safety for industrial human-robot collaboration without missing the original target. Moreover, the reward function optimization approach compensates for deficiencies in the hand-designed reward function and improves policy performance.
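The abstract's core idea, combining a hand-designed extrinsic reward with a learned, parameterized intrinsic reward, can be illustrated with a minimal sketch. The function names, feature vector, and the linear form of the intrinsic reward below are illustrative assumptions, not the paper's actual formulation:

```python
import numpy as np

rng = np.random.default_rng(0)

def extrinsic_reward(dist_to_human, task_progress):
    # Hand-designed safety-shaped reward (assumed form): penalize close
    # proximity to the operator, reward progress toward the task goal.
    collision_penalty = -1.0 if dist_to_human < 0.2 else 0.0
    return collision_penalty + 0.1 * task_progress

def intrinsic_reward(features, theta):
    # Parameterized intrinsic reward r_int = phi(s, a) . theta, whose
    # parameters theta would be optimized by a policy-gradient-style
    # update in the full IRDDPG algorithm.
    return float(features @ theta)

# Hypothetical state-action features, e.g. distance, velocity, goal alignment.
theta = rng.normal(scale=0.01, size=3)
features = np.array([0.5, 0.3, 0.9])

# The agent learns its policy from the sum of the two reward terms.
r_total = extrinsic_reward(0.5, 0.3) + intrinsic_reward(features, theta)
```

In the paper's approach, theta is not fixed: it is itself learned via a deterministic-policy-gradient-based update so that the intrinsic term steers the policy toward lower accumulated risk without sacrificing task completion.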

