Article

Hierarchical Reinforcement Learning for Multi-Objective Real-Time Flexible Scheduling in a Smart Shop Floor

Journal

MACHINES
Volume 10, Issue 12

Publisher

MDPI
DOI: 10.3390/machines10121195

Keywords

smart shop floor; flexible job shop scheduling; multi-objective; hierarchical reinforcement learning; real-time optimization

Funding

  1. National Science and Technology Special Project of China [2018ZX04032002]

With the development of intelligent manufacturing, machine tools play a crucial role in the equipment manufacturing industry. This paper proposes a hierarchical reinforcement learning algorithm for the multi-objective dynamic flexible job shop scheduling problem. Experimental results show that the algorithm outperforms other reinforcement learning algorithms, metaheuristics, and heuristics in solution quality and generalization, and that it can make scheduling decisions in real time.
With the development of intelligent manufacturing, machine tools are considered the mothership of the equipment manufacturing industry, and the associated processing workshops are becoming more high-end, flexible, intelligent, and green. As the core of manufacturing management in a smart shop floor, research into the multi-objective dynamic flexible job shop scheduling problem (MODFJSP) focuses on optimizing scheduling decisions in real time according to changes in the production environment. In this paper, hierarchical reinforcement learning (HRL) is proposed to solve the MODFJSP considering random job arrival, with a focus on achieving the two practical goals of minimizing penalties for earliness and tardiness and reducing total machine load. A two-layer hierarchical architecture is proposed, namely the combination of a double deep Q-network (DDQN) and a dueling DDQN (DDDQN), and state features, actions, and external and internal rewards are designed. Meanwhile, a personal computer-based interaction feature is designed to integrate subjective decision information into the real-time optimization of HRL to obtain a satisfactory compromise. In addition, the proposed HRL framework is applied to multi-objective real-time flexible scheduling in a smart gear production workshop, and the experimental results show that the proposed HRL algorithm outperforms other reinforcement learning (RL) algorithms, metaheuristics, and heuristics in terms of solution quality and generalization and has the added benefit of real-time characteristics.
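The two network types named in the abstract can be illustrated with a minimal sketch of their standard textbook forms. This is not the authors' implementation: the function names, the discount factor, and the toy numbers are hypothetical, and the hierarchical coupling between the two layers, the state features, and the reward design are not shown because the abstract does not specify them.

```python
import numpy as np

def dueling_q_values(state_value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')."""
    advantages = np.asarray(advantages, dtype=float)
    return state_value + advantages - advantages.mean()

def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.9, done=False):
    """Double DQN target: the online network picks the next action,
    and the target network supplies the value of that action."""
    if done:
        return float(reward)
    best_action = int(np.argmax(next_q_online))
    return float(reward + gamma * next_q_target[best_action])

# Toy values (assumed for illustration only).
q = dueling_q_values(2.0, [1.0, 0.0, -1.0])
target = double_dqn_target(
    reward=1.0,
    next_q_online=[0.2, 0.9, 0.1],   # online network prefers action 1
    next_q_target=[0.5, 0.3, 0.8],   # target network values action 1 at 0.3
    gamma=0.9,
)
print(q, target)
```

Separating action selection from action evaluation is what distinguishes the double DQN target from the plain DQN target, which would take the maximum of the target network's own values (0.8 in the toy data above).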
