Article

Joint optimization of maintenance and quality inspection for manufacturing networks based on deep reinforcement learning

Journal

Reliability Engineering & System Safety

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.ress.2023.109290

Keywords

Manufacturing network; Reliability model; Maintenance; Quality inspection; Optimization; Deep reinforcement learning

This study proposes a reinforcement learning-based method for the joint optimization of preventive maintenance and work-in-process quality inspection in manufacturing networks. By introducing dynamic reliability and quality models and employing the Deep Deterministic Policy Gradient (DDPG) algorithm, the method achieves optimal joint control of reliability and quality across the network.
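As a rough illustration of the kind of machine-level reliability-quality interaction the study describes, the sketch below couples a machine's failure hazard to its effective age and lets the defect probability rise with degradation. This is not the authors' model: the Weibull hazard form, the linear defect-rate coupling, the imperfect-maintenance restoration factor, and every parameter value are hypothetical.

```python
# Minimal illustrative sketch of a machine-level reliability-quality
# interaction; NOT the paper's model. The Weibull hazard, the linear
# defect-rate coupling, and all parameter values are hypothetical.

class Machine:
    def __init__(self, shape=2.0, scale=100.0, defect_base=0.01, defect_gain=0.2):
        self.shape = shape            # Weibull shape parameter (beta > 1: wear-out)
        self.scale = scale            # Weibull scale parameter (eta)
        self.defect_base = defect_base
        self.defect_gain = defect_gain
        self.age = 0.0                # effective operating age

    def hazard(self):
        # Weibull hazard rate: h(t) = (beta / eta) * (t / eta)^(beta - 1)
        return (self.shape / self.scale) * (self.age / self.scale) ** (self.shape - 1)

    def defect_prob(self):
        # Quality-reliability interaction: defect rate grows with degradation.
        return min(1.0, self.defect_base + self.defect_gain * self.age / self.scale)

    def run(self, dt=1.0):
        self.age += dt                # machine degrades while producing

    def preventive_maintenance(self, restoration=0.8):
        # Imperfect PM restores a fraction of the effective age.
        self.age *= 1.0 - restoration
```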
Most existing studies on the joint optimization of manufacturing systems (MS) focus on small-scale systems with simple structures, such as single-machine, simple serial, or parallel MS. Moreover, the traditional algorithms used for small-scale MS are often inadequate for large-scale dynamic MS with complex structures, such as manufacturing networks. Therefore, exploiting the effectiveness of reinforcement learning on infinite-horizon Markov Decision Processes (MDPs), this paper formulates a joint optimization problem of preventive maintenance and work-in-process quality inspection for manufacturing networks with reliability-quality interactions. First, dynamic reliability and quality models are proposed at the machine level to capture the complex interactions in manufacturing networks. Second, building on the MDP-based optimization model, the proposed Deep Deterministic Policy Gradient (DDPG) algorithm realizes optimal reliability-quality joint control in manufacturing networks. In addition, it offers a novel mixed action space combining discrete maintenance and continuous quality inspection, which satisfies the action diversity found in actual production. Finally, training and experiments show that the proposed algorithm adapts to diverse manufacturing scenarios better than traditional ones. It is also shown that more frequent state observations cannot help the reinforcement learning model learn a better control policy, owing to information redundancy.
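For concreteness, a minimal sketch of how a DDPG-style actor could emit such a mixed action is shown below. This is an assumption-laden illustration, not the paper's architecture: the layer sizes, the sigmoid inspection head, and the thresholding of the maintenance logit are hypothetical choices for handling a discrete-continuous action space.

```python
import torch
import torch.nn as nn

# Hypothetical DDPG-style actor for a mixed action space; not the paper's
# network. One head outputs a continuous work-in-process inspection
# proportion per machine; the other outputs a logit that is thresholded
# into a binary preventive-maintenance decision.

class MixedActor(nn.Module):
    def __init__(self, state_dim, n_machines, hidden=128):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.inspect_head = nn.Linear(hidden, n_machines)  # continuous head
        self.maint_head = nn.Linear(hidden, n_machines)    # discrete head

    def forward(self, state):
        h = self.body(state)
        inspect_ratio = torch.sigmoid(self.inspect_head(h))  # in [0, 1]
        maint_prob = torch.sigmoid(self.maint_head(h))       # soft decision for training
        maint_action = (maint_prob > 0.5).float()            # hard decision at execution
        return inspect_ratio, maint_action

# Usage with made-up dimensions:
# actor = MixedActor(state_dim=20, n_machines=5)
# inspect, maintain = actor(torch.randn(1, 20))
```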
