4.7 Article

Dynamic selective maintenance optimization for multi-state systems over a finite horizon: A deep reinforcement learning approach

Journal

EUROPEAN JOURNAL OF OPERATIONAL RESEARCH
Volume 283, Issue 1, Pages 166-181

Publisher

ELSEVIER
DOI: 10.1016/j.ejor.2019.10.049

Keywords

Maintenance; Dynamic selective maintenance; Deep reinforcement learning; Imperfect maintenance; Multi-state system

Funding

  1. National Natural Science Foundation of China [71771039, 71922006]

Selective maintenance, which aims to choose a subset of feasible maintenance actions to perform on a repairable system with limited maintenance resources, has been extensively studied over the past decade. Most reported works on selective maintenance have been dedicated to maximizing the success of a single future mission. Cases of multiple consecutive missions, which are often encountered in engineering practice, have rarely been investigated to date. In this paper, a new selective maintenance optimization model is developed for multi-state systems that execute multiple consecutive missions over a finite horizon. The selective maintenance strategy can be dynamically optimized to maximize the expected number of future mission successes whenever the states and effective ages of the components become known at the end of the most recently completed mission. The dynamic optimization problem, which accounts for imperfect maintenance, is formulated as a discrete-time finite-horizon Markov decision process with a mixed integer-discrete-continuous state space. Based on the framework of actor-critic algorithms, a customized deep reinforcement learning method is put forth to overcome the curse of dimensionality and cope with the uncountable state space. In the proposed method, a postprocessing step is developed for the actor to search for the optimal maintenance actions in a large-scale discrete action space, whereas experience replay and a target network are used to facilitate agent training. The performance of the proposed method is examined through an illustrative example and an engineering example of a coal transportation system. © 2019 Elsevier B.V. All rights reserved.
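
The abstract names experience replay and a target network as the two techniques that stabilize agent training. The sketch below illustrates both ideas in generic form; it is not the authors' implementation. The class name `ReplayBuffer`, the function `soft_update`, the blending factor `tau`, and the use of a soft (blended) update rather than a periodic hard copy are all assumptions made for illustration.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state, done) tuples.

    Hypothetical helper: the paper's actual buffer design is not described
    in the abstract.
    """

    def __init__(self, capacity, seed=None):
        # deque with maxlen silently discards the oldest transition when full
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks the temporal
        # correlation between consecutive missions in a trajectory.
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def soft_update(target_params, online_params, tau=0.01):
    """Return target parameters moved a fraction tau toward the online ones.

    Assumption: parameters are flat lists of floats; a real network would
    apply the same rule to each weight tensor.
    """
    return [(1.0 - tau) * t + tau * o
            for t, o in zip(target_params, online_params)]
```

In a training loop, each observed transition would be pushed to the buffer, a mini-batch would be sampled to update the actor and critic, and `soft_update` would then be applied so the target network trails the online network slowly.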
