4.7 Article

A dynamic mission abort policy for the swarm executing missions and its solution method by tailored deep reinforcement learning

Journal

RELIABILITY ENGINEERING & SYSTEM SAFETY
Volume 234

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.ress.2023.109149

Keywords

Mission abort; Survivability; Swarm mission reliability; Markov decision process; Deep reinforcement learning


Summary

Mission abort is an effective action to prevent catastrophic accidents and improve the survivability of safety-critical systems. Existing research mainly focuses on single-equipment abort policies and neglects the swarm level. The complexity of swarm operations and the curse of dimensionality call for a dynamic mission abort policy covering both the equipment and swarm levels. A deep reinforcement learning approach with an action-mask method is proposed to optimize the policy and overcome the curse of dimensionality.
Abstract

Mission abort is an effective action to avoid catastrophic accidents and enhance the survivability of safety-critical systems such as unmanned aerial vehicle swarms and submarine swarms. Existing research mainly focuses on the abort policy of single equipment and lacks consideration of the swarm. Furthermore, the mission abort of equipment in the swarm exhibits operation dependence, and the state space of a swarm is far larger than that of single equipment, which leads to a curse of dimensionality. To solve these problems, a dynamic mission abort policy is developed for the swarm, specifying abort policies at both the equipment level and the swarm level. First, considering the degradation level and operating state of the equipment as well as the time elapsed in the mission, a dynamic mission abort policy is proposed for the swarm with changing states. Next, the mission abort problem is formulated as a Markov decision process to maximize the expected cumulative reward of the swarm. Then, to overcome the curse of dimensionality, a deep reinforcement learning approach is tailored to optimize the proposed policy, where an action mask method is adopted to filter out infeasible actions. Finally, a case study is presented to illustrate the superiority of the proposed approach.
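
The action-mask idea mentioned at the end of the abstract can be illustrated with a minimal, self-contained sketch. This is not the authors' implementation: the action set, the feasibility rule, and the function names below (get_feasible_mask, masked_policy) are hypothetical, and the sketch only shows the common pattern of setting the logits of infeasible actions to negative infinity before the softmax so that the policy assigns them zero probability.

# Illustrative sketch (not the paper's code): filter infeasible abort actions
# by masking their logits before the softmax. All names and the toy
# feasibility rule below are hypothetical placeholders.

import numpy as np

N_ACTIONS = 3  # hypothetical: 0 = continue, 1 = abort one unit, 2 = abort the swarm


def get_feasible_mask(state):
    # Hypothetical feasibility rule: if no unit is still operating,
    # aborting a single unit (action 1) is infeasible.
    mask = np.ones(N_ACTIONS, dtype=bool)
    if state.get("active_units", 0) == 0:
        mask[1] = False
    return mask


def masked_policy(logits, mask):
    # Set infeasible logits to -inf, then apply a numerically stable softmax,
    # so masked actions receive exactly zero probability.
    masked_logits = np.where(mask, logits, -np.inf)
    exp = np.exp(masked_logits - masked_logits[mask].max())
    return exp / exp.sum()


if __name__ == "__main__":
    logits = np.array([0.2, 1.5, -0.3])            # raw network outputs (toy values)
    mask = get_feasible_mask({"active_units": 0})  # action 1 is infeasible here
    probs = masked_policy(logits, mask)
    print(probs)                                   # masked action has probability 0

In a full deep reinforcement learning setup, the mask would be derived from the current swarm state (for example, units that have already aborted or failed) and applied at both action selection and training time, which is consistent with the role the abstract assigns to the action-mask method.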
