A policy-based Monte Carlo tree search method for container pre-marshalling

INTERNATIONAL JOURNAL OF PRODUCTION RESEARCH (2023)

Journal

INTERNATIONAL JOURNAL OF PRODUCTION RESEARCH

Publisher

TAYLOR & FRANCIS LTD

DOI: 10.1080/00207543.2023.2279130

Keywords

Container pre-marshalling problem; Monte Carlo tree search; Markov decision process; Q-learning algorithm; Automated container terminal

Automated Summary

This paper proposes an improved policy-based Monte Carlo tree search (P-MCTS) algorithm to solve the container pre-marshalling problem (CPMP). The CPMP is formulated as a Markov decision process (MDP) model to capture the sequential nature of the problem. The P-MCTS algorithm uses eight composite reshuffling rules and modified upper confidence bounds in the selection phase, and a well-designed heuristic algorithm in the simulation phase. Experimental results show that the P-MCTS outperforms all compared methods both in scenarios where all containers have different priorities and in scenarios where containers can share the same priority.

Abstract

The container pre-marshalling problem (CPMP) aims to minimise the number of reshuffling moves needed to reach an optimised stacking arrangement in each bay, based on container priorities, during the non-loading phase. Given the sequential nature of these decisions, we formulate the CPMP as a Markov decision process (MDP) model that captures the specific states and actions of the reshuffling process. To address the challenge that a relocated container may trigger a chain effect on subsequent reshuffling moves, this paper develops an improved policy-based Monte Carlo tree search (P-MCTS) to solve the CPMP, in which eight composite reshuffling rules and modified upper confidence bounds are employed in the selection phase, and a well-designed heuristic algorithm is utilised in the simulation phase. In addition, given the effectiveness of reinforcement learning methods for solving MDP models, an improved Q-learning algorithm is proposed as a comparison method. Numerical results show that the P-MCTS outperforms all compared methods both in scenarios where all containers have different priorities and in scenarios where containers can share the same priority.
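
To make the MDP view of the CPMP concrete, the following is a minimal Python sketch under stated assumptions; it is not the paper's model or its P-MCTS. It assumes a bay is a tuple of stacks listed bottom to top, that a smaller number means the container is retrieved earlier, and that an action moves the top container of one stack onto another. The names `legal_moves`, `apply_move`, `misplaced` and `ucb1` are hypothetical, and `ucb1` is the standard UCB1 score, shown only as a reference point because the listing does not describe the paper's modified upper confidence bounds.

```python
import math
from typing import List, Tuple

Bay = Tuple[Tuple[int, ...], ...]  # state: each stack listed bottom -> top
Move = Tuple[int, int]             # action: (source stack, destination stack)


def legal_moves(bay: Bay, max_height: int) -> List[Move]:
    """All moves of a top container onto a different, non-full stack."""
    return [
        (s, d)
        for s, src in enumerate(bay) if src
        for d, dst in enumerate(bay) if d != s and len(dst) < max_height
    ]


def apply_move(bay: Bay, move: Move) -> Bay:
    """Transition function: return the bay after one reshuffling move."""
    s, d = move
    stacks = [list(stack) for stack in bay]
    stacks[d].append(stacks[s].pop())
    return tuple(tuple(stack) for stack in stacks)


def misplaced(bay: Bay) -> int:
    """Count badly placed containers (assumed convention: smaller = earlier).

    A container is badly placed if it sits directly on one that must leave
    earlier, or on top of any badly placed container. Equal priorities
    stacked on each other are allowed.
    """
    count = 0
    for stack in bay:
        ok = True
        for below, above in zip(stack, stack[1:]):
            if ok and above > below:
                ok = False  # first violation; everything above is blocked
            if not ok:
                count += 1
    return count


def ucb1(mean_value: float, visits: int, parent_visits: int,
         c: float = math.sqrt(2)) -> float:
    """Standard UCB1 score (not the paper's modified bound)."""
    if visits == 0:
        return math.inf  # unvisited children are tried first
    return mean_value + c * math.sqrt(math.log(parent_visits) / visits)


bay = ((3, 1, 2), (), ())
after = apply_move(bay, (0, 1))
print(misplaced(bay), misplaced(after))  # prints "1 0"
```

In this sketch an episode ends when `misplaced` returns 0, and charging a cost of one per move would make the return equal to the negative number of reshuffling moves, which matches the stated objective of minimising moves.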