Reward shaping to improve the performance of deep reinforcement learning in perishable inventory management

Article Computer Science, Interdisciplinary Applications

Use of Proximal Policy Optimization for the Joint Replenishment Problem

Nathalie Vanvuchelen et al.

COMPUTERS IN INDUSTRY (2020)

Article Multidisciplinary Sciences

Mastering Atari, Go, chess and shogi by planning with a learned model

Julian Schrittwieser et al.

NATURE (2020)

Article Engineering, Industrial

Improved ordering of perishables: The value of stock-age information

Rene Haijema et al.

INTERNATIONAL JOURNAL OF PRODUCTION ECONOMICS (2019)

Article Multidisciplinary Sciences

A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play

David Silver et al.

SCIENCE (2018)

Article Multidisciplinary Sciences

Mastering the game of Go without human knowledge

David Silver et al.

NATURE (2017)

Article Multidisciplinary Sciences

Mastering the game of Go with deep neural networks and tree search

David Silver et al.

NATURE (2016)

Article Multidisciplinary Sciences

Human-level control through deep reinforcement learning

Volodymyr Mnih et al.

NATURE (2015)

Article Engineering, Industrial

A new class of stock-level dependent ordering policies for perishables with a short maximum shelf life

Rene Haijema

INTERNATIONAL JOURNAL OF PRODUCTION ECONOMICS (2013)

Article Engineering, Industrial

A new age-based replenishment policy for supply chain inventory optimization of highly perishable products

Qinglin Duan et al.

INTERNATIONAL JOURNAL OF PRODUCTION ECONOMICS (2013)

Article Mathematics, Interdisciplinary Applications

An empirical study of potential-based reward shaping and advice in complex, multi-agent systems

Sam Devlin et al.

ADVANCES IN COMPLEX SYSTEMS (2011)

Article Computer Science, Interdisciplinary Applications

A heuristic to manage perishable inventory with batch ordering, positive lead-times, and time-varying demand

Rob A. C. M. Broekmeulen et al.

COMPUTERS & OPERATIONS RESEARCH (2009)

Article Computer Science, Interdisciplinary Applications

Blood platelet production: Optimization by dynamic programming and simulation

Rene Haijema et al.

COMPUTERS & OPERATIONS RESEARCH (2007)
