Journal
OPERATIONS RESEARCH
Volume -, Issue -, Pages -
Publisher
INFORMS
DOI: 10.1287/opre.2022.2305
Keywords
Markov decision process; average cost; optimality inequality; partial observations; lost sales
This paper investigates a partially observable lost-sales inventory system and proves the existence of a stationary optimal policy for average cost minimization using the vanishing discount factor approach. The key contribution of this study is a method to verify the uniform boundedness of the relative discounted value function, a crucial condition in the vanishing discount factor approach. Additionally, a valid policy is constructed to "copy" the actions of another policy for a process with a different initial state.
We consider a partially observable lost-sales inventory system, in which the inventory level is observed only when it reaches zero. We use the vanishing discount factor approach to prove the existence of a stationary optimal policy for the average cost minimization. As our main methodological contribution, we provide a way to verify the key condition of the vanishing discount factor approach: the uniform boundedness of the relative discounted value function. To accomplish that, we construct a valid policy, which, in a certain sense, "copies" the actions of another policy for the process with a different initial state. To the best of our knowledge, this paper is the first one on partially observable inventory models under the average cost criterion.
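For orientation, the vanishing discount factor approach named in the abstract can be sketched in its generic textbook form. The notation below (state x, action a, one-step cost c, transition kernel q, discount factor β) is assumed for illustration and is not taken from the paper; in a partially observable model, x would typically be a belief state.

```latex
% Discounted value function for discount factor \beta \in (0,1)
\[
  v_\beta(x) = \inf_{\pi} \, \mathbb{E}_x^{\pi} \sum_{t=0}^{\infty} \beta^{t} c(x_t, a_t)
\]
% Relative discounted value function, measured from its infimum over states
\[
  m_\beta = \inf_{x} v_\beta(x), \qquad h_\beta(x) = v_\beta(x) - m_\beta \ge 0
\]
% Key condition: boundedness that is uniform in the discount factor
\[
  \sup_{\beta \in [\beta_0, 1)} h_\beta(x) < \infty \quad \text{for each state } x
\]
% Letting \beta \uparrow 1 then yields an average-cost optimality inequality
\[
  w + h(x) \ge \min_{a} \Big\{ c(x,a) + \int h(y) \, q(dy \mid x, a) \Big\}
\]
```

Here w plays the role of the optimal average cost and h is a limit of the functions h_β as β approaches 1. Under suitable regularity assumptions, a stationary policy that attains the minimum on the right-hand side is average-cost optimal, which is why the boundedness condition is the step that has to be verified.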