Article

Markov decision processes with delays and asynchronous cost collection

IEEE TRANSACTIONS ON AUTOMATIC CONTROL (2003)

Journal

IEEE TRANSACTIONS ON AUTOMATIC CONTROL

Volume 48, Issue 4, Pages 568-574

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

DOI: 10.1109/TAC.2003.809799

Keywords

asynchrony; delays; Markov decision processes (MDP's); neuro-dynamic programming

Categories

Automation & Control Systems; Engineering, Electrical & Electronic

Abstract

Markov decision processes (MDPs) may involve three types of delays. First, state information, rather than being available instantaneously, may arrive with a delay (observation delay). Second, an action may take effect at a later decision stage rather than immediately (action delay). Third, the cost induced by an action may be collected after a number of stages (cost delay). We derive two results, one for constant and one for random delays, for reducing an MDP with delays to an MDP without delays that differs only in the size of the state space. The results are based on the intuition that costs may be collected asynchronously, i.e., at a stage other than the one in which they are induced, as long as they are discounted properly.
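
To make the reduction idea concrete, the following is a minimal sketch of one common state-augmentation construction for a constant action delay, and of how a constant cost delay interacts with discounting. It is an illustration under stated assumptions, not the construction or the proofs from the paper: the function names `augment_action_delay` and `discounted_total`, the toy two-state MDP, and the choice to store the `k` pending actions in the state are all hypothetical.

```python
import math
from itertools import product


def augment_action_delay(states, actions, transition, k):
    """Build a delay-free MDP for a constant action delay of k stages.

    Assumed construction: the augmented state is (s, pending), where
    `pending` holds the k actions already chosen but not yet applied.
    transition[s][a] maps each next state to its probability.
    """
    aug_states = [(s, pending)
                  for s in states
                  for pending in product(actions, repeat=k)]
    aug_transition = {}
    for s, pending in aug_states:
        aug_transition[(s, pending)] = {}
        for a in actions:
            # The oldest pending action takes effect now;
            # the newly chosen action joins the end of the queue.
            queue = pending + (a,)
            effective, new_pending = queue[0], queue[1:]
            aug_transition[(s, pending)][a] = {
                (s_next, new_pending): p
                for s_next, p in transition[s][effective].items()
            }
    return aug_states, aug_transition


def discounted_total(costs, gamma, delay=0):
    """Total discounted cost when the cost induced at stage t
    is collected at stage t + delay."""
    return sum(gamma ** (t + delay) * c for t, c in enumerate(costs))


# Hypothetical toy MDP with deterministic transitions.
states = ["s0", "s1"]
actions = ["stay", "move"]
transition = {
    "s0": {"stay": {"s0": 1.0}, "move": {"s1": 1.0}},
    "s1": {"stay": {"s1": 1.0}, "move": {"s0": 1.0}},
}

aug_states, aug_transition = augment_action_delay(states, actions, transition, k=2)
print(len(aug_states))  # |S| * |A|**k = 2 * 2**2 = 8

# A constant cost delay only rescales the total by gamma**delay,
# so the ranking of policies is unchanged.
costs = [1.0, 2.0, 3.0]
late = discounted_total(costs, 0.9, delay=2)
on_time = discounted_total(costs, 0.9)
print(math.isclose(late, 0.9 ** 2 * on_time))  # True
```

In this sketch the augmented MDP has the same actions and dynamics as the original and only a larger state space, which matches the flavor of the result stated in the abstract. The observation-delay and random-delay cases are not covered here.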
