Article

A Unified Approach to Dynamic Decision Problems With Asymmetric Information: Nonstrategic Agents

Journal

IEEE TRANSACTIONS ON AUTOMATIC CONTROL
Volume 67, Issue 3, Pages 1105-1119

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TAC.2021.3060835

Keywords

Asymmetric information; decision making; dynamic programming; multiagent systems; stochastic systems

Funding

  1. NSF [CNS-1238962, CCF-1111061]
  2. ARO-MURI [W911NF-13-1-0421]
  3. ARO [W911NF-17-1-0232]

We study a class of dynamic multiagent decision problems with asymmetric information and nonstrategic agents, including the signaling phenomenon. By introducing the notion of sufficient information, we propose an information state for each agent that is sufficient for decision-making purposes. We generalize the policy-independence property of belief in partially observed Markov decision processes to dynamic multiagent decision problems.
We study a general class of dynamic multiagent decision problems with asymmetric information and nonstrategic agents, which includes dynamic teams as a special case. When agents are nonstrategic, an agent's strategy is known to the other agents. Nevertheless, the agents' strategy choices and beliefs are interdependent over time, a phenomenon known as signaling. We introduce the notion of sufficient information that effectively compresses the agents' information in a mutually consistent manner. Based on the notion of sufficient information, we propose an information state for each agent that is sufficient for decision-making purposes. We present instances of dynamic multiagent decision problems where we can determine an information state with a time-invariant domain for each agent. Furthermore, we present a generalization of the policy-independence property of belief in partially observed Markov decision processes (POMDP) to dynamic multiagent decision problems. Within the context of dynamic teams with asymmetric information, the proposed set of information states leads to a sequential decomposition that decouples the interdependence between the agents' strategies and beliefs over time and enables us to formulate a dynamic program to determine a globally optimal policy via backward induction.
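For readers unfamiliar with the policy-independence property mentioned above, the following is a minimal sketch of the classical single-agent POMDP case only, not the paper's multiagent construction. It assumes a finite POMDP with known transition and observation matrices; the function names, the array layout of `T` and `O`, and the toy numbers are all hypothetical and chosen purely for illustration. The point it shows is that the belief is computed from the realized actions and observations alone, so the rule that selected those actions never enters the update. In the multiagent setting the abstract describes, beliefs and strategies are interdependent through signaling, which is why a generalization of this property is needed.

```python
import numpy as np


def belief_update(belief, action, observation, T, O):
    """One Bayes-filter step for a finite POMDP (illustrative layout).

    T[a][s, s2] = P(next state s2 | state s, action a)
    O[a][s2, o] = P(observation o | next state s2, action a)
    """
    # Prediction step: push the current belief through the transition model.
    predicted = belief @ T[action]
    # Correction step: weight by the likelihood of the received observation.
    unnormalized = predicted * O[action][:, observation]
    total = unnormalized.sum()
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return unnormalized / total


def belief_from_history(prior, history, T, O):
    """Fold the update over a list of (action, observation) pairs.

    Only the realized actions and observations are used; the policy that
    produced the actions does not appear anywhere in the computation.
    """
    b = np.asarray(prior, dtype=float)
    for action, observation in history:
        b = belief_update(b, action, observation, T, O)
    return b


# Toy model (made-up numbers): 2 states, 2 actions, 2 observations.
T = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.5, 0.5]]])
O = np.array([[[0.8, 0.2], [0.3, 0.7]],
              [[0.6, 0.4], [0.4, 0.6]]])
prior = np.array([0.5, 0.5])
history = [(0, 1), (1, 0)]

posterior = belief_from_history(prior, history, T, O)
print(posterior)
```

Two different policies that happen to generate the same `history` would yield the same `posterior` here, since `belief_from_history` takes no policy argument at all.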
