Journal
IEEE TRANSACTIONS ON SIGNAL PROCESSING
Volume 61, Issue 7, Pages 1848-1862
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TSP.2013.2241057
Keywords
Collaborative network processing; consensus plus innovations; distributed Q-learning; mixed time-scale dynamics; multi-agent stochastic control; reinforcement learning
Funding
- National Science Foundation [CCF-1011903, DMS-1118605, 1018509]
- Air Force Office of Scientific Research [FA-95501010291]
The paper develops QD-learning, a distributed version of reinforcement Q-learning, for multi-agent Markov decision processes (MDPs) in which the agents have no prior information on the global state transition statistics or on the local agent cost statistics. The network agents minimize a network-averaged infinite-horizon discounted cost by local processing and by collaborating through mutual information exchange over a sparse (possibly stochastic) communication network. The agents respond differently (depending on their instantaneous one-stage random costs) to a global controlled state and to the control actions of a remote controller. When each agent is aware only of its local online cost data and the interagent communication network is weakly connected, we prove that QD-learning, a consensus + innovations algorithm with mixed time-scale stochastic dynamics, converges asymptotically almost surely to the desired value function and to the optimal stationary control policy at each network agent.
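The abstract describes each agent's update as a combination of a consensus term (pulling its Q-table toward those of its neighbors) and an innovation term (a local Q-learning correction from its own observed one-stage cost), with the two step sizes decaying at different rates. A minimal sketch of that update on a toy MDP is given below; it is not the authors' code, and all concrete names and parameters (`n_states`, `n_actions`, `alpha`, `beta`, `gamma`, the complete-graph topology, the synthetic costs) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem sizes and discount factor (illustrative assumptions).
n_agents, n_states, n_actions = 3, 4, 2
gamma = 0.9
# Complete communication graph for the demo; the paper only requires
# weak connectivity of the (possibly stochastic) network.
adjacency = np.ones((n_agents, n_agents)) - np.eye(n_agents)

# One Q-table per agent, maintained and updated locally.
Q = np.zeros((n_agents, n_states, n_actions))

def qd_step(Q, x, u, x_next, costs, alpha, beta):
    """One consensus + innovations update at the visited pair (x, u).

    costs[n] is agent n's observed one-stage random cost; alpha and beta
    are the innovation and consensus step sizes, whose different decay
    rates produce the mixed time-scale dynamics the paper analyzes.
    """
    Q_new = Q.copy()
    for n in range(n_agents):
        # Consensus: disagreement with neighbors' Q-values.
        consensus = sum(adjacency[n, l] * (Q[n, x, u] - Q[l, x, u])
                        for l in range(n_agents))
        # Innovation: local temporal-difference correction.
        innovation = costs[n] + gamma * Q[n, x_next].min() - Q[n, x, u]
        Q_new[n, x, u] = Q[n, x, u] - beta * consensus + alpha * innovation
    return Q_new

# Run a few thousand steps on a toy MDP with uniform random transitions.
x = 0
for t in range(2000):
    u = int(rng.integers(n_actions))
    x_next = int(rng.integers(n_states))
    # Heterogeneous local random costs across agents.
    costs = rng.random(n_agents) + 0.1 * np.arange(n_agents)
    # Step sizes decay at different rates: consensus decays more slowly,
    # so agreement asymptotically dominates the local innovations.
    alpha, beta = 1.0 / (t + 2), 1.0 / (t + 2) ** 0.6
    Q = qd_step(Q, x, u, x_next, costs, alpha, beta)
    x = x_next

# The agents' Q-tables should have drawn close to one another (consensus).
spread = float(np.abs(Q - Q.mean(axis=0)).max())
print(spread < 0.5)
```

Because the agents only ever exchange Q-values with neighbors, the same loop works on any weakly connected `adjacency` matrix; the complete graph is used here purely to keep the demo short.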