Article

A Spiking Neural Network Model of an Actor-Critic Learning Agent

Journal

NEURAL COMPUTATION
Volume 21, Issue 2, Pages 301-339

Publisher

MIT PRESS
DOI: 10.1162/neco.2008.08-07-593

Funding

  1. DIP [F1.2]
  2. BMBF [01GQ0420]
  3. EU [15879]

Abstract

The ability to adapt behavior to maximize reward through interactions with the environment is crucial for the survival of any higher organism. In the framework of reinforcement learning, temporal-difference learning algorithms provide an effective strategy for such goal-directed adaptation, but it is unclear to what extent these algorithms are compatible with neural computation. In this article, we present a spiking neural network model that implements actor-critic temporal-difference learning by combining local plasticity rules with a global reward signal. The network is capable of solving a nontrivial gridworld task with sparse rewards. We derive a quantitative mapping of plasticity parameters and synaptic weights to the corresponding variables in the standard algorithmic formulation and demonstrate that the network learns at a speed similar to its discrete-time counterpart and attains the same equilibrium performance.
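For readers unfamiliar with the "standard algorithmic formulation" the abstract refers to, the following is a minimal tabular actor-critic sketch on a small gridworld with a single rewarded goal state. It is not the authors' spiking implementation — the grid size, learning rates (`alpha` for the critic, `beta` for the actor), discount `gamma`, and softmax policy are illustrative assumptions. The key shared structure is that a single TD error drives both the critic's value update and the actor's preference update:

```python
import numpy as np

def run_actor_critic(size=5, goal=(4, 4), episodes=300, alpha=0.1, beta=0.1,
                     gamma=0.95, seed=0):
    """Tabular actor-critic TD learning on a size x size gridworld.

    Reward is sparse: 1.0 on reaching `goal`, 0.0 elsewhere. Returns the
    learned state values and the mean episode length over the final 50
    episodes (shorter episodes indicate a learned policy).
    """
    rng = np.random.default_rng(seed)
    n = size * size
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    V = np.zeros(n)            # critic: one value estimate per state
    prefs = np.zeros((n, 4))   # actor: action preferences per state
    lengths = []
    for _ in range(episodes):
        r, c = 0, 0            # start in the opposite corner from the goal
        steps = 0
        while (r, c) != goal and steps < 500:
            s = r * size + c
            # softmax policy over the actor's action preferences
            p = np.exp(prefs[s] - prefs[s].max())
            p /= p.sum()
            a = rng.choice(4, p=p)
            dr, dc = moves[a]
            r2 = min(max(r + dr, 0), size - 1)   # walls clamp movement
            c2 = min(max(c + dc, 0), size - 1)
            s2 = r2 * size + c2
            done = (r2, c2) == goal
            reward = 1.0 if done else 0.0
            # TD error: the same scalar updates both critic and actor
            delta = reward + (0.0 if done else gamma * V[s2]) - V[s]
            V[s] += alpha * delta
            prefs[s, a] += beta * delta
            r, c = r2, c2
            steps += 1
        lengths.append(steps)
    return V, float(np.mean(lengths[-50:]))
```

The paper's contribution is mapping the quantities above — the value estimates, the preferences, and especially the global TD error `delta` — onto synaptic weights, local plasticity rules, and a diffuse reward signal in a network of spiking neurons.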
