4.8 Article

A distributional code for value in dopamine-based reinforcement learning

Journal

NATURE
Volume 577, Issue 7792, Pages 671-675

Publisher

NATURE PORTFOLIO
DOI: 10.1038/s41586-019-1924-6

Funding

  1. NIGMS NIH HHS [T32 GM007753] Funding Source: Medline
  2. NINDS NIH HHS [R01 NS116753, R01 NS108740] Funding Source: Medline

Abstract

Since its introduction, the reward prediction error theory of dopamine has explained a wealth of empirical phenomena, providing a unifying framework for understanding the representation of reward and value in the brain(1-3). According to the now canonical theory, reward predictions are represented as a single scalar quantity, which supports learning about the expectation, or mean, of stochastic outcomes. Here we propose an account of dopamine-based reinforcement learning inspired by recent artificial intelligence research on distributional reinforcement learning(4-6). We hypothesized that the brain represents possible future rewards not as a single mean, but instead as a probability distribution, effectively representing multiple future outcomes simultaneously and in parallel. This idea implies a set of empirical predictions, which we tested using single-unit recordings from the mouse ventral tegmental area. Our findings provide strong evidence for a neural realization of distributional reinforcement learning.

