Article

Dopamine reward prediction errors reflect hidden-state inference across time

Journal

NATURE NEUROSCIENCE
Volume 20, Issue 4, Pages 581+

Publisher

NATURE PUBLISHING GROUP
DOI: 10.1038/nn.4520

Keywords

-

Funding

  1. National Science Foundation, Directorate for Computer & Information Science & Engineering, Division of Information & Intelligent Systems, CRCNS grant [1207833]
  2. US National Institutes of Health [R01MH095953, R01MH101207, T32MH020017, T32GM007753]
  3. Harvard Brain Science Initiative Seed grant (N.U.); Harvard Mind Brain and Behavior faculty
  4. Fondation pour la Recherche Médicale [SPE20150331860]

Abstract

Midbrain dopamine neurons signal reward prediction error (RPE), or actual minus expected reward. The temporal difference (TD) learning model has been a cornerstone in understanding how dopamine RPEs could drive associative learning. Classically, TD learning imparts value to features that serially track elapsed time relative to observable stimuli. In the real world, however, sensory stimuli provide ambiguous information about the hidden state of the environment, leading to the proposal that TD learning might instead compute a value signal based on an inferred distribution of hidden states (a 'belief state'). Here we asked whether dopaminergic signaling supports a TD learning framework that operates over hidden states. We found that dopamine signaling showed a notable difference between two tasks that differed only with respect to whether reward was delivered in a deterministic manner. Our results favor an associative learning rule that combines cached values with hidden-state inference.
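The belief-state TD framework summarized above can be sketched in a few lines. This is an illustrative toy model, not the paper's task or fitted model: the two hidden states, the transition matrix `T`, the observation likelihoods `O`, and the learning parameters are all invented for illustration. The agent maintains a Bayesian posterior over hidden states (the belief state), represents value as a linear function of that belief, and updates the weights with a TD error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy world: two hidden states, "pre-reward" (0) and "post-reward" (1).
T = np.array([[0.9, 0.1],    # P(next state | current state)
              [0.2, 0.8]])
# Observation likelihoods P(obs | state): cues only partially reveal the state.
O = np.array([[0.8, 0.2],
              [0.3, 0.7]])
reward = np.array([0.0, 1.0])    # reward is delivered in state 1

gamma, alpha = 0.95, 0.1
w = np.zeros(2)                  # linear value weights over the belief state

def update_belief(b, obs):
    """Bayesian filter: propagate the belief through T, reweight by P(obs | state)."""
    prior = T.T @ b
    post = O[:, obs] * prior
    return post / post.sum()

b = np.array([1.0, 0.0])         # start certain of being in state 0
s = 0
for t in range(5000):
    s = rng.choice(2, p=T[s])            # hidden state evolves (unobserved)
    obs = rng.choice(2, p=O[s])          # ambiguous observation
    b_next = update_belief(b, obs)
    r = reward[s]
    delta = r + gamma * (w @ b_next) - (w @ b)   # TD error on belief states
    w += alpha * delta * b               # learn values of belief-state features
    b = b_next

print(w.round(2))  # the rewarded state should acquire the higher value weight
```

The key point the sketch makes concrete is that the TD error is computed on the inferred belief, not on the observable stimulus itself, so ambiguous observations shift value estimates in proportion to the posterior probability of each hidden state.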

