Article

Short-term memory traces for action bias in human reinforcement learning

Journal

BRAIN RESEARCH
Volume 1153, Issue -, Pages 111-121

Publisher

ELSEVIER SCIENCE BV
DOI: 10.1016/j.brainres.2007.03.057

Keywords

reinforcement learning; eligibility traces; dopamine

Funding

  1. NIDA NIH HHS [DA-11723] Funding Source: Medline
  2. NIMH NIH HHS [P50 MH62196, F32 MH072141] Funding Source: Medline
  3. NINDS NIH HHS [NS-045790] Funding Source: Medline
  4. Engineering and Physical Sciences Research Council [EP/C516303/1] Funding Source: researchfish

Recent experimental and theoretical work on reinforcement learning has shed light on the neural bases of learning from rewards and punishments. One fundamental problem in reinforcement learning is the credit assignment problem, or how to properly assign credit to actions that lead to reward or punishment following a delay. Temporal difference learning solves this problem, but its efficiency can be significantly improved by the addition of eligibility traces (ETs). In essence, ETs function as decaying memories of previous choices that are used to scale synaptic weight changes. It has been shown in theoretical studies that ETs spanning a number of actions may improve the performance of reinforcement learning. However, it remains an open question whether including ETs that persist over sequences of actions allows reinforcement learning models to better fit empirical data regarding the behaviors of humans and other animals. Here, we report an experiment in which human subjects performed a sequential economic decision game in which the long-term optimal strategy differed from the strategy that leads to the greatest short-term return. We demonstrate that human subjects' performance in the task is significantly affected by the time between choices in a surprising and seemingly counterintuitive way. However, this behavior is naturally explained by a temporal difference learning model which includes ETs persisting across actions. Furthermore, we review recent findings that suggest that short-term synaptic plasticity in dopamine neurons may provide a realistic biophysical mechanism for producing ETs that persist on a timescale consistent with behavioral observations. (c) 2007 Elsevier B.V. All rights reserved.
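
To make the idea of ETs as "decaying memories of previous choices" concrete, here is a minimal Python sketch of a temporal difference update with eligibility traces over actions. It is an illustration under stated assumptions, not the model fitted in the paper: the single-state setting, the accumulating-trace rule, the function name `et_update`, and the parameters `alpha`, `gamma`, and `decay` are all hypothetical choices made for this example.

```python
def et_update(q, traces, action, reward, alpha=0.1, gamma=0.9, decay=0.5):
    """One TD update with eligibility traces over actions (illustrative only).

    q      -- list of action values, one per action (single-state assumption)
    traces -- list of eligibility traces, one per action
    action -- index of the action just chosen
    reward -- reward received after that action
    alpha  -- learning rate (assumed name)
    gamma  -- discount factor (assumed name)
    decay  -- per-choice trace decay factor (assumed name)
    """
    # Fade the memory of all earlier choices, then mark the current one.
    traces = [decay * e for e in traces]
    traces[action] += 1.0

    # Temporal difference error for the chosen action.
    delta = reward + gamma * max(q) - q[action]

    # Every action is updated in proportion to its trace, so choices made
    # a few steps ago still receive part of the credit (or blame).
    q = [qi + alpha * delta * ei for qi, ei in zip(q, traces)]
    return q, traces


if __name__ == "__main__":
    q, traces = [0.0, 0.0], [0.0, 0.0]
    # Choose action 0 with no reward, then action 1 with reward 1.
    q, traces = et_update(q, traces, action=0, reward=0.0,
                          alpha=0.5, gamma=0.0, decay=0.5)
    q, traces = et_update(q, traces, action=1, reward=1.0,
                          alpha=0.5, gamma=0.0, decay=0.5)
    print(q)  # [0.25, 0.5]
```

In this toy run, action 0 was never directly rewarded, yet its value rises to 0.25 because its trace had decayed only to 0.5 when the reward arrived. With `decay=0.0` the trace would vanish between choices and only action 1 would be credited, which is the contrast between traces that persist across actions and traces that do not.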
