Article

Learning from delayed feedback: neural responses in temporal credit assignment

Publisher

SPRINGER
DOI: 10.3758/s13415-011-0027-0

Keywords

Actor/critic; Credit assignment; Eligibility traces; Event-related potentials; Q-learning; SARSA; Temporal difference learning

Funding

  1. NIH [MH 19983]
  2. NIMH [MH068243]
  3. [T32GM081760]

When feedback follows a sequence of decisions, relationships between actions and outcomes can be difficult to learn. We used event-related potentials (ERPs) to understand how people overcome this temporal credit assignment problem. Participants performed a sequential decision task that required two decisions on each trial. The first decision led to an intermediate state that was predictive of the trial outcome, and the second decision was followed by positive or negative trial feedback. The feedback-related negativity (fERN), a component thought to reflect reward prediction error, followed negative feedback and negative intermediate states. This suggests that participants evaluated intermediate states in terms of expected future reward, and that these evaluations supported learning of earlier actions within sequences. We examine the predictions of several temporal-difference models to determine whether the behavioral and ERP results reflected a reinforcement-learning process.
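To make the idea of a reward prediction error at an intermediate state concrete, here is a minimal SARSA sketch with eligibility traces on a toy two-decision task. It is an illustration only, not the models or the task used in the study: the transition and reward probabilities, the parameter values, and the names `P_GOOD`, `P_WIN`, `choose`, and `simulate` are all assumptions made for this example.

```python
import random

# Hypothetical two-step task (illustrative values, not the study's design).
P_GOOD = {0: 0.8, 1: 0.2}  # P(intermediate state is "good" | first action)
P_WIN = {("good", 0): 0.8, ("good", 1): 0.6,
         ("bad", 0): 0.4, ("bad", 1): 0.2}  # P(positive feedback | state, action)


def choose(q, state, epsilon, rng):
    """Epsilon-greedy choice between actions 0 and 1 (ties broken at random)."""
    if rng.random() < epsilon or q[(state, 0)] == q[(state, 1)]:
        return rng.randrange(2)
    return 0 if q[(state, 0)] > q[(state, 1)] else 1


def simulate(n_trials=5000, alpha=0.05, gamma=1.0, lam=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in ("start", "good", "bad") for a in (0, 1)}
    deltas = {"good": [], "bad": []}

    for trial in range(n_trials):
        trace = {key: 0.0 for key in q}  # eligibility traces reset each trial

        a1 = choose(q, "start", epsilon, rng)
        s2 = "good" if rng.random() < P_GOOD[a1] else "bad"
        a2 = choose(q, s2, epsilon, rng)

        # Prediction error at the intermediate state: no reward has arrived,
        # so the error is the change in expected future reward.
        delta1 = gamma * q[(s2, a2)] - q[("start", a1)]
        trace[("start", a1)] += 1.0
        for key in q:
            q[key] += alpha * delta1 * trace[key]
            trace[key] *= gamma * lam  # decay traces before the next step

        # Prediction error at trial feedback (+1 or -1); the trial then ends.
        reward = 1.0 if rng.random() < P_WIN[(s2, a2)] else -1.0
        delta2 = reward - q[(s2, a2)]
        trace[(s2, a2)] += 1.0
        for key in q:
            # The decayed trace lets feedback also update the first decision.
            q[key] += alpha * delta2 * trace[key]

        if trial >= n_trials // 2:  # record only after initial learning
            deltas[s2].append(delta1)

    mean_delta = {s: sum(v) / len(v) for s, v in deltas.items()}
    return q, mean_delta
```

Calling `q, mean_delta = simulate()` returns the learned action values and the average intermediate-state prediction error for each state. Under these assumed probabilities, entering the "bad" state should yield a negative average prediction error, which is the kind of signal the abstract associates with the fERN after negative intermediate states.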
