Article

Brain mechanism of reward prediction under predictable and unpredictable environmental dynamics

Journal

NEURAL NETWORKS
Volume 19, Issue 8, Pages 1233-1241

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.neunet.2006.05.039

Keywords

reinforcement learning model; Markov decision problem; fMRI; human; Bayesian estimation

In learning goal-directed behaviors, an agent has to consider not only the reward given at each state but also the consequences of dynamic state transitions associated with action selection. To understand brain mechanisms for action learning under predictable and unpredictable environmental dynamics, we measured brain activities by functional magnetic resonance imaging (fMRI) during a Markov decision task with predictable and unpredictable state transitions. Whereas the striatum and orbitofrontal cortex (OFC) were significantly activated both under predictable and unpredictable state transition rules, the dorsolateral prefrontal cortex (DLPFC) was more strongly activated under predictable than under unpredictable state transition rules. We then modelled subjects' choice behaviours using a reinforcement learning model and a Bayesian estimation framework and found that the subjects took larger temporal discount factors under predictable state transition rules. Model-based analysis of fMRI data revealed different engagement of striatum in reward prediction under different state transition dynamics. The ventral striatum was involved in reward prediction under both unpredictable and predictable state transition rules, although the dorsal striatum was dominantly involved in reward prediction under predictable rules. These results suggest different learning systems in the cortico-striatum loops depending on the dynamics of the environment: the OFC-ventral striatum loop is involved in action learning based on the present state, while the DLPFC-dorsal striatum loop is involved in action learning based on predictable future states. (c) 2006 Elsevier Ltd. All rights reserved.
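The role of the temporal discount factor mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' model or their Bayesian estimation procedure; the function names, reward sequences, and discount values below are hypothetical and chosen only to show how a larger discount factor shifts preference toward delayed rewards.

```python
def discounted_return(rewards, gamma):
    """Sum of rewards, each weighted by gamma**t, where t is the delay in steps."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    return sum(r * gamma**t for t, r in enumerate(rewards))


# Hypothetical reward sequences (not taken from the paper):
# a small immediate reward versus a small loss followed by a larger delayed reward.
immediate = [1.0, 0.0, 0.0]
delayed = [-1.0, 0.0, 4.0]


def preferred(gamma):
    """Return the label of the sequence with the higher discounted return."""
    v_immediate = discounted_return(immediate, gamma)
    v_delayed = discounted_return(delayed, gamma)
    return "delayed" if v_delayed > v_immediate else "immediate"


if __name__ == "__main__":
    # Illustrative discount factors only; the paper's estimates are not reproduced here.
    for gamma in (0.3, 0.9):
        print(gamma, preferred(gamma))
```

With a small discount factor the agent in this sketch values the immediate reward more, while a larger discount factor makes the delayed outcome worth the initial loss. This mirrors, in simplified form, the abstract's point that subjects used larger discount factors when state transitions were predictable.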
