Human Reinforcement Learning Subdivides Structured Action Spaces by Learning Effector-Specific Values

JOURNAL OF NEUROSCIENCE (2009)

Journal

JOURNAL OF NEUROSCIENCE

Volume 29, Issue 43, Pages 13524-13531

Publisher

SOC NEUROSCIENCE

DOI: 10.1523/JNEUROSCI.2469-09.2009

Category

Neurosciences

Funding

National Institute of Mental Health [R01-MH087882]
Burroughs-Wellcome Fund
Sloan Foundation
McKnight Endowment Fund for Neuroscience
New York State Foundation for Science, Technology, and Innovation

Abstract

Humans and animals are endowed with a large number of effectors. Although this enables great behavioral flexibility, it presents an equally formidable reinforcement learning problem of discovering which actions are most valuable because of the high dimensionality of the action space. An unresolved question is how neural systems for reinforcement learning (such as prediction error signals for action valuation associated with dopamine and the striatum) can cope with this curse of dimensionality. We propose a reinforcement learning framework that allows for learned action valuations to be decomposed into effector-specific components when appropriate to a task, and test it by studying to what extent human behavior and blood oxygen level-dependent (BOLD) activity can exploit such a decomposition in a multieffector choice task. Subjects made simultaneous decisions with their left and right hands and received separate reward feedback for each hand movement. We found that choice behavior was better described by a learning model that decomposed the values of bimanual movements into separate values for each effector, rather than a traditional model that treated the bimanual actions as unitary with a single value. A decomposition of value into effector-specific components was also observed in value-related BOLD signaling, in the form of lateralized biases in striatal correlates of prediction error and anticipatory value correlates in the intraparietal sulcus. These results suggest that the human brain can use decomposed value representations to divide and conquer reinforcement learning over high-dimensional action spaces.
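The contrast the abstract draws between a unitary and a decomposed value model can be illustrated with a minimal Python sketch. This is not the authors' fitted model: the delta-rule update, the learning rate `ALPHA`, the number of options per hand `N_OPTIONS`, and the choice to train the unitary model on the summed reward are all assumptions made here for illustration, and every function name is hypothetical.

```python
from itertools import product

N_OPTIONS = 2  # assumption: number of choice options available to each hand
ALPHA = 0.3    # assumption: learning rate, chosen arbitrarily


def update_unitary(q, action_pair, reward_left, reward_right, alpha=ALPHA):
    """Unitary model: one value per bimanual action pair.

    Assumption: the pair's value is trained on the summed reward of both hands.
    """
    delta = (reward_left + reward_right) - q[action_pair]  # single prediction error
    q[action_pair] += alpha * delta
    return delta


def update_decomposed(q_left, q_right, action_pair,
                      reward_left, reward_right, alpha=ALPHA):
    """Decomposed model: separate values and prediction errors per effector."""
    left, right = action_pair
    delta_left = reward_left - q_left[left]      # left-hand prediction error
    delta_right = reward_right - q_right[right]  # right-hand prediction error
    q_left[left] += alpha * delta_left
    q_right[right] += alpha * delta_right
    return delta_left, delta_right


def pair_value_decomposed(q_left, q_right, action_pair):
    """Value of a bimanual action as the sum of its effector-specific values."""
    left, right = action_pair
    return q_left[left] + q_right[right]


# One trial: left hand picks option 0 (rewarded), right hand picks option 1 (not rewarded).
pairs = list(product(range(N_OPTIONS), repeat=2))
q_unitary = {pair: 0.0 for pair in pairs}  # N_OPTIONS ** 2 values to learn
q_left = [0.0] * N_OPTIONS                 # N_OPTIONS values per hand,
q_right = [0.0] * N_OPTIONS                # 2 * N_OPTIONS values in total

update_unitary(q_unitary, (0, 1), reward_left=1.0, reward_right=0.0)
update_decomposed(q_left, q_right, (0, 1), reward_left=1.0, reward_right=0.0)

# The untried pair (0, 0) shares the left-hand action with the tried pair (0, 1).
untried_unitary = q_unitary[(0, 0)]
untried_decomposed = pair_value_decomposed(q_left, q_right, (0, 0))
```

Under these assumptions, the unitary table learns nothing about the untried pair `(0, 0)`, while the decomposed model already credits it with the left hand's learned value. The number of values to learn also grows as `N_OPTIONS ** 2` for the unitary table but only as `2 * N_OPTIONS` for the decomposed one, which is the sense in which decomposition eases the curse of dimensionality.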