4.5 Review

Habits, action sequences and reinforcement learning

Journal

EUROPEAN JOURNAL OF NEUROSCIENCE
Volume 35, Issue 7, Pages 1036-1051

Publisher

WILEY-BLACKWELL
DOI: 10.1111/j.1460-9568.2012.08050.x

Keywords

action sequence; goal-directed action; habitual action; reinforcement learning

Funding

  1. Australian Research Council [FL0992409]
  2. National Health & Medical Research Council [633267]
  3. National Institute of Mental Health [MH56446]

It is now widely accepted that instrumental actions can be either goal-directed or habitual; whereas the former are rapidly acquired and regulated by their outcome, the latter are reflexive, elicited by antecedent stimuli rather than their consequences. Model-based reinforcement learning (RL) provides an elegant description of goal-directed action. Through exposure to states, actions and rewards, the agent rapidly constructs a model of the world and can choose an appropriate action based on quite abstract changes in environmental and evaluative demands. This model is powerful but has a problem explaining the development of habitual actions. To account for habits, theorists have argued that another action controller is required, called model-free RL, that does not form a model of the world but rather caches action values within states, allowing a state to select an action based on its reward history rather than its consequences. Nevertheless, there are persistent problems with important predictions from the model, most notably the failure of model-free RL to predict correctly the insensitivity of habitual actions to changes in the action-reward contingency. Here, we suggest that introducing model-free RL in instrumental conditioning is unnecessary, and demonstrate that reconceptualizing habits as action sequences allows model-based RL to be applied to both goal-directed and habitual actions in a manner consistent with what real animals do. This approach has significant implications for the way habits are currently investigated and generates new experimental predictions.
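
The distinction the abstract draws between caching action values and consulting a model of outcomes can be made concrete with a toy example. The sketch below is illustrative only and is not the model proposed in the paper: the single-state task, the action names (`lever`, `chain`), the outcomes, the reward magnitudes and the learning rate are all assumptions chosen for the demonstration. It trains a model-free controller (cached values updated by a delta rule) and a model-based controller (a learned action-to-outcome map evaluated at choice time) on the same experience, then devalues one outcome without any further training.

```python
# Illustrative sketch only: task, names and numbers are assumptions, not from the paper.

ACTIONS = ("lever", "chain")
OUTCOME_OF = {"lever": "food", "chain": "sucrose"}  # hypothetical environment


def train(n_trials=200, alpha=0.1):
    """Train both controllers on identical experience in a single state."""
    outcome_value = {"food": 1.0, "sucrose": 0.5}  # current worth of each outcome
    q_cached = {a: 0.0 for a in ACTIONS}           # model-free: cached action values
    model = {}                                     # model-based: action -> outcome
    for t in range(n_trials):
        action = ACTIONS[t % 2]                    # try both actions equally often
        outcome = OUTCOME_OF[action]
        reward = outcome_value[outcome]
        # Model-free update: move the cached value toward the received reward.
        q_cached[action] += alpha * (reward - q_cached[action])
        # Model-based update: remember which outcome the action produces.
        model[action] = outcome
    return q_cached, model, outcome_value


def model_free_choice(q_cached):
    """Pick the action with the highest cached value (reward history)."""
    return max(q_cached, key=q_cached.get)


def model_based_choice(model, outcome_value):
    """Pick the action whose predicted outcome is currently worth most."""
    return max(model, key=lambda a: outcome_value[model[a]])


q_cached, model, outcome_value = train()
before = (model_free_choice(q_cached), model_based_choice(model, outcome_value))

# Devalue food "offline": no further action-outcome experience is given.
outcome_value["food"] = 0.0
after = (model_free_choice(q_cached), model_based_choice(model, outcome_value))

print("before devaluation (model-free, model-based):", before)
print("after devaluation  (model-free, model-based):", after)
```

In this toy setting both controllers prefer `lever` before devaluation. Afterwards the model-based controller switches to `chain` immediately, while the model-free controller keeps choosing `lever` because its cached value has not been updated by new experience. This mirrors the outcome-sensitivity contrast described in the abstract; it does not address the action-reward contingency problem or the action-sequence account the paper goes on to propose.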
