Journal
NEURAL NETWORKS
Volume 20, Issue 6, Pages 668-675
Publisher
PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.neunet.2007.04.028
Keywords
dopamine; reinforcement learning; multiple model; timing prediction; classical conditioning
A number of computational models have explained the behavior of dopamine neurons in terms of temporal difference learning. However, earlier models cannot account for recent results of conditioning experiments; specifically, the behavior of dopamine neurons when the interval between a cue stimulus and a reward varies has not been satisfactorily accounted for. We address this problem by using a modular architecture, in which each module consists of a reward predictor and a value estimator. A responsibility signal, computed from the accuracy of the predictions of the reward predictors, is used to weight the contributions and learning of the value estimators. This multiple-model architecture gives an accurate account of the behavior of dopamine neurons in two specific experiments: when the reward is delivered earlier than expected, and when the stimulus-reward interval varies uniformly over a fixed range. (c) 2007 Elsevier Ltd. All rights reserved.
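The abstract's responsibility-weighted scheme can be illustrated with a minimal sketch. All class and variable names here are hypothetical (the paper itself is not quoted); the responsibility signal is assumed to be a softmax over each module's squared reward-prediction error, and each module's value table is updated in proportion to its responsibility.

```python
import numpy as np

class ModularTDLearner:
    """Hypothetical sketch of a multiple-model TD learner.

    Each module pairs a reward predictor with a value estimator.
    A responsibility signal, derived from each predictor's accuracy,
    weights both the combined value estimate and the learning updates.
    """

    def __init__(self, n_modules, n_states, alpha=0.1, gamma=0.95, sigma=1.0):
        self.predictions = np.zeros(n_modules)          # per-module reward predictions
        self.values = np.zeros((n_modules, n_states))   # per-module value tables
        self.alpha, self.gamma, self.sigma = alpha, gamma, sigma

    def responsibilities(self, reward):
        # Softmax over negative squared prediction errors:
        # modules that predict the reward more accurately get more weight.
        err = self.predictions - reward
        logits = -err ** 2 / (2 * self.sigma ** 2)
        w = np.exp(logits - logits.max())
        return w / w.sum()

    def step(self, s, s_next, reward):
        lam = self.responsibilities(reward)
        # Responsibility-weighted value estimates and TD error
        v = lam @ self.values[:, s]
        v_next = lam @ self.values[:, s_next]
        delta = reward + self.gamma * v_next - v
        # Each module learns in proportion to its responsibility
        self.values[:, s] += self.alpha * lam * delta
        self.predictions += self.alpha * lam * (reward - self.predictions)
        return delta
```

The softmax form is one common choice for turning prediction errors into normalized responsibilities; the paper's exact formulation may differ.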