4.2 Article

PVLV: The primary value and learned value Pavlovian learning algorithm

Journal

BEHAVIORAL NEUROSCIENCE
Volume 121, Issue 1, Pages 31-49

Publisher

AMER PSYCHOLOGICAL ASSOC
DOI: 10.1037/0735-7044.121.1.31

Keywords

basal ganglia; dopamine; reinforcement learning; Pavlovian conditioning; computational; modeling

Ask authors/readers for more resources

The authors present their primary value learned value (PVLV) model for understanding the reward-predictive firing properties of dopamine (DA) neurons as an alternative to the temporal-differences (TD) algorithm. PVLV is more directly related to underlying biology and is also more robust to variability in the environment. The primary value (PV) system controls performance and learning during primary rewards, whereas the learned value (LV) system learns about conditioned stimuli. The PV system is essentially the Rescorla-Wagner/delta-rule and comprises the neurons in the ventral striatum/nucleus accumbens that inhibit DA cells. The LV system comprises the neurons in the central nucleus of the amygdala that excite DA cells. The authors show that the PVLV model can account for critical aspects of the DA firing data, making a number of clear predictions about lesion effects, several of which are consistent with existing data. For example, first- and second-order conditioning can be anatomically dissociated, which is consistent with PVLV and not TD. Overall, the model provides a biologically plausible framework for understanding the neural basis of reward learning.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.2
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available