Integrating cortico-limbic-basal ganglia architectures for learning model-based and model-free navigation strategies

Journal

FRONTIERS IN BEHAVIORAL NEUROSCIENCE
Volume 6

Publisher

FRONTIERS MEDIA SA
DOI: 10.3389/fnbeh.2012.00079

Keywords

reinforcement learning; habit; stimulus-response; action-outcome; nucleus accumbens

Funding

  1. L'Agence Nationale de la Recherche [ANR-11-BSV4-006, ANR-2010-BLAN-0217-04]
  2. HABOT project of the Ville de Paris Emergence(s) program
  3. MRC
  4. European Community [FP6 IST 027819]
  5. MRC [MR/J008648/1] Funding Source: UKRI
  6. Medical Research Council [MR/J008648/1] Funding Source: researchfish

Behavior in spatial navigation is often organized into map-based (place-driven) vs. map-free (cue-driven) strategies; behavior in operant conditioning research is often organized into goal-directed vs. habitual strategies. Here we attempt to unify the two. We review one powerful theory for distinct forms of learning during instrumental conditioning, namely model-based (maintaining a representation of the world) and model-free (reacting to immediate stimuli) learning algorithms. We extend these lines of argument to propose an alternative taxonomy for spatial navigation, showing how various previously identified strategies can be distinguished as model-based or model-free depending on the usage of information and not on the type of information (e.g., cue vs. place). We argue that identifying model-free learning with dorsolateral striatum and model-based learning with dorsomedial striatum could reconcile numerous conflicting results in the spatial navigation literature. From this perspective, we further propose that the ventral striatum plays key roles in the model building process. We propose that the core of the ventral striatum is positioned to learn the probability of action selection for every transition between states of the world. We further review suggestions that the ventral striatal core and shell are positioned to act as critics contributing to the computation of a reward prediction error for model-free and model-based systems, respectively.
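
The distinction between model-free and model-based learning drawn in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the corridor world, the function names (`step`, `explore`, `model_free`, `model_based`), and the parameter values are all assumptions chosen for illustration. Both learners receive exactly the same experience, so they differ only in how the information is used, not in what type of information it is.

```python
import random

# Hypothetical toy world: a corridor of 4 states; action 0 = left, 1 = right.
N_STATES, ACTIONS, GOAL = 4, (0, 1), 3
GAMMA, ALPHA = 0.9, 0.5   # discount factor and learning rate (illustrative values)

def step(state, action):
    """Move along the corridor; reward 1.0 on reaching GOAL, else 0.0."""
    nxt = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    return nxt, (1.0 if nxt == GOAL else 0.0)

def explore(n_steps, seed=0):
    """Collect (s, a, r, s') transitions under a random policy."""
    rng = random.Random(seed)
    data, s = [], 0
    for _ in range(n_steps):
        a = rng.choice(ACTIONS)
        s2, r = step(s, a)
        data.append((s, a, r, s2))
        s = 0 if s2 == GOAL else s2   # restart the episode at the goal
    return data

def model_free(data):
    """Q-learning: cached action values updated by a reward prediction error."""
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for s, a, r, s2 in data:
        target = r if s2 == GOAL else r + GAMMA * max(q[s2])
        delta = target - q[s][a]      # reward prediction error
        q[s][a] += ALPHA * delta
    return q

def model_based(data, sweeps=50):
    """Store a transition/reward model, then plan over it by value iteration."""
    # The toy world is deterministic, so one entry per (state, action) suffices.
    model = {(s, a): (s2, r) for s, a, r, s2 in data}
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(sweeps):
        for (s, a), (s2, r) in model.items():
            q[s][a] = r if s2 == GOAL else r + GAMMA * max(q[s2])
    return q

if __name__ == "__main__":
    experience = explore(2000)
    print("model-free :", model_free(experience))
    print("model-based:", model_based(experience))
```

In this sketch the model-free learner keeps only cached values, so a change in reward location would have to be relearned through new prediction errors, whereas the model-based learner could re-plan over its stored transitions. That contrast is the property the abstract uses to separate the two classes of navigation strategy.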
