4.3 Article

A Dual Role Hypothesis of the Cortico-Basal-Ganglia Pathways: Opponency and Temporal Difference Through Dopamine and Adenosine

Journal

FRONTIERS IN NEURAL CIRCUITS
Volume 12

Publisher

FRONTIERS MEDIA SA
DOI: 10.3389/fncir.2018.00111

Keywords

reinforcement learning; reward prediction error; cost; basal ganglia; dopamine; adenosine

Funding

  1. Ministry of Education, Culture, Sports, Science and Technology in Japan [15H05876, 17H06311]
  2. Grants-in-Aid for Scientific Research [17H06311, 15H05876] Funding Source: KAKEN

The hypothesis that the basal-ganglia direct and indirect pathways represent the goodness (or benefit) and badness (or cost) of options, respectively, explains a wide range of phenomena. However, this hypothesis, named Opponent Actor Learning (OpAL), still has limitations. Structurally, the OpAL model does not differentiate the two types of cortical inputs to the basal-ganglia pathways, which arrive from intratelencephalic (IT) and pyramidal-tract (PT) neurons. Functionally, the OpAL model does not describe the temporal-difference (TD)-type reward prediction error (RPE), nor does it explain how RPE is calculated in the circuitry connecting to the dopamine (DA) neurons. There is a different hypothesis on the basal-ganglia pathways and DA, named the Cortico-Striatal-Temporal-Difference (CS-TD) model. The CS-TD model differentiates the IT and PT inputs, describes the TD-type RPE, and explains how TD-RPE is calculated. However, a critical difficulty in this model lies in its assumption that DA induces the same direction of plasticity in both the direct and indirect pathways, which apparently contradicts the experimentally observed opposite effects of DA on these pathways. Here, we propose a new hypothesis that integrates the OpAL and CS-TD models. Specifically, we propose that the IT-basal-ganglia pathways represent the goodness/badness of current options while the PT-indirect pathway represents the overall value of the previously chosen option, and that both influence the DA neurons through the basal-ganglia output, so that a variant of TD-RPE is calculated. A key assumption is that phasic activation of DA neurons induces opposite directions of plasticity in the IT-indirect pathway and the PT-indirect pathway because of the different profiles of IT and PT inputs.
Specifically, at PT -> indirect-pathway-medium-spiny-neuron (iMSN) synapses, sustained glutamatergic inputs generate abundant adenosine, which allosterically prevents DA-D2 receptor signaling and instead favors adenosine-A2A receptor signaling. Then, phasic DA-induced phasic adenosine, which reflects TD-RPE, causes long-term synaptic potentiation. In contrast, at IT -> iMSN synapses, where adenosine is scarce, phasic DA causes long-term synaptic depression via D2 receptor signaling. This new Opponency and Temporal-Difference (OTD) model provides unique predictions, some of which are potentially in line with recently reported activity patterns of neurons in the globus pallidus externus on the indirect pathway.
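
As a rough illustration of the quantities the abstract refers to, the Python sketch below computes the textbook TD-type RPE and applies opposite-signed changes to two synaptic weights when the RPE is positive. It is not the OTD model itself: the abstract states only that a variant of TD-RPE is calculated and does not give equations. The function names, the discount factor `gamma`, the learning rate `lr`, and the linear update rule are all assumptions made for illustration.

```python
def td_error(reward, value_next, value_current, gamma=0.9):
    """Textbook TD-type reward prediction error: r + gamma * V(s') - V(s).

    Assumption: this is the standard form, not the variant proposed in the paper.
    """
    return reward + gamma * value_next - value_current


def update_imsn_weights(w_it, w_pt, delta, lr=0.1):
    """Toy update for two hypothetical iMSN synaptic weights.

    Assumption: only positive RPE (phasic DA activation) drives plasticity here.
    IT -> iMSN is depressed (LTD) and PT -> iMSN is potentiated (LTP),
    mirroring the opposite directions described in the abstract.
    """
    phasic = max(delta, 0.0)          # phasic DA activation, taken as positive RPE
    w_it_new = w_it - lr * phasic     # depression at IT -> iMSN
    w_pt_new = w_pt + lr * phasic     # potentiation at PT -> iMSN
    return w_it_new, w_pt_new


if __name__ == "__main__":
    delta = td_error(reward=1.0, value_next=0.0, value_current=0.5)
    print(delta, update_imsn_weights(0.5, 0.5, delta))
```

The point of the sketch is only the sign structure: the same positive error signal moves the two weights in opposite directions.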
