☆ 4.5 Article

Reinforcement learning through modulation of spike-timing-dependent synaptic plasticity

NEURAL COMPUTATION (2007)

期刊

NEURAL COMPUTATION

卷 19, 期 6, 页码 1468-1502

出版社

MIT PRESS

DOI: 10.1162/neco.2007.19.6.1468

关键词

类别

Computer Science, Artificial Intelligence Neurosciences

向作者/读者索取更多资源

Protocol

社区支持

Reagent

社区支持

摘要

The persistent modification of synaptic efficacy as a function of the relative timing of pre- and postsynaptic spikes is a phenomenon known as 73 spike-timing-dependent plasticity (STDP). Here we show that the modulation of STDP by a global reward signal leads to reinforcement learning. We first derive analytically learning rules involving reward-modulated spike-timing-dependent synaptic and intrinsic plasticity, by applying a reinforcement learning algorithm to the stochastic spike response model of spiking neurons. These rules have several features common to plasticity mechanisms experimentally found in the brain. We then demonstrate in simulations of networks of integrate-and-fire neurons the efficacy of two simple learning rules involving modulated STDP. One rule is a direct extension of the standard STDP model (modulated STDP), and the other one involves an eligibility trace stored at each synapse that keeps a decaying, memory of the relationships between the recent pairs of preand postsynaptic spike pairs (modulated STDP with eligibility trace). This latter rule permits learning even if the reward signal is delayed. The proposed rules are able to solve the XOR problem with both ratecoded and temporally coded input and to learn a target output firing-rate pattern. These learning rules are biologically plausible, may be used for training generic artificial spiking neural networks, regardless of the neural model used, and suggest the experimental investigation in animals of the existence of reward-modulated STDP.

Reinforcement learning through modulation of spike-timing-dependent synaptic plasticity

期刊

NEURAL COMPUTATION

出版社

MIT PRESS

关键词

类别

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

Reinforcement learning through modulation of spike-timing-dependent synaptic plasticity

期刊

NEURAL COMPUTATION

出版社

MIT PRESS

关键词

类别

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

导出引文

分享论文