4.6 Review

A novel reduced parameter s-model of estimator learning automata in the switching non-stationary environment

Journal

NEURAL COMPUTING & APPLICATIONS
Volume 34, Issue 9, Pages 6811-6824

Publisher

SPRINGER LONDON LTD
DOI: 10.1007/s00521-021-06777-y

Keywords

Reinforcement learning; Learning automata; S-model; Switching non-stationary environment; Reduced parameter; Stochastic estimator

Funding

  1. Science Foundation of North China University of Technology [110051360002]
  2. Basic Scientific Research from Beijing Education Commission [110052972027]
  3. National Natural Science Foundation of China [61971283]


This paper introduces a scheme for determining the parameter search scope of SELA and then presents a series of parameter search methods, making SELA applicable to any environment with switching non-stationary characteristics. It further proposes a reduced parameter SELA, supported by a new two-dimensional parameter search method, to lower the tuning cost. Experimental simulations show that rpS-SELA outperforms the compared methods, with reduced tuning cost, lower time consumption, a higher accuracy rate, and a stronger ability to track environmental switches.
Learning automata (LA), a powerful tool for reinforcement learning in machine learning, can find their optimal state by continuously interacting with an external environment. Traditional LA algorithms, especially estimator LA algorithms, can generally be abstracted as P- or Q-models, which operate only in stationary environments. A more comprehensive setting is the S-model operating in a non-stationary environment. For this kind of LA, the most popular existing approach is the stochastic estimator LA (SELA). However, simultaneously handling the four parameters involved in SELA is intractable, because their values may vary dramatically across environments, which makes a parameter tuning strategy essential. In this paper, we first propose a scheme to determine the parameter search scope and then present a series of parameter search methods, including a four-dimensional method and a two-dimensional method, making SELA applicable to any environment with switching non-stationary characteristics. Furthermore, to decrease the tuning cost, we introduce a reduced parameter SELA supported by the new two-dimensional parameter search method. To remove the traditional restriction that the environmental reward probability must be symmetrically distributed, the S-model is constructed from a new perspective, yielding a novel reduced parameter S-model of SELA (rpS-SELA). A detailed mathematical proof establishes the absolute expediency of rpS-SELA. In addition, experimental simulations demonstrate that rpS-SELA outperforms the compared methods, with reduced tuning cost, lower time consumption, a higher accuracy rate, and, above all, a stronger ability to track environmental switches.
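
To make the setting concrete, the following is a minimal sketch of an S-model learning automaton interacting with a switching non-stationary environment. It is not SELA or rpS-SELA, whose update rules are not given here; it uses the textbook linear reward-inaction scheme adapted to a continuous response in [0, 1]. The environment (`switching_response`), the reward means, the learning rate, the switching period, and the probability floor are all hypothetical choices made for illustration.

```python
import random


def sl_ri_update(probs, chosen, beta, rate):
    """S-model linear reward-inaction step.

    beta in [0, 1] is the continuous environment response (1 = best).
    The chosen action gains probability in proportion to beta; the
    others shrink by the same total amount, so the sum stays 1.
    """
    updated = [p - rate * beta * p for p in probs]
    updated[chosen] = probs[chosen] + rate * beta * (1.0 - probs[chosen])
    return updated


def switching_response(means, t, period, action, rng):
    """Hypothetical environment: the mean rewards are reversed every `period` steps."""
    phase = (t // period) % 2
    current = means if phase == 0 else means[::-1]
    noisy = current[action] + rng.uniform(-0.1, 0.1)
    return min(1.0, max(0.0, noisy))  # clip the response to [0, 1]


def run(steps=4000, period=2000, rate=0.05, floor=0.01, seed=0):
    """Simulate the automaton and return the action-probability history."""
    rng = random.Random(seed)
    means = [0.8, 0.3]   # assumed mean responses, not taken from the paper
    probs = [0.5, 0.5]
    history = []
    for t in range(steps):
        action = rng.choices(list(range(len(probs))), weights=probs)[0]
        beta = switching_response(means, t, period, action, rng)
        probs = sl_ri_update(probs, action, beta, rate)
        # Assumed safeguard: a small floor keeps every action selectable,
        # so the automaton can still react after the environment switches.
        probs = [max(p, floor) for p in probs]
        total = sum(probs)
        probs = [p / total for p in probs]
        history.append(probs[:])
    return history


if __name__ == "__main__":
    hist = run()
    print("before switch:", hist[1999])
    print("after switch: ", hist[-1])
```

Even this simple scheme has two values to tune (the learning rate and the floor), and suitable values depend on how often and how sharply the environment switches. That is a small-scale version of the tuning problem the abstract describes for the four SELA parameters.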
