Proceedings Paper

On the vanishing and exploding gradient problem in Gated Recurrent Units

Journal

IFAC-PapersOnLine
Volume 53, Issue 2, Pages 1243-1248

Publisher

Elsevier
DOI: 10.1016/j.ifacol.2020.12.1342

Keywords

Nonlinear system identification; Recurrent Neural Networks; Gated Recurrent Units

Abstract

Recurrent Neural Networks are applied in areas such as speech recognition, natural language and video processing, and the identification of nonlinear state space models. Conventional Recurrent Neural Networks, e.g. the Elman network, are hard to train. A more recently developed class of recurrent neural networks, so-called Gated Units, outperforms its conventional counterparts on virtually every task. This paper aims to provide additional insights into the differences between RNNs and Gated Units in order to explain the superior performance of gated recurrent units. It is argued that Gated Units are easier to optimize not because they solve the vanishing gradient problem, but because they circumvent the emergence of large local gradients. Copyright (C) 2020 The Authors.
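
The paper's central claim, that gated units avoid large local gradients rather than solving the vanishing gradient problem, can be illustrated numerically. The following is a minimal sketch, not taken from the paper, assuming PyTorch is available: it backpropagates through a long random sequence with a vanilla Elman RNN and a GRU and prints how strongly the final output still depends on the first input, alongside the gradient norm of the recurrent weights. The layer sizes, sequence length, and seed are arbitrary choices.

    # Minimal sketch (assumption: PyTorch installed; sizes are arbitrary).
    # Compares gradient propagation through an Elman RNN and a GRU.
    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    seq_len, batch, n_in, n_hidden = 200, 4, 3, 16

    for name, cell in [("Elman RNN", nn.RNN(n_in, n_hidden)),
                       ("GRU", nn.GRU(n_in, n_hidden))]:
        x = torch.randn(seq_len, batch, n_in, requires_grad=True)
        out, _ = cell(x)
        out[-1].sum().backward()  # loss depends only on the last time step
        # Sensitivity of the last output to the very first input (long-range
        # gradient) and to the recurrent weights (local gradient magnitude).
        long_range = x.grad[0].norm().item()
        local = cell.weight_hh_l0.grad.norm().item()
        print(f"{name}: ||d out_T / d x_1|| = {long_range:.3e}, "
              f"||d out_T / d W_hh|| = {local:.3e}")

With default initializations the Elman network's long-range gradient typically decays by many orders of magnitude over 200 steps, while the GRU passes gradients with far less attenuation; the exact figures depend on the seed and dimensions.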
