Proceedings Paper

On the vanishing and exploding gradient problem in Gated Recurrent Units

Journal

IFAC-PapersOnLine
Volume 53, Issue 2, Pages 1243-1248

Publisher

Elsevier
DOI: 10.1016/j.ifacol.2020.12.1342

Keywords

Nonlinear system identification; Recurrent Neural Networks; Gated Recurrent Units

Abstract

Recurrent Neural Networks are applied in areas such as speech recognition, natural language and video processing, and the identification of nonlinear state space models. Conventional Recurrent Neural Networks, e.g. the Elman network, are hard to train. A more recently developed class of recurrent neural networks, so-called Gated Units, outperforms its conventional counterparts on virtually every task. This paper aims to provide additional insights into the differences between RNNs and Gated Units in order to explain the superior performance of gated recurrent units. It is argued that Gated Units are easier to optimize not because they solve the vanishing gradient problem, but because they circumvent the emergence of large local gradients. Copyright (C) 2020 The Authors.
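
The paper's central claim, that gated units avoid large local gradients rather than solving the vanishing gradient problem, can be illustrated numerically. The following is a minimal sketch, not taken from the paper, assuming PyTorch is available: it backpropagates through a long random sequence with a vanilla Elman RNN and a GRU and prints how strongly the final output still depends on the first input, alongside the gradient norm of the recurrent weights. The layer sizes, sequence length, and seed are arbitrary choices.

    # Minimal sketch (assumption: PyTorch installed; sizes are arbitrary).
    # Compares gradient propagation through an Elman RNN and a GRU.
    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    seq_len, batch, n_in, n_hidden = 200, 4, 3, 16

    for name, cell in [("Elman RNN", nn.RNN(n_in, n_hidden)),
                       ("GRU", nn.GRU(n_in, n_hidden))]:
        x = torch.randn(seq_len, batch, n_in, requires_grad=True)
        out, _ = cell(x)
        out[-1].sum().backward()  # loss depends only on the last time step
        # Sensitivity of the last output to the very first input (long-range
        # gradient) and to the recurrent weights (local gradient magnitude).
        long_range = x.grad[0].norm().item()
        local = cell.weight_hh_l0.grad.norm().item()
        print(f"{name}: ||d out_T / d x_1|| = {long_range:.3e}, "
              f"||d out_T / d W_hh|| = {local:.3e}")

With default initializations the Elman network's long-range gradient typically decays by many orders of magnitude over 200 steps, while the GRU passes gradients with far less attenuation; the exact figures depend on the seed and dimensions.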
