Article

A fractional gradient descent algorithm robust to the initial weights of multilayer perceptron

Journal

NEURAL NETWORKS
Volume 158, Pages 154-170

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.neunet.2022.11.018

Keywords

Initial weights; Multilayer perceptron (MLP); Robust; First derivative; Fractional calculus; Convergence

This paper proposes a fractional gradient descent (RFGD) algorithm that is strongly robust to the initial weights of a multilayer perceptron (MLP). Experimental results show that RFGD is more robust to the initial weights of the MLP than the other algorithms tested. The effectiveness and convergence of the algorithm are also analyzed.
For a multilayer perceptron (MLP), the initial weights significantly influence performance. Based on an enhanced fractional derivative extended from convex optimization, this paper proposes a fractional gradient descent (RFGD) algorithm that is robust to the initial weights of the MLP. We analyze both the effectiveness and the convergence of the RFGD algorithm. Its computational complexity is generally larger than that of the gradient descent (GD) algorithm but smaller than that of the Adam, Padam, AdaBelief, and AdaDiff algorithms. Numerical experiments show that the RFGD algorithm is strongly robust to the order of the fractional calculus, which is the only parameter added relative to the GD algorithm. More importantly, the experimental results show that, among the GD, Adam, Padam, AdaBelief, and AdaDiff algorithms, the RFGD algorithm is the most robust to the initial weights of the MLP. The experiments also verify the correctness of the theoretical analysis. (c) 2022 Elsevier Ltd. All rights reserved.
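
The abstract does not state the RFGD update rule, so the sketch below does not reproduce it. As an illustration only, it shows a generic Caputo-type fractional-order gradient step of the kind found in the broader fractional-gradient literature, where the fractional order is the single parameter added to plain GD. The function name `fractional_gd` and the parameters `lr`, `alpha`, `steps`, and `eps` are hypothetical and chosen for this example.

```python
import math


def fractional_gd(grad, w0, lr=0.1, alpha=0.9, steps=200, eps=1e-8):
    """Minimize a scalar function with a generic fractional-order step.

    ASSUMPTION: this is a common Caputo-type form from the literature,
    NOT the RFGD update of the paper (the abstract does not give it):

        w_{k+1} = w_k - lr * grad(w_k)
                        * (|w_k - w_{k-1}| + eps)**(1 - alpha) / Gamma(2 - alpha)

    alpha is the fractional order, 0 < alpha <= 1. With alpha = 1 the
    scaling factor equals 1 and the update is plain gradient descent.
    """
    w_prev = w0
    w = w0 - lr * grad(w0)  # one plain GD step to obtain two iterates
    for _ in range(steps):
        # Fractional scaling built from the distance between the last two iterates.
        scale = (abs(w - w_prev) + eps) ** (1.0 - alpha) / math.gamma(2.0 - alpha)
        w_prev, w = w, w - lr * grad(w) * scale
    return w


if __name__ == "__main__":
    # Toy objective f(w) = (w - 3)^2 with gradient 2 * (w - 3).
    toy_grad = lambda w: 2.0 * (w - 3.0)
    for order in (1.0, 0.9, 0.7):
        print(order, fractional_gd(toy_grad, w0=0.0, alpha=order))
```

In this generic form, the scaling factor shrinks as successive iterates get closer, so orders below 1 take smaller steps near a minimum than plain GD does. Whether and how RFGD modifies this behavior is not stated in the abstract.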
