Article

Robust Stochastic Gradient Descent With Student-t Distribution Based First-Order Momentum

Journal

IEEE Transactions on Neural Networks and Learning Systems

Publisher

IEEE (Institute of Electrical and Electronics Engineers), Inc.
DOI: 10.1109/TNNLS.2020.3041755

Keywords

Training; Stochastic processes; Robustness; Estimation; Noise measurement; Proposals; Neural networks; Deep neural networks; robust optimization; stochastic gradient descent (SGD); student-t distribution


This paper proposes a robust stochastic gradient optimization method based on the student-t distribution, in which robustness is built directly into the algorithm. When the method is integrated into several stochastic gradient algorithms, including the widely used optimizer Adam, the resulting algorithms outperform Adam and their original versions in robustness to noise across various tasks.
Remarkable achievements by deep neural networks rest on the development of excellent stochastic gradient descent methods. Deep-learning-based machine learning algorithms, however, must find patterns between observations and supervised signals, even though these may include noise that obscures the true relationship between them, a situation especially common in the robotics domain. To perform well even with such noise, we expect them to be able to detect outliers and discard them when needed. We therefore propose a new stochastic gradient optimization method whose robustness is built directly into the algorithm, using the robust student-t distribution as its core idea. We integrate our method into some of the latest stochastic gradient algorithms; in particular, Adam, the popular optimizer, is modified through our method. The resulting algorithm, called t-Adam, along with the other stochastic gradient methods integrated with our core idea, is shown to outperform Adam and their original versions in terms of robustness against noise on diverse tasks, ranging from regression and classification to reinforcement learning problems.
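The abstract describes the core idea only at a high level, so the NumPy sketch below illustrates one way a student-t-based first-order momentum can be grafted onto Adam: each incoming gradient receives a student-t weight that shrinks for outliers, and the momentum becomes a weighted incremental mean instead of a fixed exponential average. This is a minimal sketch, not the authors' published t-Adam update: the class name TAdamSketch, the hyperparameter defaults, and the decaying weight sum W are assumptions made for illustration; only the weighting form w = (nu + d) / (nu + D) follows the standard student-t mean-estimation rule.

```python
import numpy as np

class TAdamSketch:
    """Adam with a student-t-weighted first moment (illustrative sketch only)."""

    def __init__(self, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8, nu=1.0):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.nu = nu   # degrees of freedom: smaller -> stronger outlier rejection
        self.m = None  # robust first moment (student-t mean estimate)
        self.v = None  # second moment, kept exactly as in plain Adam
        self.W = 0.0   # accumulated sample weight of the t-mean estimate (assumed bookkeeping)
        self.t = 0     # step counter for the second-moment bias correction

    def step(self, params, grad):
        if self.m is None:
            self.m = np.zeros_like(grad)
            self.v = np.zeros_like(grad)
        self.t += 1
        d = grad.size
        # Squared distance of the new gradient from the current momentum,
        # scaled elementwise by the previous second-moment estimate.
        D = np.sum((grad - self.m) ** 2 / (self.v + self.eps))
        # Student-t weight: close to 1 for inlier gradients, small for outliers.
        w = (self.nu + d) / (self.nu + D)
        # Weighted incremental mean replaces Adam's m = beta1*m + (1-beta1)*grad;
        # with constant weights w == 1, the steady state is exactly Adam's momentum.
        self.m = (self.W * self.m + w * grad) / (self.W + w)
        self.W = (self.W + w) * self.beta1  # decaying weight sum (assumed form)
        # Second moment and parameter update exactly as in Adam.
        self.v = self.beta2 * self.v + (1.0 - self.beta2) * grad ** 2
        v_hat = self.v / (1.0 - self.beta2 ** self.t)
        return params - self.lr * self.m / (np.sqrt(v_hat) + self.eps)
```

Note that as nu grows, the weight w approaches 1 and the sketch degenerates to plain Adam, so the degrees-of-freedom parameter acts as a robustness knob; this is a property of the student-t weighting itself rather than a claim quoted from the paper.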

