Article

Analysis of biased stochastic gradient descent using sequential semidefinite programs

Journal

MATHEMATICAL PROGRAMMING
Volume 187, Issue 1-2, Pages 383-408

Publisher

SPRINGER HEIDELBERG
DOI: 10.1007/s10107-020-01486-1

Keywords

Biased stochastic gradient; Robustness to inexact gradient; Convergence rates; Convex optimization; First-order methods

Funding

  1. NSF, Directorate for Computer & Information Science & Engineering, Division of Computing and Communication Foundations [1656951, 1750162]
  2. NSF, Directorate for Engineering, Division of Civil, Mechanical, and Manufacturing Innovation [1254129]
  3. NASA Langley NRA Cooperative Agreement [NNX12AM55A, 69695]
  4. Wisconsin Institute for Discovery
  5. College of Engineering and Department of Electrical and Computer Engineering, University of Wisconsin-Madison


Abstract

We present a convergence rate analysis for biased stochastic gradient descent (SGD), where individual gradient updates are corrupted by computation errors. We develop stochastic quadratic constraints to formulate a small linear matrix inequality (LMI) whose feasible points lead to convergence bounds of biased SGD. Based on this LMI condition, we develop a sequential minimization approach to analyze the intricate trade-offs that couple stepsize selection, convergence rate, optimization accuracy, and robustness to gradient inaccuracy. We also provide feasible points for this LMI and obtain theoretical formulas that quantify the convergence properties of biased SGD under various assumptions on the loss functions.
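To make the error model in the abstract concrete, the sketch below runs SGD where each gradient evaluation is corrupted by a fixed bias and zero-mean noise. This is a minimal illustration of the setting, not the paper's code or its LMI analysis; the quadratic test function, stepsize, and error parameters are assumptions chosen for demonstration.

```python
import random

def biased_sgd(grad, x0, stepsize, bias, noise_std, iters, seed=0):
    """Scalar SGD where each gradient update is corrupted by a
    persistent bias plus zero-mean Gaussian noise (illustrative
    'computation error' model; not the paper's implementation)."""
    rng = random.Random(seed)
    x = x0
    for _ in range(iters):
        g = grad(x) + bias + rng.gauss(0.0, noise_std)  # corrupted gradient
        x = x - stepsize * g
    return x

# Strongly convex test problem f(x) = x^2 / 2, true minimizer x* = 0.
x_final = biased_sgd(grad=lambda x: x, x0=5.0, stepsize=0.1,
                     bias=0.2, noise_std=0.0, iters=200)
# With a persistent bias b, the iterates settle near -b rather than
# at x* = 0, so the achievable accuracy is limited by the bias.
```

With nonzero bias the iterates converge to a neighborhood offset from the true minimizer, which is one face of the accuracy-versus-robustness trade-off the paper's sequential semidefinite programs quantify; smaller stepsizes shrink the noise-induced variance but slow the convergence rate.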
