Article

On Distributed Nonconvex Optimization: Projected Subgradient Method for Weakly Convex Problems in Networks

Journal

IEEE TRANSACTIONS ON AUTOMATIC CONTROL
Volume 67, Issue 2, Pages 662-675

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TAC.2021.3056535

Keywords

Convergence; Optimization; Machine learning; Convex functions; Stochastic processes; Gradient methods; Training; Distributed optimization; multiagent systems; nonconvex optimization; projected stochastic subgradient method

Funding

  1. NSF [ECCS-1933878, AFOSR-15RT0767]

Abstract
The stochastic subgradient method is a widely used algorithm for solving large-scale optimization problems arising in machine learning. Often, these problems are neither smooth nor convex. Recently, Davis et al. (2018) characterized the convergence of the stochastic subgradient method for the weakly convex case, which encompasses many important applications (e.g., robust phase retrieval, blind deconvolution, biconvex compressive sensing, and dictionary learning). In practice, distributed implementations of the projected stochastic subgradient method (stoDPSM) are used to speed up risk minimization. In this article, we propose a distributed implementation of the stochastic subgradient method with a theoretical guarantee. Specifically, we show the global convergence of stoDPSM using the Moreau envelope stationarity measure. Furthermore, under a so-called sharpness condition, we show that the deterministic variant, DPSM, with a proper initialization converges linearly to the set of sharp minima, using a geometrically diminishing step size. We provide numerical experiments to support our theoretical analysis.
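To make the update structure concrete, the following is a minimal sketch of a distributed projected subgradient loop of the kind the abstract describes: each agent mixes its iterate with its neighbors through a doubly stochastic matrix, takes a local subgradient step, and projects back onto a constraint set, with a geometrically diminishing step size. This is an illustration, not the authors' implementation; the robust phase retrieval loss |(aᵢᵀx)² − bᵢ|, the ring mixing matrix, and all function names and parameters are assumptions chosen for the example.

```python
import numpy as np

def project_ball(x, radius=10.0):
    # Euclidean projection onto the ball {x : ||x|| <= radius}.
    nrm = np.linalg.norm(x)
    return x if nrm <= radius else (radius / nrm) * x

def dpsm(A, b, W, x0, step0=0.01, decay=0.95, iters=200):
    # Hypothetical DPSM-style loop on a weakly convex problem:
    # each agent i holds measurement (A[i], b[i]) and minimizes
    # the robust phase retrieval loss |(A[i] @ x)^2 - b[i]|.
    X = x0.copy()          # row i = agent i's iterate
    n = X.shape[0]
    for k in range(iters):
        gamma = step0 * decay**k          # geometrically diminishing step size
        X = W @ X                         # consensus averaging with mixing matrix W
        for i in range(n):
            r = float(A[i] @ X[i]) ** 2 - b[i]
            # A subgradient of x -> |(a^T x)^2 - b| at X[i].
            g = np.sign(r) * 2.0 * (A[i] @ X[i]) * A[i]
            X[i] = project_ball(X[i] - gamma * g)
    return X
```

A small ring network of four agents, each with one quadratic measurement of a common ground truth, can then be run by building `W = (I + P + Pᵀ)/3` from the cyclic shift `P` and calling `dpsm(A, b, W, x0)` from a near-optimal common initialization, mirroring the abstract's "proper initialization" assumption for linear convergence.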

