Article

DIFFERENCE-OF-CONVEX LEARNING: DIRECTIONAL STATIONARITY, OPTIMALITY, AND SPARSITY

Journal

SIAM JOURNAL ON OPTIMIZATION
Volume 27, Issue 3, Pages 1637-1665

Publisher

SIAM PUBLICATIONS
DOI: 10.1137/16M1084754

Keywords

statistical learning; difference-of-convex programs; directional stationary points; optimality; sparsity

Funding

  1. National Science Foundation [CMMI-1333902, IIS-1632971, DMS-1222507, DMS-1522383, IIS-1632935]
  2. Air Force Office of Scientific Research [FA9550-15-1-0126]
  3. Directorate for Mathematical & Physical Sciences [1522383] Funding Source: National Science Foundation

Abstract

This paper studies a fundamental bicriteria optimization problem for variable selection in statistical learning; the two criteria are a loss/residual function and a model control (also called regularization or penalty). The former function measures the fitness of the learning model to data, and the latter function is employed as a control of the complexity of the model. We focus on the case where the loss function is (strongly) convex and the model control function is a difference-of-convex (dc) sparsity measure. Our paper establishes some fundamental optimality and sparsity properties of directional stationary solutions to a nonconvex Lagrangian formulation of the bicriteria optimization problem, based on a specially structured dc representation of many well-known sparsity functions that can be profitably exploited in the analysis. We relate the Lagrangian optimization problem to the penalty constrained problem in terms of their respective d(irectional)-stationary solutions; this is in contrast to common analysis that pertains to the (global) minimizers of the problem, which are not computable due to nonconvexity. Most importantly, we provide sufficient conditions under which the d(irectional)-stationary solutions of the nonconvex Lagrangian formulation are global minimizers (possibly restricted due to nondifferentiability), thereby filling the gap between previous minimizer-based analysis and practical computational considerations. The established relation allows us to readily apply the derived results for the Lagrangian formulation to the penalty constrained formulation. Specializations of the conditions to exact and surrogate sparsity functions are discussed, yielding optimality and sparsity results for existing nonconvex formulations of the statistical learning problem.
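To make the Lagrangian formulation concrete, the sketch below solves a least-squares instance with a dc sparsity surrogate. It is an illustrative example only, not the paper's algorithm: it uses the capped-ℓ1 penalty with the standard dc split min(|t|, a) = |t| − max(|t| − a, 0) and the classical DC algorithm (linearize the concave part, solve the resulting convex lasso-type subproblem by proximal gradient). All names and parameter values here are choices for the demo, not taken from the paper.

```python
import numpy as np

def soft_threshold(v, tau):
    # Proximal operator of tau * |.| (applied componentwise).
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def dca_capped_l1(A, b, lam=0.5, a=1.0, outer=50, inner=200):
    """Illustrative DC algorithm for
        min_x 0.5*||Ax - b||^2 + lam * sum_i min(|x_i|, a),
    using the dc decomposition min(|t|, a) = |t| - max(|t| - a, 0)."""
    n = A.shape[1]
    x = np.zeros(n)
    L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the smooth gradient
    for _ in range(outer):
        # Subgradient of the concave part h(t) = max(|t| - a, 0) at current x.
        s = np.where(np.abs(x) > a, np.sign(x), 0.0)
        # Convex subproblem: 0.5*||Ax - b||^2 + lam*||x||_1 - lam*s^T x,
        # solved approximately by proximal gradient (ISTA) steps.
        for _ in range(inner):
            grad = A.T @ (A @ x - b) - lam * s
            x = soft_threshold(x - grad / L, lam / L)
    return x

# Small synthetic instance: 5-sparse ground truth, Gaussian design.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100))
x_true = np.zeros(100)
x_true[:5] = 3.0
b = A @ x_true + 0.01 * rng.standard_normal(40)

x_hat = dca_capped_l1(A, b)
print("nonzeros:", int(np.sum(np.abs(x_hat) > 1e-3)))
```

Each outer iteration produces a d-stationary candidate in the limit; the abstract's point is that, under suitable conditions, such directional stationary points of this nonconvex Lagrangian formulation are in fact (possibly restricted) global minimizers.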
