4.7 Article

Compressed Gradient Methods With Hessian-Aided Error Compensation

Journal

IEEE TRANSACTIONS ON SIGNAL PROCESSING
Volume 69, Pages 998-1011

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TSP.2020.3048229

Keywords

Signal processing algorithms; Optimization; Error compensation; Compressors; Approximation algorithms; Machine learning algorithms; Convergence; Quantization; gradient methods; stochastic gradient descent

Funding

  1. Wallenberg Artificial Intelligence, Autonomous Systems and Software Program (WASP) - Knut and Alice Wallenberg Foundation
  2. Swedish Research Council (Vetenskapsrådet) [2020-03607]


The paper addresses the shift in the operating regime of optimization algorithms brought about by big data, where communication rather than computation is the bottleneck, and studies how compensating compression errors improves solution accuracy. It shows that Hessian-aided error compensation avoids the accumulation of compression errors on quadratic problems, establishes strong convergence guarantees for stochastic gradient descent, and confirms these benefits in numerical experiments.
The emergence of big data has caused a dramatic shift in the operating regime for optimization algorithms. The performance bottleneck, which used to be computations, is now often communications. Several gradient compression techniques have been proposed to reduce the communication load at the price of a loss in solution accuracy. Recently, it has been shown how compression errors can be compensated for in the optimization algorithm to improve the solution accuracy. Even though convergence guarantees for error-compensated algorithms have been established, there is very limited theoretical support for quantifying the observed improvements in solution accuracy. In this paper, we show that Hessian-aided error compensation, unlike other existing schemes, avoids accumulation of compression errors on quadratic problems. We also present strong convergence guarantees of Hessian-based error compensation for stochastic gradient descent. Our numerical experiments highlight the benefits of Hessian-based error compensation, and demonstrate that similar convergence improvements are attained when only a diagonal Hessian approximation is used.
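To make the idea of error compensation concrete, the sketch below implements compressed gradient descent with error feedback on a small quadratic test problem, where the accumulated compression error is propagated through a diagonal-Hessian factor. This is a minimal illustration under stated assumptions, not the paper's exact algorithm: the top-k compressor, the quadratic test problem, the step size, and all function names are illustrative choices.

  import numpy as np

  # Sketch (illustrative, not the authors' exact scheme): compressed gradient
  # descent with error feedback on f(x) = 0.5 x^T A x - b^T x. The residual
  # compression error is carried forward through a diagonal-Hessian factor
  # (1 - gamma * diag(A)) instead of being kept unchanged.

  def top_k(v, k):
      """Keep the k largest-magnitude entries of v, zero out the rest."""
      out = np.zeros_like(v)
      idx = np.argsort(np.abs(v))[-k:]
      out[idx] = v[idx]
      return out

  def compressed_gd_hessian_ec(A, b, x0, gamma=0.05, k=2, iters=500):
      x, e = x0.copy(), np.zeros_like(x0)
      h_diag = np.diag(A)                  # diagonal Hessian of the quadratic
      for _ in range(iters):
          g = A @ x - b                    # full gradient
          c = top_k(g + e, k)              # compress the error-corrected gradient
          x = x - gamma * c                # step along the compressed direction
          # propagate the residual compression error through (1 - gamma * h_diag);
          # setting e = 0 here would recover plain, uncompensated compression
          e = (1.0 - gamma * h_diag) * (g + e - c)
      return x

  if __name__ == "__main__":
      rng = np.random.default_rng(0)
      A = np.diag(np.linspace(1.0, 10.0, 10))
      b = rng.standard_normal(10)
      x_star = np.linalg.solve(A, b)
      x_hat = compressed_gd_hessian_ec(A, b, np.zeros(10))
      print("distance to optimum:", np.linalg.norm(x_hat - x_star))

On this toy quadratic the error-compensated iterates approach the true minimizer despite the aggressive top-k compression, which mirrors the qualitative behavior the abstract describes; a diagonal Hessian approximation is used, as the abstract notes similar improvements are attained in that case.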
