Article

Distributed Sketching for Randomized Optimization: Exact Characterization, Concentration, and Lower Bounds

Journal

IEEE Transactions on Information Theory
Volume 69, Issue 6, Pages 3850-3879

Publisher

IEEE
DOI: 10.1109/TIT.2023.3247559

Keywords

Sketching; distributed optimization; randomized algorithms; convex optimization; regularized least squares; second order optimization; large scale problems; differential privacy

Abstract

We consider distributed optimization methods for problems where forming the Hessian is computationally challenging and communication is a significant bottleneck. We leverage randomized sketches for reducing the problem dimensions as well as preserving privacy and improving straggler resilience in asynchronous distributed systems. We derive novel approximation guarantees for classical sketching methods and establish tight concentration results that serve as both upper and lower bounds on the error. We then extend our analysis to the accuracy of parameter averaging for distributed sketches. Furthermore, we develop unbiased parameter averaging methods for randomized second order optimization in regularized problems that employ sketching of the Hessian. Existing works do not take the bias of the estimators into consideration, which limits their application to massively parallel computation. We provide closed-form formulas for regularization parameters and step sizes that provably minimize the bias for sketched Newton directions. Additionally, we demonstrate the implications of our theoretical findings via large scale experiments on a serverless cloud computing platform.
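To make the setup concrete, the Python sketch below illustrates the general idea of distributed sketched Newton steps with plain parameter averaging for regularized least squares. It is an illustrative toy example only, not the paper's algorithm: the Gaussian sketch, the sketch size m, the worker count, the unit step size, and the naive (biased) averaging are assumptions chosen for demonstration, whereas the paper derives closed-form regularization parameters and step sizes that remove the bias of the averaged direction.

# Illustrative sketch (assumptions noted above, not the authors' exact method):
# distributed Gaussian-sketched Newton steps with parameter averaging for
#   min_x 0.5*||A x - b||^2 + 0.5*lam*||x||^2
import numpy as np

rng = np.random.default_rng(0)

n, d = 2000, 50           # number of samples and dimension (assumed)
m, num_workers = 200, 10  # sketch size and number of workers (assumed)
lam = 1e-2                # ridge regularization parameter (assumed)

A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

def sketched_newton_direction(A, grad, lam, m, rng):
    """One worker: approximate Newton direction from a Gaussian sketch S A."""
    S = rng.standard_normal((m, A.shape[0])) / np.sqrt(m)  # sketch matrix
    SA = S @ A                                             # reduced data
    H_sketch = SA.T @ SA + lam * np.eye(A.shape[1])        # sketched Hessian
    return np.linalg.solve(H_sketch, grad)

x = np.zeros(d)
for it in range(10):
    grad = A.T @ (A @ x - b) + lam * x  # exact gradient (cheap to form)
    # Each worker uses an independent sketch; the master averages the directions.
    dirs = [sketched_newton_direction(A, grad, lam, m, rng)
            for _ in range(num_workers)]
    x -= np.mean(dirs, axis=0)          # unit step size (assumed)
    obj = 0.5 * np.sum((A @ x - b) ** 2) + 0.5 * lam * x @ x
    print(f"iter {it}: objective {obj:.4f}")

Because each worker's direction depends only on its own sketch, the averaged update also fits the asynchronous, straggler-prone serverless setting discussed in the paper: the master can average over whichever directions have arrived.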
