Article

Accelerating distributed machine learning with model compression and graph partition

Related references

Note: Only some of the references are listed.
Article Computer Science, Artificial Intelligence

Learned Gradient Compression for Distributed Deep Learning

Lusine Abrahamyan et al.

Summary: The study proposes a gradient compression method for distributed deep learning that improves compression efficiency by exploiting correlations between the gradients computed on different nodes. Experiments show substantial compression gains across different datasets and deep learning models (a baseline compressor is sketched after this entry).

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2022)
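To make the compression setting concrete, here is a minimal sketch of plain top-k gradient sparsification, the kind of hand-crafted baseline that learned compressors such as the one above aim to improve on. It is not the paper's method, and the function names are illustrative.

import numpy as np

def compress_topk(grad, ratio=0.01):
    # Keep only the largest-magnitude `ratio` fraction of gradient entries.
    flat = grad.ravel()
    k = max(1, int(flat.size * ratio))
    idx = np.argpartition(np.abs(flat), -k)[-k:]  # indices of the top-k entries
    return idx, flat[idx], grad.shape             # sparse representation to transmit

def decompress_topk(idx, values, shape):
    # Reconstruct a dense gradient with zeros outside the kept entries.
    flat = np.zeros(int(np.prod(shape)), dtype=values.dtype)
    flat[idx] = values
    return flat.reshape(shape)

# Each worker would send (idx, values) instead of the full gradient.
g = np.random.randn(4, 256).astype(np.float32)
idx, vals, shape = compress_topk(g, ratio=0.05)
g_hat = decompress_topk(idx, vals, shape)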

Article Computer Science, Theory & Methods

Fast shared-memory streaming multilevel graph partitioning

Nazanin Jafari et al.

Summary: This research evaluates the feasibility of using streaming graph partitioning algorithms within a multilevel framework, yielding a fast parallel offline multilevel partitioner. The results show that this approach is up to 5.1x faster than multi-threaded MeTiS while producing a cut size at most 2x worse (a one-pass streaming heuristic is sketched after this entry).

JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING (2021)
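For readers unfamiliar with streaming partitioning, the following is a toy one-pass heuristic in the style of linear deterministic greedy (LDG). The paper's multilevel partitioner is considerably more sophisticated; this sketch, with its hypothetical function name, is only meant to show the streaming model, where each vertex is assigned on arrival.

def ldg_partition(adjacency, k, capacity):
    # Assign each vertex, in stream order, to the block that holds most of its
    # already-assigned neighbors, discounted by how full the block is (LDG rule).
    assignment, loads = {}, [0] * k
    for v, neighbors in adjacency.items():
        def score(b):
            common = sum(1 for u in neighbors if assignment.get(u) == b)
            return common * (1 - loads[b] / capacity)
        best = max((b for b in range(k) if loads[b] < capacity), key=score)
        assignment[v] = best
        loads[best] += 1
    return assignment

# Toy graph with two natural clusters, {0, 1, 2} and {3, 4, 5}.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
print(ldg_partition(adj, k=2, capacity=3))  # e.g. {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}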

Article Computer Science, Theory & Methods

Overlapping Communication With Computation in Parameter Server for Scalable DL Training

Shaoqi Wang et al.

Summary: The proposed method, iPart, improves scalability by overlapping gradient communication with backward computation and parameter communication with forward computation across various partition sizes, outperforming the default parameter server (PS) and a layer-by-layer strategy in experimental evaluations (the overlap idea is sketched after this entry).

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS (2021)
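The core overlap idea can be shown schematically: as soon as a layer's gradient is ready during the backward pass, its transfer to the parameter server can begin while earlier layers are still being computed. The sketch below uses threads and sleeps as stand-ins for real computation and communication; it is not iPart itself, which additionally partitions tensors and schedules transfers.

import threading, time

def backward_layer(layer_id):
    time.sleep(0.01)              # stand-in for computing this layer's gradient
    return "grad[%d]" % layer_id

def send_gradient(grad):
    time.sleep(0.01)              # stand-in for pushing the gradient to the PS

layers = list(range(8))           # the backward pass visits layers in reverse
pending = []
for l in reversed(layers):
    g = backward_layer(l)         # compute this layer's gradient ...
    t = threading.Thread(target=send_gradient, args=(g,))
    t.start()                     # ... and ship it while earlier layers
    pending.append(t)             # are still being computed
for t in pending:
    t.join()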

Article Engineering, Multidisciplinary

Minimizing Training Time of Distributed Machine Learning by Reducing Data Communication

Yubin Duan et al.

Summary: The study formulates an optimization problem for data and parameter allocation in distributed machine learning, proposes a solution that minimizes total training time, and demonstrates significant improvements in communication efficiency through experiments (a toy version of the allocation problem is sketched after this entry).

IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING (2021)
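As a toy version of this allocation problem, consider assigning data shards to workers so that the slowest worker (compute plus gradient upload) finishes as early as possible. The greedy longest-processing-time heuristic below is purely illustrative and all names are hypothetical; the paper addresses a more general joint data and parameter placement problem.

import heapq

def greedy_allocate(shard_costs, n_workers):
    # shard_costs[i] = per-epoch time contributed by shard i (compute + comm).
    heap = [(0.0, w) for w in range(n_workers)]   # (current load, worker id)
    heapq.heapify(heap)
    placement = {}
    for i in sorted(range(len(shard_costs)), key=lambda i: -shard_costs[i]):
        load, w = heapq.heappop(heap)             # least-loaded worker first
        placement[i] = w
        heapq.heappush(heap, (load + shard_costs[i], w))
    return placement, max(load for load, _ in heap)

placement, makespan = greedy_allocate([5.0, 3.0, 3.0, 2.0, 2.0, 1.0], n_workers=2)
print(placement, makespan)                        # balanced loads of 8.0 and 8.0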

Article Engineering, Electrical & Electronic

Dual Cross-Entropy Loss for Small-Sample Fine-Grained Vehicle Classification

Xiaoxu Li et al.

IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY (2019)

Article Computer Science, Theory & Methods

Submodular Approximation: Sampling-Based Algorithms and Lower Bounds

Zoya Svitkina et al.

SIAM JOURNAL ON COMPUTING (2011)

Article Computer Science, Artificial Intelligence

LIBSVM: A Library for Support Vector Machines

Chih-Chung Chang et al.

ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY (2011)