Article

Erasure Coding for Distributed Matrix Multiplication for Matrices With Bounded Entries

Journal

IEEE COMMUNICATIONS LETTERS
Volume 23, Issue 1, Pages 8-11

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/LCOMM.2018.2880213

Keywords

Distributed computing; erasure codes; stragglers

Funding

  1. National Science Foundation (NSF) [CCF-1718470]

Abstract

Distributed matrix multiplication is widely used in several scientific domains. It is well recognized that computation times on distributed clusters are often dominated by the slowest workers (called stragglers). Recent work has demonstrated that straggler mitigation can be viewed as a problem of designing erasure codes. For matrices A and B, the technique essentially maps the computation of A^T B into the multiplication of smaller (coded) submatrices, and stragglers are treated as erasures in this process. The computation can be completed as long as a certain number of workers (called the recovery threshold) complete their assigned tasks. We present a novel coding strategy for this problem when the absolute values of the matrix entries are sufficiently small. We demonstrate a tradeoff between the assumed absolute value bounds on the matrix entries and the recovery threshold. At one extreme, our scheme is optimal with respect to the recovery threshold; at the other extreme, it matches the threshold of prior work. Experimental results on cloud-based clusters validate the benefits of our method.
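To make the idea of treating stragglers as erasures concrete, the following is a minimal sketch of a polynomial-code style encoding, one known construction from the coded-computation literature. It is an illustrative assumption, not the bounded-entry scheme proposed in this letter. A is split into m column blocks and B into n column blocks; each worker multiplies one coded pair, and A^T B is recovered from any m*n finished workers, so the recovery threshold of this sketch is m*n. All function names, the evaluation points, and the example matrices are hypothetical.

```python
from fractions import Fraction


def transpose(M):
    return [list(row) for row in zip(*M)]


def matmul(X, Y):
    Yt = transpose(Y)
    return [[sum(a * b for a, b in zip(row, col)) for col in Yt] for row in X]


def split_columns(M, parts):
    """Split M into `parts` equal-width column blocks."""
    width = len(M[0]) // parts
    return [[row[p * width:(p + 1) * width] for row in M] for p in range(parts)]


def encode(blocks, x, stride):
    """Coded block: sum over j of blocks[j] * x**(j * stride)."""
    rows, cols = len(blocks[0]), len(blocks[0][0])
    return [[sum(blk[r][c] * x ** (j * stride) for j, blk in enumerate(blocks))
             for c in range(cols)] for r in range(rows)]


def worker_task(A_blocks, B_blocks, x):
    """Work done by the worker assigned evaluation point x."""
    m = len(A_blocks)
    A_coded = encode(A_blocks, x, 1)   # A_0 + A_1 x + ...
    B_coded = encode(B_blocks, x, m)   # B_0 + B_1 x^m + ...
    return matmul(transpose(A_coded), B_coded)


def interpolate(xs, ys):
    """Exact coefficients c with sum_d c[d] * x**d == y for every (x, y)."""
    k = len(xs)
    aug = [[Fraction(x) ** d for d in range(k)] + [Fraction(y)]
           for x, y in zip(xs, ys)]
    for col in range(k):  # Gauss-Jordan elimination on the Vandermonde system
        pivot = next(r for r in range(col, k) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[d][k] for d in range(k)]


def decode(results, m, n):
    """Recover A^T B from {x: worker result} of any m*n finished workers."""
    if len(results) < m * n:
        raise ValueError("fewer finished workers than the recovery threshold")
    xs = list(results)[:m * n]
    rows, cols = len(results[xs[0]]), len(results[xs[0]][0])
    coeffs = [[interpolate(xs, [results[x][r][c] for x in xs])
               for c in range(cols)] for r in range(rows)]
    # Block (j, k) of A^T B is the coefficient of x**(j + k*m).
    return [[int(coeffs[r][c][j + k * m]) for k in range(n) for c in range(cols)]
            for j in range(m) for r in range(rows)]


# Example: six workers, two of them (points 2 and 5) straggle.
A = [[1, 2, 3, 4], [5, 6, 7, 8]]
B = [[1, 0, 2, -1], [3, 1, 0, 2]]
m = n = 2
A_blocks, B_blocks = split_columns(A, m), split_columns(B, n)
points = [1, 2, 3, 4, 5, 6]
finished = {x: worker_task(A_blocks, B_blocks, x)
            for x in points if x not in (2, 5)}
C = decode(finished, m, n)  # equals A^T B despite the two erasures
```

Exact rational arithmetic is used here only to keep the decoding step free of rounding error in a small example; a practical implementation would more likely work over floating-point numbers or a finite field.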
