Article

Toward Communication-Efficient Federated Learning in the Internet of Things With Edge Computing

Journal

IEEE INTERNET OF THINGS JOURNAL
Volume 7, Issue 11, Pages 11053-11067

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/JIOT.2020.2994596

Keywords

Training; Machine learning; Convergence; Internet of Things; Computational modeling; Servers; Adaptation models; Adaptive optimizer; deep learning; edge computing; federated learning; gradient sparsification

Funding

  1. National Key Research and Development Program of China [2018YFB1800502]
  2. National Natural Science Foundation of China [61671079, 61771068]
  3. Beijing Municipal Natural Science Foundation [4182041]
  4. Ministry of Education and China Mobile Joint Fund [MCM20180101]

Abstract

Federated learning is an emerging paradigm that trains machine learning models on local, distributed data sets without sending the raw data to a central data center. However, in the Internet of Things (IoT), where wireless network resources are constrained, the key problem for federated learning is the communication overhead of parameter synchronization, which wastes bandwidth, increases training time, and can even degrade model accuracy. Gradient sparsification, which transmits only significant gradients and accumulates the insignificant ones locally, has received increasing attention. However, how to preserve accuracy under a high sparsification ratio has been ignored in the literature. In this article, a general gradient sparsification (GGS) framework is proposed for adaptive optimizers to correct the sparse gradient update process. It consists of two important mechanisms: 1) gradient correction and 2) batch normalization (BN) updates with local gradients. With gradient correction, the optimizer can properly treat the accumulated insignificant gradients, which helps the model converge better. Furthermore, updating the BN layers with local gradients relieves the impact of delayed gradients without increasing the communication overhead. Experiments have been conducted on LeNet-5, CifarNet, DenseNet-121, and AlexNet with adaptive optimizers. Results show that even when 99.9% of the gradients are sparsified, the models maintain their top-1 accuracy on the validation data sets.
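
The core mechanism the abstract describes, transmitting only the largest-magnitude gradients and accumulating the rest locally as a residual, can be illustrated with a short sketch. The snippet below is a minimal, generic top-k sparsification routine, not the authors' GGS implementation; the function and parameter names are illustrative assumptions.

    # Minimal sketch of top-k gradient sparsification with local residual
    # accumulation; illustrative only, not the paper's GGS code.
    import numpy as np

    def sparsify_with_residual(grad, residual, sparsity=0.999):
        # Fold previously withheld (insignificant) gradients into this step.
        accumulated = grad + residual
        # Number of entries to transmit (top 0.1% when sparsity is 0.999).
        k = max(1, int(accumulated.size * (1.0 - sparsity)))
        # Indices of the k largest-magnitude entries.
        topk = np.argpartition(np.abs(accumulated), -k)[-k:]
        sparse = np.zeros_like(accumulated)
        sparse[topk] = accumulated[topk]     # only these are synchronized
        new_residual = accumulated - sparse  # the rest stays on the device
        return sparse, new_residual

    # Toy usage: one worker, one step.
    rng = np.random.default_rng(0)
    grad = rng.normal(size=10_000)
    residual = np.zeros_like(grad)
    sent, residual = sparsify_with_residual(grad, residual)
    print(np.count_nonzero(sent), "of", grad.size, "entries transmitted")

Note that the paper's GGS framework additionally applies gradient correction so that adaptive optimizers handle the accumulated residual properly; the sketch above shows only the plain accumulate-and-send step.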
