Article

Knowledge Selection and Local Updating Optimization for Federated Knowledge Distillation With Heterogeneous Models

Publisher

IEEE (Institute of Electrical and Electronics Engineers Inc.)
DOI: 10.1109/JSTSP.2022.3223526

Keywords

Federated learning; knowledge distillation; knowledge selection; heterogeneous networks


Federated learning is a distributed learning paradigm that allows edge devices to collaborate on training a shared model while preserving privacy. This paper introduces knowledge distillation to handle model heterogeneity and proposes an algorithm for efficient knowledge aggregation. It also presents a threshold-based technique to optimize local model updating and distillation intensity. Experimental results demonstrate the superior performance of the proposed approach over existing methods.
Federated learning (FL) is a promising distributed learning paradigm in which multiple edge devices (EDs) collaborate to train a shared model without exchanging privacy-sensitive raw data. When the local models in FL have heterogeneous architectures, knowledge distillation (KD) offers a new way to cope with model heterogeneity by aggregating knowledge instead of model parameters. Since there is usually no pre-trained teacher model in the FL setup, applying KD in FL hinges on designing an efficient knowledge aggregation mechanism. To this end, in this paper, we first analyze the relationship between the convergence rate of each local model and the knowledge generated by selecting predicted logits. Then, an optimization problem based on this relationship is formulated to schedule the predicted logits for efficient knowledge aggregation, and an iterative algorithm called predicted logits selection (PLS) is designed to solve it. After that, we propose a threshold-based technique that optimizes, for each ED, whether the local model is updated with or without KD, in order to limit the performance degradation of local models caused by misleading knowledge. Meanwhile, we optimize both the threshold selection and the distillation intensity during the FL process. Extensive experiments based on convolutional neural network (CNN) models and real-world datasets demonstrate the superior performance of the proposed approach over existing benchmark methods.
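
To make the mechanism in the abstract concrete, below is a minimal PyTorch sketch of its three ingredients: selecting predicted logits on shared reference samples, aggregating them at the server, and a threshold-gated local update with a tunable distillation intensity. All function names, the confidence-based selection rule, and the agreement-based threshold test are illustrative assumptions; the paper derives its actual PLS criterion from a convergence-rate analysis and jointly optimizes the threshold and intensity during training.

```python
import torch
import torch.nn.functional as F

def select_logits(logits, k):
    """Illustrative stand-in for PLS: keep the k reference samples
    whose predictions are most confident (max softmax probability).
    The paper instead derives its selection rule from an analysis
    of each local model's convergence rate."""
    conf = F.softmax(logits, dim=1).max(dim=1).values
    idx = torch.topk(conf, k).indices
    return idx, logits[idx]

def aggregate_knowledge(selected, num_samples, num_classes):
    """Average the logits each device uploaded for the reference
    samples it selected (plain mean aggregation as a placeholder
    for the paper's scheduled aggregation)."""
    sums = torch.zeros(num_samples, num_classes)
    counts = torch.zeros(num_samples)
    for idx, logits in selected:
        sums[idx] += logits
        counts[idx] += 1
    mask = counts > 0
    sums[mask] /= counts[mask].unsqueeze(1)
    return sums, mask

def kd_gate(local_logits, global_logits, threshold=0.6):
    """Threshold-based updating choice: distill only if the
    aggregated knowledge agrees with the local model's predictions
    often enough, so misleading knowledge is not forced on the ED.
    The 0.6 threshold is a hypothetical default."""
    agree = (local_logits.argmax(1) == global_logits.argmax(1)).float().mean()
    return agree.item() >= threshold

def local_loss(student_logits, labels, teacher_logits, use_kd,
               lam=0.5, T=3.0):
    """Cross-entropy, plus a KL distillation term of intensity `lam`
    at temperature `T` only when the gate decided KD should be used."""
    ce = F.cross_entropy(student_logits, labels)
    if not use_kd:
        return ce
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  F.softmax(teacher_logits / T, dim=1),
                  reduction="batchmean") * T * T
    return ce + lam * kd
```

In a round under these assumptions, each ED would upload its selected logits, the server would call aggregate_knowledge, and each ED would run kd_gate on its own predictions versus the aggregated logits, passing the result as use_kd to local_loss. The T*T factor is the standard scaling that keeps distillation-gradient magnitudes comparable across temperatures.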

