Article

Dynamic and adaptive fault-tolerant asynchronous federated learning using volunteer edge devices

Publisher

ELSEVIER
DOI: 10.1016/j.future.2022.02.024

Keywords

Federated learning; Edge computing; Internet browser; Distributed computing; Volunteer computing; Deep learning

Funding

  1. Universidad de Malaga, Spain
  2. Consejeria de Economia y Conocimiento de la Junta de Andalucia, Spain
  3. FEDER, Spain - MCIN/AEI, Spain [UMA18-FEDERJA-003, PID 2020-116727RB-I00]
  4. TAILOR ICT-48 Network - EU Horizon 2020 research and innovation programme [952215]
  5. FPU grant from the Ministerio de Educacion, Cultura y Deporte, Gobierno de Espana, Spain [FPU16/02595]
  6. Universidad de Malaga/CBUA, Spain


The combination of edge computing and federated learning, known as federated edge learning, provides a solution for processing and protecting a large amount of data from interconnected devices. This research focuses on adapting to the changing environment through asynchronous learning and utilizing volunteer device resources for shared model training.
The number of devices, from smartphones to IoT hardware, interconnected via the Internet is growing all the time. These devices produce a large amount of data that cannot be analyzed in any data center or stored in the cloud, and the data might be private or sensitive, which precludes existing classic approaches. However, studying these data and gaining insights from them is still of great relevance to science and society. Recently, two related paradigms have emerged to address these problems. On the one hand, edge computing (EC) proposes increasing the processing done on edge devices. On the other hand, federated learning (FL) deals with training a shared machine learning (ML) model in a distributed (non-centralized) manner while keeping private data locally on edge devices. The combination of both is known as federated edge learning (FEEL). In this work, we propose an algorithm for FEEL that adapts to asynchronous clients joining and leaving the computation. Our research focuses on adapting the learning when the number of volunteers is low and may even drop to zero. We propose, implement, and evaluate a new software platform for this purpose, and we evaluate its results on problems relevant to FEEL. The proposed decentralized and adaptive system architecture for asynchronous learning allows volunteer users to contribute their device resources and local data to train a shared ML model. The platform dynamically self-adapts to variations in the number of collaborating heterogeneous devices caused by unexpected disconnections (i.e., volunteers can join and leave at any time). We therefore conduct a comprehensive empirical analysis in both a static configuration and highly dynamic, changing scenarios. The public open-source platform enables interoperability between volunteers connected using web browsers and Python processes.
We show that our platform adapts well to the changing environment, achieving a numerical accuracy similar to that of today's configurations, which use a given number of homogeneous (hardware and software) computers as a static platform for learning. We demonstrate the fault tolerance of the platform, which self-recovers from unexpected disconnections of volunteer devices. Finally, we show that EC, coupled with FL, can lead to practical scientific tools that involve real users and deliver competitive numerical results on real problems for science and society. (c) 2022 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
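
To make the idea of asynchronous learning with volunteers joining and leaving more concrete, the following is a minimal sketch of a generic staleness-weighted asynchronous update rule. It is not the algorithm proposed in the paper; the class `AsyncServer`, its methods `pull` and `push`, the `base_mix` parameter, and the `1 / (1 + staleness)` discount are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class AsyncServer:
    """Hypothetical coordinator holding a shared model as a flat list of weights."""
    weights: List[float]
    version: int = 0        # incremented on every accepted update
    base_mix: float = 0.5   # assumed mixing rate for a perfectly fresh update

    def pull(self) -> Tuple[List[float], int]:
        """A volunteer joins (or rejoins) and fetches the current model."""
        return list(self.weights), self.version

    def push(self, client_weights: List[float], client_version: int) -> float:
        """Merge one volunteer's result as soon as it arrives, without waiting for others."""
        staleness = self.version - client_version
        mix = self.base_mix / (1 + staleness)  # older results count less
        self.weights = [
            (1 - mix) * w + mix * c for w, c in zip(self.weights, client_weights)
        ]
        self.version += 1
        return mix


# Two volunteers join at the same model version and finish at different times.
server = AsyncServer(weights=[0.0, 0.0])
_, version_a = server.pull()
_, version_b = server.pull()
server.push([1.0, 1.0], version_a)  # fresh update, mix = 0.5
server.push([1.0, 1.0], version_b)  # stale by one version, mix = 0.25
print(server.weights)               # [0.625, 0.625]
```

Because every update is merged individually, the shared model simply stays unchanged while no volunteers are connected, and a volunteer that disconnects before calling `push` leaves no partial state behind. A real platform would also need transport, authentication, and handling of model-shape mismatches, none of which is shown here.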
