Article

Train Me If You Can: Decentralized Learning on the Deep Edge

Journal

Applied Sciences (Basel)
Volume 12, Issue 9, Article 4653

Publisher

MDPI
DOI: 10.3390/app12094653

Keywords

federated learning; machine learning; artificial neural networks; artificial intelligence; machine learning algorithms; intelligent systems; internet of things; Arm Cortex-M

Funding

  1. FCT - Fundação para a Ciência e Tecnologia within the R&D Units Project Scope [UIDB/00319/2020]
  2. FCT within the PhD Scholarship Project Scope [SFRH/BD/146780/2019]

Abstract

The end of Moore's Law, combined with growing data privacy concerns, is forcing machine learning (ML) to shift from the cloud to the deep edge. In next-generation ML systems, inference and part of the training process will be performed at the edge, while the cloud remains responsible for major updates. This new computing paradigm, called federated learning (FL), relieves the cloud and network infrastructure while increasing data privacy. Recent advances have enabled the inference pass of quantized artificial neural networks (ANNs) on Arm Cortex-M and RISC-V microcontroller units (MCUs). Nevertheless, training remains confined to the cloud, imposing the transfer of high volumes of private data over the network and leading to unpredictable delays when ML applications attempt to adapt to adversarial environments. To fill this gap, we make the first attempt to evaluate the feasibility of ANN training on Arm Cortex-M MCUs. Among the available optimization algorithms, stochastic gradient descent (SGD) offers the best trade-off between accuracy, memory footprint, and latency. However, neither its original form nor the variants available in the literature meet the stringent requirements of Arm Cortex-M MCUs. We propose L-SGD, a lightweight implementation of SGD optimized for maximum speed and minimal memory footprint on this class of MCUs. We developed a floating-point version and another that operates over quantized weights. For a fully-connected ANN trained on the MNIST dataset, L-SGD (float-32) is 4.20x faster than SGD while requiring only 2.80% of the memory, with negligible accuracy loss. Results also show that quantized training is still infeasible for training an ANN from scratch, but it is a lightweight solution for performing minor model fixes and counteracting the fairness problem in typical FL systems.
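To ground the kind of computation involved, the sketch below shows the two update primitives the abstract contrasts: a float-32 per-sample SGD step for one fully-connected layer, and a gradient step over an int8-quantized weight. It is written in plain C99 against assumed toy dimensions and an assumed symmetric per-tensor quantization scale; it is not the authors' L-SGD implementation, only an illustration of the operations a Cortex-M-class training pass must execute.

/*
 * Minimal sketch of on-device training primitives, in the spirit of what
 * the paper evaluates on Arm Cortex-M MCUs. NOT the authors' L-SGD: the
 * layer sizes, squared-error loss, quantization scheme, and identifiers
 * are illustrative assumptions. No heap allocation or Cortex-M-specific
 * intrinsics, so it also compiles and runs on a host for testing.
 */
#include <stdio.h>
#include <stdint.h>
#include <math.h>

#define N_IN  4   /* input features (assumed toy size) */
#define N_OUT 3   /* output neurons (assumed toy size) */

/* One per-sample SGD step for a fully-connected layer y = Wx + b under a
 * squared-error loss: dL/dW[o][i] = (y[o]-t[o])*x[i], dL/db[o] = y[o]-t[o].
 * Each error term is consumed as soon as it is computed, so no per-weight
 * gradient buffer is ever allocated. */
static void sgd_step(float W[N_OUT][N_IN], float b[N_OUT],
                     const float x[N_IN], const float t[N_OUT], float lr)
{
    for (int o = 0; o < N_OUT; o++) {
        float y = b[o];                      /* forward pass, one neuron */
        for (int i = 0; i < N_IN; i++)
            y += W[o][i] * x[i];

        float err = y - t[o];                /* backward + in-place update */
        for (int i = 0; i < N_IN; i++)
            W[o][i] -= lr * err * x[i];
        b[o] -= lr * err;
    }
}

/* Sketch of a gradient step over a quantized weight: dequantize with an
 * assumed symmetric per-tensor scale, nudge, re-quantize, saturate. */
static int8_t quantized_step(int8_t w_q, float scale, float grad, float lr)
{
    float w = (float)w_q * scale - lr * grad;
    long  q = lroundf(w / scale);
    if (q >  127) q =  127;
    if (q < -128) q = -128;
    return (int8_t)q;
}

int main(void)
{
    float W[N_OUT][N_IN] = {0};
    float b[N_OUT] = {0};
    const float x[N_IN]  = {1.0f, 0.5f, -1.0f, 2.0f};
    const float t[N_OUT] = {0.0f, 1.0f, 0.0f};

    for (int step = 0; step < 200; step++)
        sgd_step(W, b, x, t, 0.01f);
    printf("float-32: b = [%.3f %.3f %.3f]\n", b[0], b[1], b[2]);

    int8_t wq = quantized_step(64, 0.01f, 0.5f, 0.1f);
    printf("int8 weight after one quantized step: %d\n", wq);
    return 0;
}

One design point worth noting: because sgd_step folds the backward pass into the forward loop and updates weights in place, intermediate storage stays constant regardless of layer size. Trimming buffers in this way is the general direction behind the memory savings the abstract reports, although the paper's actual techniques are not reproduced here.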
