
LW-FedSSL: Resource-efficient Layer-wise Federated Self-supervised Learning

Main: 16 pages
13 figures
Bibliography: 4 pages
14 tables
Abstract

Many studies integrate federated learning (FL) with self-supervised learning (SSL) to take advantage of raw data distributed across edge devices. However, edge devices often struggle with high computational and communication costs imposed by SSL and FL algorithms. With the deployment of more complex and large-scale models, such as Transformers, these challenges are exacerbated. To tackle this, we propose the Layer-Wise Federated Self-Supervised Learning (LW-FedSSL) approach, which allows edge devices to incrementally train a small part of the model at a time. Specifically, in LW-FedSSL, training is decomposed into multiple stages, with each stage responsible for only a specific layer (or a block of layers) of the model. Since only a portion of the model is active for training at any given time, LW-FedSSL significantly reduces computational requirements. Additionally, only the active model portion needs to be exchanged between the FL server and clients, reducing the communication overhead. This enables LW-FedSSL to jointly address both computational and communication challenges in FL. Depending on the SSL algorithm used, it can achieve up to a 3.34× reduction in memory usage, 4.20× fewer computational operations (GFLOPs), and a 5.07× lower communication cost while maintaining performance comparable to its end-to-end training counterpart. Furthermore, we explore a progressive training strategy called Prog-FedSSL, which offers a 1.84× reduction in GFLOPs and a 1.67× reduction in communication costs while maintaining the same memory requirements as end-to-end training. While the resource efficiency of Prog-FedSSL is lower than that of LW-FedSSL, its performance improvements make it a viable candidate for FL environments with more lenient resource constraints.
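The sketch below illustrates the stage-wise idea described in the abstract: in each stage, clients train only the currently active block on top of frozen earlier blocks, and only that block's parameters are exchanged with the server. This is a minimal, hypothetical PyTorch-style illustration, not the authors' implementation; names such as `blocks`, `sample_batch`, `ssl_loss`, and `lw_fedssl_round` are assumptions made for the example, and the aggregation shown is a simple FedAvg-style mean.

```python
import copy
import torch


def lw_fedssl_round(blocks, clients, stage, local_steps=10, lr=0.01):
    """One hypothetical FL round for a given stage of LW-FedSSL.

    `blocks` is a list of nn.Module blocks forming the model; only
    `blocks[stage]` is trained and communicated in this round, while
    earlier blocks stay frozen (trained in previous stages).
    """
    client_updates = []
    for client in clients:
        # Client downloads only the active block from the server.
        local_block = copy.deepcopy(blocks[stage])
        optimizer = torch.optim.SGD(local_block.parameters(), lr=lr)

        for _ in range(local_steps):
            x = client.sample_batch()          # unlabeled local data (assumed API)
            with torch.no_grad():              # earlier stages are frozen
                for frozen in blocks[:stage]:
                    x = frozen(x)
            loss = client.ssl_loss(local_block(x))  # some SSL objective (assumed API)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

        # Only the active block's weights are uploaded.
        client_updates.append(local_block.state_dict())

    # Server aggregates just the active block (FedAvg-style averaging).
    averaged = {
        key: torch.stack([update[key].float() for update in client_updates]).mean(0)
        for key in client_updates[0]
    }
    blocks[stage].load_state_dict(averaged)
```

Because gradients are computed only for the active block, both the activation memory and the upload/download volume per round scale with that block rather than with the full model, which is the source of the memory, GFLOPs, and communication savings reported above.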
