Enabling Federated Learning across the Computing Continuum: Systems, Challenges and Future Directions
Abstract
In recent years, as the boundaries of computing have expanded with the emergence of the Internet of Things (IoT) and its increasing number of devices continuously producing flows of data, it has become paramount to boost processing speed and reduce latency. Recent approaches to this growing complexity and data deluge aim to seamlessly and securely integrate diverse computing tiers and data environments, spanning from the core cloud to the edge: the Computing Continuum (or Edge-to-Cloud Continuum).
Typically, the cloud is used for resource-intensive computations while the edge is used for low-latency tasks. This provides an opportunity to run complex AI-enabled applications across multiple tiers, specifically facilitating the training of Machine Learning (ML) models at the "edge" of the Internet (i.e., beyond centralized computing facilities such as cloud datacenters). Federated Learning (FL) represents a novel ML paradigm for collaborative training, capitalizing on processing capabilities at the edge for training purposes while addressing privacy concerns. A set of clients (i.e., edge devices) collaboratively trains a shared model under the supervision of a centralized server without exchanging personal data. However, several challenges arise from the decentralized nature of FL in the Computing Continuum context: statistical heterogeneity (data distributions differ across parties), system heterogeneity (due to the nature of the environment), volatility (e.g., client dropouts), security threats and, persistently, privacy (although no personal data is transmitted, the shared model updates still carry information about the data used for training).
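To make the collaborative training loop described above concrete, the following is a minimal sketch of one round of FedAvg-style aggregation, the scheme most FL systems build on: clients train locally on their own data and the server averages the returned weights, so raw data never leaves the devices. The function names, the placeholder local_update routine and the toy data are illustrative assumptions, not the API of any particular system surveyed here.

```python
import numpy as np

def local_update(global_weights, client_data, epochs=1, lr=0.01):
    """Hypothetical local training step: each client refines the global
    model on its own data and returns the updated weights only."""
    weights = [w.copy() for w in global_weights]
    # ... gradient steps on client_data would go here ...
    return weights

def fedavg_round(global_weights, clients):
    """One round of FedAvg: the server sends the current model to a set of
    clients, each trains locally, and the server averages the returned
    weights in proportion to each client's dataset size. Only model
    updates are exchanged, never the clients' data."""
    total_samples = sum(len(c["data"]) for c in clients)
    new_weights = [np.zeros(w.shape) for w in global_weights]
    for client in clients:
        local_w = local_update(global_weights, client["data"])
        share = len(client["data"]) / total_samples
        for i, w in enumerate(local_w):
            new_weights[i] += share * w
    return new_weights

if __name__ == "__main__":
    # Toy example: two clients holding different amounts of (synthetic) data.
    global_model = [np.zeros((4, 2)), np.zeros(2)]
    clients = [{"data": list(range(100))}, {"data": list(range(300))}]
    global_model = fedavg_round(global_model, clients)
```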
As opposed to previous studies dedicated to federated learning (typically on homogeneous, edge-based infrastructures), this survey aims to present a systematic overview of the existing literature on how state-of-the-art FL systems contend with the challenges outlined above within the edge-to-cloud Computing Continuum, in particular heterogeneity, volatility and large-scale distribution. We analyze representative tools for implementing, monitoring, configuring and deploying such systems. We highlight significant efforts made to overcome statistical heterogeneity and security problems in FL. We specifically analyze the quality of the experimental evaluation of existing systems and the relevant benchmarks. Finally, we discuss open issues and future directions (e.g., the lack of experiments in realistic environments) to support the broader adoption of FL across the continuum and to eventually fulfill the vision of the convergence of AI and edge computing: edge intelligence.