Paper
CAFE AU LAIT: Compute-Aware Federated Augmented Low-Rank AI Training
Presenter
I am the group lead of Bioinformatics and Biostatistics at Oak Ridge National Laboratory. I am a demographer and life course epidemiologist by training, and my research interests are focused on computational approaches to population health surveillance.
Description
Federated finetuning is essential for unlocking the knowledge embedded in pretrained Large Language Models (LLMs) when data is distributed across clients. Unlike single-institution finetuning, federated finetuning enables collaboration across decentralized datasets while preserving data privacy. To address the high computing costs of LLM training and improve energy efficiency in Federated Learning (FL), Low-Rank Adaptation (LoRA) has gained popularity due to its reduced number of trainable parameters. However, this approach assumes all clients have sufficient computing resources, which is often unrealistic given the heterogeneity of resources across clients: while some clients have access to powerful GPUs, others have limited or no such resources. Federated finetuning using synthetic data allows participation without local LLM training but introduces a performance gap compared to local updates. To address this, we propose a novel two-stage algorithm that leverages the storage and computing power of a strong server. In the first stage, resource-constrained clients generate synthetic data under the coordination of the strong server, which stores the data. In the second stage, the strong server uses this synthetic data on behalf of the constrained clients to perform federated LoRA finetuning alongside clients with sufficient resources, ensuring participation from all clients. Experimental results demonstrate that incorporating local updates from even a small fraction of clients improves performance compared to using synthetic data for all clients. Additionally, we integrate the Gaussian mechanism in both stages to ensure client-level differential privacy.
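The second stage described above can be illustrated with a minimal sketch. All names, dimensions, and noise parameters below are hypothetical illustrations, not the paper's actual implementation: each participating client (or the strong server, acting on a constrained client's behalf using its stored synthetic data) produces a LoRA update, the update is privatized via the Gaussian mechanism (clip then add noise), and the server averages the privatized factors FedAvg-style.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 10 clients, of which 3 are "strong" (have GPUs for
# local LoRA training) and 7 are resource-constrained.
d, k, r = 16, 16, 2            # layer dims and LoRA rank (r << d, k)
n_clients, n_strong = 10, 3

def local_lora_update(client_id):
    """Stand-in for one round of local LoRA finetuning on a layer:
    returns the low-rank factors (A, B) of the weight update A @ B."""
    A = rng.normal(size=(d, r)) * 0.01
    B = rng.normal(size=(r, k)) * 0.01
    return A, B

def gaussian_mechanism(update, clip=1.0, sigma=0.5):
    """Client-level DP step: clip the update to L2 (Frobenius) norm `clip`,
    then add Gaussian noise. `sigma` is a hypothetical noise multiplier."""
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip / norm)
    return clipped + rng.normal(scale=sigma * clip, size=update.shape)

# Stage 1 (constrained clients): synthetic data is generated under the
# strong server's coordination and kept server-side; a placeholder here.
synthetic_store = {cid: f"synthetic_data_client_{cid}"
                   for cid in range(n_strong, n_clients)}

# Stage 2: strong clients compute LoRA updates locally; the server computes
# updates on behalf of constrained clients from their stored synthetic data.
updates = []
for cid in range(n_clients):
    A, B = local_lora_update(cid)  # runs server-side when cid >= n_strong
    updates.append((gaussian_mechanism(A), gaussian_mechanism(B)))

# The server aggregates the privatized LoRA factors (FedAvg-style averaging
# of the factors, a common simplification for federated LoRA).
A_avg = np.mean([u[0] for u in updates], axis=0)
B_avg = np.mean([u[1] for u in updates], axis=0)
delta_W = A_avg @ B_avg        # merged low-rank weight update, shape (d, k)
print(delta_W.shape)
```

Note that averaging the A and B factors separately is not identical to averaging the products A @ B; reconciling this mismatch is itself a design choice in federated LoRA systems.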