Paper
CAFE AU LAIT: Compute-Aware Federated Augmented Low-Rank AI Training
Presenter
I am the group lead of Bioinformatics and Biostatistics at Oak Ridge National Laboratory. I am a demographer and life course epidemiologist by training, and my research interests are focused on computational approaches to population health surveillance.
Description
Federated finetuning is essential for unlocking the knowledge embedded in pretrained Large Language Models (LLMs) when data is distributed across clients. Unlike single-institution finetuning, federated finetuning enables collaboration across decentralized datasets while preserving data privacy. To address the high computing costs of LLM training and improve energy efficiency in Federated Learning (FL), Low-Rank Adaptation (LoRA) has gained popularity due to its reduced number of trainable parameters. However, this approach assumes all clients have sufficient computing resources, which is often unrealistic given the heterogeneity of resources across clients: while some clients have access to powerful GPUs, others have limited or no such resources. Federated finetuning using synthetic data allows participation without local LLM training but introduces a performance gap compared to local updates. To address this, we propose a novel two-stage algorithm that leverages the storage and computing power of a strong server. In the first stage, resource-constrained clients generate synthetic data under the coordination of the strong server, which stores the data. In the second stage, the strong server uses this synthetic data on behalf of the constrained clients to perform federated LoRA finetuning alongside clients with sufficient resources, ensuring participation from all clients. Experimental results demonstrate that incorporating local updates from even a small fraction of clients improves performance compared to using synthetic data for all clients. Additionally, we integrate the Gaussian mechanism in both stages to ensure client-level differential privacy.
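The second stage described above can be illustrated with a minimal sketch. All names, dimensions, and noise parameters below are hypothetical illustrations, not the paper's actual implementation: each participating client (or the strong server, acting on a constrained client's behalf using its stored synthetic data) produces a LoRA update, the update is privatized via the Gaussian mechanism (clip then add noise), and the server averages the privatized factors FedAvg-style.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 10 clients, of which 3 are "strong" (have GPUs for
# local LoRA training) and 7 are resource-constrained.
d, k, r = 16, 16, 2            # layer dims and LoRA rank (r << d, k)
n_clients, n_strong = 10, 3

def local_lora_update(client_id):
    """Stand-in for one round of local LoRA finetuning on a layer:
    returns the low-rank factors (A, B) of the weight update A @ B."""
    A = rng.normal(size=(d, r)) * 0.01
    B = rng.normal(size=(r, k)) * 0.01
    return A, B

def gaussian_mechanism(update, clip=1.0, sigma=0.5):
    """Client-level DP step: clip the update to L2 (Frobenius) norm `clip`,
    then add Gaussian noise. `sigma` is a hypothetical noise multiplier."""
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip / norm)
    return clipped + rng.normal(scale=sigma * clip, size=update.shape)

# Stage 1 (constrained clients): synthetic data is generated under the
# strong server's coordination and kept server-side; a placeholder here.
synthetic_store = {cid: f"synthetic_data_client_{cid}"
                   for cid in range(n_strong, n_clients)}

# Stage 2: strong clients compute LoRA updates locally; the server computes
# updates on behalf of constrained clients from their stored synthetic data.
updates = []
for cid in range(n_clients):
    A, B = local_lora_update(cid)  # runs server-side when cid >= n_strong
    updates.append((gaussian_mechanism(A), gaussian_mechanism(B)))

# The server aggregates the privatized LoRA factors (FedAvg-style averaging
# of the factors, a common simplification for federated LoRA).
A_avg = np.mean([u[0] for u in updates], axis=0)
B_avg = np.mean([u[1] for u in updates], axis=0)
delta_W = A_avg @ B_avg        # merged low-rank weight update, shape (d, k)
print(delta_W.shape)
```

Note that averaging the A and B factors separately is not identical to averaging the products A @ B; reconciling this mismatch is itself a design choice in federated LoRA systems.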