Ali Jannesari


2026

Parameter-efficient fine-tuning (PEFT) adapts large pre-trained models by updating only a small subset of parameters. Recently, Representation Fine-Tuning (ReFT) has emerged as an effective alternative: it shifts the fine-tuning paradigm from updating model weights to directly manipulating hidden representations that capture rich semantic information, and it outperforms state-of-the-art PEFT methods in standalone settings. However, applying ReFT in Federated Learning (FL) remains challenging due to heterogeneity in clients’ data distributions, model capacities, and computational resources. To address these challenges, we introduce Federated Representation Fine-Tuning (FedReFT), a novel approach that fine-tunes clients’ hidden representations. FedReFT applies sparse intervention layers to steer hidden representations directly, offering a lightweight and semantically rich fine-tuning alternative well suited to edge devices. However, representation-level updates are especially vulnerable to aggregation mismatch under task heterogeneity, where naive averaging can corrupt semantic alignment. To mitigate this issue, we propose All-But-Me (ABM) aggregation, in which each client receives the aggregated updates of the other clients and partially incorporates them, enabling stable and personalized learning that balances local focus with global knowledge. We further design an adaptive update strategy inspired by Test-Time Computing (TTC) to balance local and global contributions under heterogeneous conditions. FedReFT achieves state-of-the-art performance on commonsense reasoning, arithmetic reasoning, and GLUE benchmarks, while delivering 1x–49x higher parameter efficiency compared to leading LoRA-based methods.
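The core of ABM aggregation, as described above, is that each client blends the average of all other clients' updates with its own local update. The sketch below is a minimal illustration of that idea; the function name, the fixed mixing coefficient, and the flattened-update representation are assumptions for illustration, not the paper's implementation (which chooses the balance adaptively via the TTC-inspired strategy).

```python
import numpy as np

def abm_aggregate(client_updates, client_id, mix=0.5):
    """All-But-Me style aggregation sketch: the given client receives the
    average of every *other* client's update and blends it with its own.

    `mix` is a hypothetical coefficient balancing local focus (mix=1.0)
    against global knowledge (mix=0.0); the adaptive strategy in the paper
    would set this per client rather than fixing it.
    """
    others = [u for i, u in enumerate(client_updates) if i != client_id]
    others_avg = np.mean(others, axis=0)           # aggregate of all-but-me
    local = client_updates[client_id]
    return mix * local + (1.0 - mix) * others_avg  # partial incorporation

# Toy usage: three clients, each with a flattened intervention-parameter update.
updates = [np.random.randn(8) for _ in range(3)]
personalized = abm_aggregate(updates, client_id=0, mix=0.7)
```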

2025

In-Context Learning (ICL) has been shown to be a powerful technique for augmenting the capabilities of LLMs across a diverse range of tasks. This work proposes AutoParLLM, a novel way to generate context for LLMs using guidance from graph neural networks (GNNs) so that they produce efficient parallel code. We evaluate AutoParLLM on 12 applications from two well-known benchmark suites of parallel codes: the NAS Parallel Benchmark and the Rodinia Benchmark. Our results show that AutoParLLM improves state-of-the-art LLMs (e.g., GPT-4) by 19.9% on the NAS benchmark and 6.48% on the Rodinia benchmark in terms of CodeBERTScore for the task of parallel code generation. Moreover, AutoParLLM improves the most powerful LLM to date, GPT-4, achieving 17% better speedup on the NAS benchmark and 16% on the Rodinia benchmark. In addition, we propose OMPScore, a metric for evaluating the quality of parallel code, and show its effectiveness in evaluating parallel codes.
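The GNN-guided idea is that a graph model over the program predicts parallelization properties, and those predictions are injected into the LLM prompt as extra context. The sketch below is a hypothetical illustration of that flow only; `predict_parallel_pattern`, the prompt wording, and the pattern labels are assumptions, not AutoParLLM's actual interface.

```python
def build_guided_prompt(source_code, predict_parallel_pattern):
    """Hypothetical sketch of GNN-guided context generation: the GNN's
    prediction about the loop's parallelism is prepended to the code
    before it is sent to the LLM for parallel code generation."""
    pattern = predict_parallel_pattern(source_code)  # e.g. "do-all", "reduction"
    guidance = (
        f"A graph neural network analysis suggests this loop is parallelizable "
        f"as a '{pattern}' pattern. Generate an efficient OpenMP version.\n\n"
    )
    return guidance + source_code

# Toy usage with a stubbed GNN predictor.
prompt = build_guided_prompt(
    "for (int i = 0; i < n; i++) sum += a[i];",
    predict_parallel_pattern=lambda code: "reduction",
)
```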

2024

Foundation Models (FMs), such as LLaMA, BERT, GPT, ViT, and CLIP, have demonstrated remarkable success in a wide range of applications, driven by their ability to leverage vast amounts of data for pre-training. However, optimizing FMs often requires access to sensitive data, raising privacy concerns and limiting their applicability in many domains. In this paper, we propose the Federated Foundation Models (FFMs) paradigm, which combines the benefits of FMs and Federated Learning (FL) to enable privacy-preserving and collaborative learning across multiple end-users. We discuss the potential benefits and challenges of integrating FL into the lifespan of FMs, covering pre-training, fine-tuning, and application. We further outline potential future research avenues in FFMs, including FFM pre-training, FFM fine-tuning, and federated prompt tuning, which enable the development of more personalized and context-aware models while ensuring data privacy. Moreover, we explore the possibility of continual/lifelong learning in FFMs, as increased computational power at the edge may unlock the potential for optimizing FMs using newly generated private data close to the data source. The proposed FFM concepts offer a flexible and scalable framework for training large language models in a privacy-preserving manner, setting the stage for subsequent advancements in both FM training and federated learning.
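As a concrete picture of what federated fine-tuning of an FM can look like in this setting, the sketch below runs a single FedAvg-style round over lightweight adapter parameters, with raw data never leaving the clients. This is a generic illustration under assumed names (`federated_finetune_round`, `local_finetune`), not the specific FFM procedures proposed in the paper.

```python
import numpy as np

def federated_finetune_round(global_adapter, clients, local_finetune):
    """One generic FedAvg-style round over adapter parameters: each client
    fine-tunes the shared adapter on its private data locally, and only the
    resulting adapter parameters (never the raw data) are averaged."""
    client_adapters = []
    for client_data in clients:
        # Local, privacy-preserving step: data stays on the client device.
        client_adapters.append(local_finetune(global_adapter, client_data))
    # Server step: average the returned adapter parameters.
    return np.mean(client_adapters, axis=0)

# Toy usage with stubbed local training (adds a client-specific perturbation).
clients = [np.random.randn(16) for _ in range(4)]
new_adapter = federated_finetune_round(
    np.zeros(16),
    clients,
    local_finetune=lambda adapter, data: adapter + 0.1 * data,
)
```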