Research Work

Kunjal Panchal, Nisarg Parikh, Sunav Choudhary, Lijun Zhang, Yuriy Brun, Hui Guan

Thinking Forward: Memory-Efficient Federated Finetuning of Language Models

Published @ NeurIPS, 2024 | September 2024

Finetuning large language models (LLMs) in federated learning (FL) settings has become important as it allows resource-constrained devices to finetune a model using private data. However, finetuning LLMs using backpropagation requires excessive memory, especially for intermediate activations, on resource-constrained devices. While Forward-mode Auto-Differentiation (AD) can reduce the memory footprint from activations, we observe that directly applying it to LLM finetuning results in slow convergence and poor accuracy. This work introduces Spry, an FL algorithm that splits trainable weights of an LLM among participating clients, such that each client computes gradients using Forward-mode AD that are closer estimates of the true gradients. Spry achieves a low memory footprint, high accuracy, and fast convergence. We theoretically show that the global gradients in Spry are unbiased estimates of true global gradients for homogeneous data distributions across clients, while heterogeneity increases the bias of the estimates. We also derive Spry's convergence rate, showing that the gradients decrease in inverse proportion to the number of FL rounds, indicating convergence up to the limits of heterogeneity. Empirically, Spry reduces the memory footprint during training by 1.4-7.1× compared to backpropagation, while reaching comparable accuracy, across a wide range of language tasks, models, and FL settings. Spry reduces the convergence time by 1.2-20.3× and achieves 5.2-13.5% higher accuracy than state-of-the-art zero-order methods. When finetuning Llama2-7B with LoRA, Spry consumes only 6.2GB of peak memory, compared to 33.9GB for backpropagation. For OPT13B, the reduction is from 76.5GB to 10.8GB. Spry makes feasible previously impossible FL deployments on commodity mobile and edge devices.
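The idea of splitting weights among clients and estimating gradients with Forward-mode AD can be illustrated on a toy problem. The sketch below is not the Spry implementation. It assumes a four-weight linear model with squared loss, two clients that see the same example (the homogeneous case), and a hand-written Jacobian-vector product in place of an AD library. Each client perturbs only the weights it owns and returns the standard forward-gradient estimate `(directional derivative) * v`. All function names and values are hypothetical.

```python
import random

def predict(w, x):
    """Linear model output w . x."""
    return sum(wi * xi for wi, xi in zip(w, x))

def true_gradient(w, x, y):
    """Exact gradient of 0.5 * (w . x - y)^2, used only for comparison."""
    err = predict(w, x) - y
    return [err * xi for xi in x]

def jvp_loss(w, v, x, y):
    """Directional derivative of the loss along v (forward-mode rule)."""
    err = predict(w, x) - y   # primal value of the residual
    d_pred = predict(v, x)    # tangent of the prediction along v
    return err * d_pred       # tangent of 0.5 * err^2

def client_forward_gradient(w, x, y, owned, rng):
    """Forward-gradient estimate restricted to the weights a client owns."""
    # Perturb only the owned coordinates; all others stay fixed.
    v = [rng.gauss(0.0, 1.0) if i in owned else 0.0 for i in range(len(w))]
    d = jvp_loss(w, v, x, y)
    return [d * vi for vi in v]

def global_estimate(w, x, y, assignment, rng, samples=20000):
    """Average many rounds; each coordinate comes from its owning client."""
    total = [0.0] * len(w)
    for _ in range(samples):
        for owned in assignment:
            g = client_forward_gradient(w, x, y, owned, rng)
            for i in owned:
                total[i] += g[i]
    return [t / samples for t in total]

# Hypothetical toy setup: two clients, each owning two of the four weights.
w = [0.5, -0.2, 0.1, 0.3]
x = [1.0, 2.0, -1.0, 0.5]
y = 1.0
assignment = [{0, 1}, {2, 3}]

estimate = global_estimate(w, x, y, assignment, random.Random(0))
print("estimate:", estimate)
print("true    :", true_gradient(w, x, y))
```

Because each client perturbs only a small subset of weights, its random direction has fewer dimensions. In this toy setting the averaged estimate approaches the true gradient.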

Kunjal Panchal, Sunav Choudhary, Nisarg Parikh, Lijun Zhang, Hui Guan

Flow: Fine-grained Personalized Federated Learning through Dynamic Routing

Published @ NeurIPS, 2023; Preliminary Presentation @ CrossFL, MLSys 2022 | December 2023

Personalization in Federated Learning (FL) has been proven effective for incentivizing clients to participate in the training. However, personalization has only been studied at a coarse granularity where all the input instances of a client (heterogeneous or otherwise) only use its individual local model, despite it being limited to only that client's data. Flow explores instance-level personalization by dynamically making routing decisions between the local and the global model, with the aim of achieving superior personalized performance for a given instance. Besides, as cross-device FL deals with millions of resource-constrained client devices, we push towards stateless personalization where a client doesn't need to carry its personalized state across FL rounds.
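Instance-level routing between a local and a global model can be pictured with a small sketch. This is an illustration under stated assumptions, not Flow's routing policy: the linear router, the logistic models, the weight values, and the 0.5 threshold are all hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def linear_score(weights, x):
    return sum(w * xi for w, xi in zip(weights, x))

def route_and_predict(x, local_w, global_w, router_w, threshold=0.5):
    """Pick a model per instance and return (model_name, probability)."""
    # Hypothetical router: its output is the preference for the local model.
    p_local = sigmoid(linear_score(router_w, x))
    if p_local >= threshold:
        return "local", sigmoid(linear_score(local_w, x))
    return "global", sigmoid(linear_score(global_w, x))

# Illustrative weights only; a real system would learn all three.
local_w = [1.5, -0.5]
global_w = [0.3, 0.8]
router_w = [2.0, 0.0]

for x in ([1.0, 0.0], [-1.0, 0.0]):
    name, prob = route_and_predict(x, local_w, global_w, router_w)
    print(x, "->", name, round(prob, 3))
```

The point of the sketch is only that the choice of model is made per input rather than per client.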

Kunjal Panchal, Sunav Choudhary, Koyel Mukherjee, Subrata Mitra, Somdeb Sarkhel, Saayan Mitra, Hui Guan

Flash: Concept Drift Adaptation in Federated Learning

Published @ ICML, 2023 | July 2023

In Federated Learning (FL), adaptive optimization is an effective approach to addressing the statistical heterogeneity issue but cannot adapt quickly to concept drifts. In this work, we propose a novel adaptive optimizer called Flash that simultaneously addresses both statistical heterogeneity and concept drift. The fundamental insight is that a concept drift can be detected based on the magnitude of parameter updates that are required to fit the global model to each participating client's local data distribution. Flash uses a two-pronged approach that combines client-side early-stopping training, which facilitates detection of concept drifts, with server-side drift-aware adaptive optimization, which adjusts the effective learning rate. We theoretically prove that Flash matches the convergence rate of state-of-the-art adaptive optimizers and further empirically evaluate the efficacy of Flash on a variety of FL benchmarks using different concept drift settings.
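The insight that drift shows up as unusually large parameter updates can be sketched with a simple monitor. This is not Flash's optimizer or its update rule. The running-average baseline, the `decay` and `ratio` parameters, and the `DriftMonitor` class are hypothetical choices made only to illustrate the signal.

```python
import math

def update_norm(global_w, client_w):
    """L2 magnitude of the update a client needed to fit its local data."""
    return math.sqrt(sum((c - g) ** 2 for g, c in zip(global_w, client_w)))

class DriftMonitor:
    """Flags a round whose update magnitude far exceeds a running baseline."""

    def __init__(self, decay=0.9, ratio=2.0):
        self.decay = decay      # smoothing factor for the baseline (assumed)
        self.ratio = ratio      # how large a jump counts as drift (assumed)
        self.baseline = None

    def observe(self, norm):
        if self.baseline is None:
            self.baseline = norm
            return False
        drifted = norm > self.ratio * self.baseline
        # Exponential moving average of past update magnitudes.
        self.baseline = self.decay * self.baseline + (1 - self.decay) * norm
        return drifted

monitor = DriftMonitor()
global_w = [0.0, 0.0]
# Hypothetical client weights per round; the last round needs a large update.
rounds = [[1.0, 0.0], [0.0, 0.9], [1.1, 0.0], [3.0, 4.0]]
flags = [monitor.observe(update_norm(global_w, cw)) for cw in rounds]
print(flags)
```

A server could react to such a flag by temporarily raising its effective learning rate. How Flash actually does this is described in the paper, not here.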

Zhiqiu Jiang, Mashrur Rashik, Kunjal Panchal, Mahmood Jasim, Ali Sarvghad, Pari Riahi, Erica Dewitt, Fey Thurber, Narges Mahyar

CommunityBots: Creating and Evaluating A Multi-Agent Chatbot Platform for Public Input Elicitation

Published @ ACM CSCW, 2023 | April 2023

In recent years, the popularity of AI-enabled conversational agents or chatbots has risen as an alternative to traditional online surveys to elicit information from people. However, there is a gap in using single-agent chatbots to converse and gather multi-faceted information across a wide variety of topics. Prior works suggest that single-agent chatbots struggle to understand user intentions and interpret human language during a multi-faceted conversation. In this work, we investigated how multi-agent chatbot systems can be utilized to conduct a multi-faceted conversation across multiple domains. To that end, we conducted a Wizard of Oz study to investigate the design of a multi-agent chatbot for gathering public input across multiple high-level domains and their associated topics. Next, we designed, developed, and evaluated CommunityBots - a multi-agent chatbot platform where each chatbot handles a different domain individually. To manage conversation across multiple topics and chatbots, we proposed a novel Conversation and Topic Management (CTM) mechanism that handles topic-switching and chatbot-switching based on user responses and intentions. We conducted a between-subject study comparing CommunityBots to a single-agent chatbot baseline with 96 crowd workers. The results from our evaluation demonstrate that CommunityBots participants were significantly more engaged, provided higher quality responses, and experienced fewer conversation interruptions while conversing with multiple different chatbots in the same session. We also found that the visual cues integrated with the interface helped the participants better understand the functionalities of the CTM mechanism, which enabled them to perceive changes in textual conversation, leading to better user satisfaction. Based on the empirical insights from our study, we discuss future research avenues for multi-agent chatbot design and its application for rich information elicitation.
