[Crib-list] Computational Research in Boston and Beyond (CRIBB) seminar - Praneeth Vepakomma (MIT/IDSS and MBZUAI) - April 4

Ian Wang mjwang79 at mit.edu
Fri Mar 28 09:01:44 EDT 2025


Computational Research in Boston and Beyond (CRIBB) seminar

Date: Friday, April 4, 2025

Time: 12:00 PM – 1:00 PM

Zoom: https://mit.zoom.us/j/91933017072

Add to calendar (iCal/Outlook): <https://calendar.mit.edu/event/computational-research-in-boston-and-beyond-seminar-5622.ics>


Speaker: Praneeth Vepakomma (MIT/IDSS and MBZUAI)

Title: Extremely-efficient fine-tuning of LLMs

Abstract:
Large Language Models (LLMs) have reshaped generative AI, but fully fine-tuning these massive architectures is prohibitively expensive in computational and communication resources. Low-Rank Adaptation (LoRA) partially mitigates these costs, yet conventional LoRA often struggles to match the performance of full fine-tuning. In this talk, I introduce LoRA-SB (LoRA Silver Bullet), a novel approach that injects a constrained update space into LoRA's framework, enabling optimal scaling of high-rank gradient directions that mimic full fine-tuning in a low-rank space while matching its performance. We theoretically prove that our initialization strategy provides an optimal low-rank approximation of the initial gradient and preserves critical update directions throughout training. Extensive experiments on mathematical reasoning, commonsense inference, and language understanding tasks show that LoRA-SB exceeds the performance of standard LoRA while requiring 27–90× fewer trainable parameters, and comprehensively outperforms LoRA-XS. Our findings demonstrate that it is not only possible but also highly effective to simulate full fine-tuning in low-rank subspaces, offering significant efficiency gains at no loss in accuracy.

Additionally, we introduce Fed-SB, a federated extension of LoRA-SB that directly averages the small matrix R to guarantee exact aggregated updates and to reduce communication costs, independent of the number of clients, by up to 230×. Fed-SB further improves the privacy-utility-communication efficiency trade-off by lowering noise requirements and avoiding noise amplification. Overall, it establishes a new Pareto frontier for efficient, scalable federated fine-tuning in both private and non-private settings.
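The core ideas in the abstract can be sketched numerically. The following is an illustrative toy example, not the authors' implementation: all dimensions and variable names are hypothetical. It shows (a) an update of the form B R A in which the outer factors come from a rank-r SVD of the initial gradient (the optimal low-rank approximation, by Eckart-Young) and only the tiny r×r core R is trained, and (b) why, with B and A shared across clients, averaging each client's R yields exactly the averaged update, the exactness property the Fed-SB description relies on.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, r = 64, 32, 4                       # toy layer sizes and adapter rank (hypothetical)

W = rng.standard_normal((d, k))           # frozen pretrained weight
G0 = rng.standard_normal((d, k))          # stand-in for the initial full-fine-tuning gradient

# Rank-r truncated SVD of G0 gives its best rank-r approximation (Eckart-Young),
# mirroring the "optimal low-rank approximation of the initial gradient" claim.
U, S, Vt = np.linalg.svd(G0, full_matrices=False)
B = U[:, :r]                              # frozen left factor, d x r
A = Vt[:r, :]                             # frozen right factor, r x k
R = np.diag(S[:r])                        # trainable r x r core

# Effective weight seen by the model: only R would receive gradient updates.
W_eff = W + B @ R @ A

# Only r*r parameters are trainable, versus d*k for full fine-tuning.
print(r * r, d * k)                       # 16 vs 2048

# Federated averaging of R is exact: since B and A are shared and fixed,
# mean_i(B @ R_i @ A) == B @ mean_i(R_i) @ A, independent of the client count.
R_clients = [R + 0.01 * rng.standard_normal((r, r)) for _ in range(3)]
R_avg = sum(R_clients) / 3
avg_update = sum(B @ Ri @ A for Ri in R_clients) / 3
assert np.allclose(avg_update, B @ R_avg @ A)
```

Because the averaging identity is exact, each client only needs to communicate its r×r matrix (16 numbers here) rather than a d×k update (2048 numbers), which is the source of the communication savings the abstract quantifies.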


For information about the Computational Research in Boston and Beyond (CRIBB) seminar, visit https://math.mit.edu/crib/

Best regards,
Ian

--------
Ian Wang, he/him
Faculty Support
Department of Mathematics, MIT
Phone: (617)258-6283
Office: 2-372



