<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:DengXian;
panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
{font-family:"Arial Unicode MS";
panose-1:2 11 6 4 2 2 2 2 2 4;}
@font-face
{font-family:Aptos;
panose-1:2 11 0 4 2 2 2 2 2 4;}
@font-face
{font-family:"\@DengXian";
panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
{font-family:"\@Arial Unicode MS";
panose-1:2 11 6 4 2 2 2 2 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
font-size:10.0pt;
font-family:"Aptos",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:#467886;
text-decoration:underline;}
span.apple-converted-space
{mso-style-name:apple-converted-space;}
span.spelle
{mso-style-name:spelle;}
span.grame
{mso-style-name:grame;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;
mso-ligatures:none;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style>
</head>
<body lang="EN-US" link="#467886" vlink="#96607D" style="word-wrap:break-word">
<div class="WordSection1">
<div>
<div>
<div id="mail-editor-reference-message-container">
<div>
<div>
<div>
<p class="MsoNormal"><b><span style="font-size:18.0pt;color:#215F9A">Computational Research in Boston and Beyond (CRIBB) seminar</span></b><o:p></o:p></p>
<p class="MsoNormal" style="font-variant-caps:normal;orphans:auto;text-align:start;widows:auto;word-spacing:0px">
<span style="font-size:11.0pt;color:#212121"> </span><o:p></o:p></p>
<p class="MsoNormal" style="font-variant-caps:normal;orphans:auto;text-align:start;widows:auto;word-spacing:0px">
<b><span style="font-size:11.0pt;color:#212121">Date</span></b><span style="font-size:11.0pt;color:#212121">: Friday, April 4, 2025
</span><o:p></o:p></p>
<p class="MsoNormal" style="font-variant-caps:normal;orphans:auto;text-align:start;widows:auto;word-spacing:0px">
<span style="font-size:11.0pt;color:#212121"> </span><o:p></o:p></p>
<p class="MsoNormal" style="font-variant-caps:normal;orphans:auto;text-align:start;widows:auto;word-spacing:0px">
<b><span style="font-size:11.0pt;color:#212121">Time</span></b><span style="font-size:11.0pt;color:#212121">: 12:00 PM – 1:00 PM ET</span><o:p></o:p></p>
<p class="MsoNormal" style="font-variant-caps:normal;orphans:auto;text-align:start;widows:auto;word-spacing:0px">
<span style="font-size:11.0pt;color:#212121"> </span></p>
<div style="border:none;border-bottom:solid windowtext 1.0pt;padding:0in 0in 1.0pt 0in">
<p class="MsoNormal"><b><span style="font-size:11.0pt;color:#212121">Zoom</span></b><span style="font-size:11.0pt;color:#212121">:<span class="apple-converted-space"> </span><a href="https://mit.zoom.us/j/91933017072" title="https://mit.zoom.us/j/91933017072"><span style="font-size:12.0pt;color:#96607D">https://mit.zoom.us/j/91933017072</span></a></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;color:#212121"> </span></p>
<p class="MsoNormal"><b><span style="font-size:11.0pt;color:#212121">Add to calendar</span></b><span style="font-size:11.0pt;color:#212121">:
</span><span style="font-size:11.0pt"><a href="https://calendar.mit.edu/event/computational-research-in-boston-and-beyond-seminar-5622.ics">Apple (iCal)</a><span style="color:#212121">,
<a href="https://calendar.google.com/calendar/event?action=TEMPLATE&dates=20250404T160000Z%2F20250404T170000Z&details=Speaker%3A+Praneeth+Vepakomma+%28MIT%2FIDSS+and+MBZUAI%29%0A%0ATitle%3A+Extremely-efficient+fine-tuning+of+LLMs%0A%0AAbstract%3A%0A%0ALarge+Language+Models+%28LLMs%29+have+reshaped+generative+AI%2C+but+fully+fine-tuning+these+massive+architectures+is+quite+expensive+in+computational+and+communication+resources.+Low-Rank+Adaptation+%28LoRA%29+partially+mitigates+these+challenges%2C+yet+conventional+LoRA+often+struggles+to+match+the+performance+of+full+fine-tuning.+In+this+talk%2C+I+introduce+LoRA-SB+%28LoRA+Silver+Bullet%29%2C+a+novel+approach+that+injects+a+constrained+update+space+into+LoRA%E2%80%99s+framework%2C+enabling+optimal+scaling+for+high-rank+gradient+directions+that+mimic+full+fine-tuning+in+a+low-rank+space%2C+and+meets+the+performance+of+full+fine-tuning.+We+theoretically+prove+that+our+initialization+strategy+provides+an+optimal+low-rank+approximation+of+the+initial+gradient+and+preserves+critical+update+directions+throughout+training.+Extensive+experiments+on+mathematical+reasoning%2C+commonsense+inference%2C+and+language+understanding+tasks+show+that+LoRA-SB+exceeds+the+performance+of+standard+LoRA+while+requiring+27%E2%80%9390%C3%97+fewer+trainable+parameters+and+comprehensively+outperforms+LoRA-XS.+Our+findings+demonstrate+that+it+is+not+only+possible+but+also+highly+effective+to+simulate+full+fine-tuning+in+low-rank+subspaces%2C...%0A%0Ahttps%3A%2F%2Fcalendar.mit.edu%2Fevent%2Fcomputational-research-in-boston-and-beyond-seminar-5622&location=&sprop=website%3Acalendar.mit.edu&text=Computational+Research+in+Boston+and+Beyond+Seminar">
Google (online)</a>, <a href="https://calendar.mit.edu/event/computational-research-in-boston-and-beyond-seminar-5622.ics">
Outlook</a></span></span></p>
<p class="MsoNormal"><span style="font-size:5.0pt;color:#212121"> </span></p>
</div>
<p class="MsoNormal"><span style="font-size:5.0pt;color:#212121"> </span></p>
<p class="MsoNormal"><b><span style="font-size:11.0pt;color:#212121">Speaker</span></b><span style="font-size:11.0pt;color:#212121">: Praneeth
<span class="spelle">Vepakomma</span> <i>(MIT/IDSS and MBZUAI)</i></span><o:p></o:p></p>
<p class="MsoNormal" style="font-variant-caps:normal;orphans:auto;text-align:start;widows:auto;word-spacing:0px">
<span style="font-size:11.0pt;color:#212121"> </span><o:p></o:p></p>
<p class="MsoNormal" style="font-variant-caps:normal;orphans:auto;text-align:start;widows:auto;word-spacing:0px">
<b><span style="font-size:11.0pt;color:#212121">Title</span></b><span style="font-size:11.0pt;color:#212121">:
<span class="grame">Extremely-efficient</span> fine-tuning of LLMs</span><o:p></o:p></p>
<p class="MsoNormal" style="font-variant-caps:normal;orphans:auto;text-align:start;widows:auto;word-spacing:0px">
<span style="font-size:11.0pt;color:#212121"> </span></p>
<p class="MsoNormal"><b><span style="font-size:11.0pt;color:#212121">Abstract</span></b><span style="font-size:11.0pt;color:#212121">:<span class="apple-converted-space"> </span></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;color:#212121">Large Language Models (LLMs) have reshaped generative AI, but fully fine-tuning these massive architectures is quite expensive in computational and communication resources. Low-Rank Adaptation
(<span class="spelle">LoRA</span>) partially mitigates these challenges, yet conventional
<span class="spelle">LoRA</span> often struggles to match the performance of full fine-tuning. In this talk, I introduce
<span class="spelle"><b>LoRA</b></span><b>-SB (<span class="spelle">LoRA</span> Silver Bullet)</b>, a novel approach that injects a constrained update space into
<span class="spelle">LoRA’s</span> framework. This enables optimal scaling for high-rank gradient directions that mimic full fine-tuning in a low-rank space, and it matches the performance of full fine-tuning. We theoretically prove that our initialization strategy
provides an optimal low-rank approximation of the initial gradient and preserves critical update directions throughout training. Extensive experiments on mathematical reasoning, commonsense inference, and language understanding tasks show that
<span class="spelle">LoRA</span>-SB exceeds the performance of standard <span class="spelle">LoRA</span> while requiring 27–90× fewer trainable parameters, and that it comprehensively outperforms
<span class="spelle">LoRA</span>-XS. Our findings demonstrate that it is not only possible but also highly effective to simulate full fine-tuning in low-rank subspaces, offering significant efficiency gains at no loss in accuracy. Additionally, we introduce
<b>Fed-SB</b>, a federated extension of <span class="spelle">LoRA</span>-SB that employs direct averaging of the small matrix R to guarantee exact updates and drastically reduce communication costs, independent of the number of clients, by up to 230×. Fed-SB
further enhances privacy-utility-communication efficiency trade-offs by lowering noise requirements and avoiding noise amplification. Overall, it establishes a new Pareto frontier for efficient, scalable federated fine-tuning in both private and non-private
settings.</span></p>
<div style="border:none;border-bottom:solid windowtext 1.0pt;padding:0in 0in 1.0pt 0in">
<p class="MsoNormal"><span style="font-size:5.0pt;color:#212121"> </span></p>
</div>
<p class="MsoNormal"><span style="font-size:5.0pt;color:#212121"> </span><o:p></o:p></p>
<p class="MsoNormal" style="font-variant-caps:normal;orphans:auto;text-align:start;widows:auto;word-spacing:0px">
<span style="font-size:11.0pt;color:#212121">For information about the<span class="apple-converted-space"> </span><b>Computational Research in Boston and Beyond (CRIBB) seminar</b>, visit</span><span class="apple-converted-space"><span style="font-size:12.0pt;color:#212121"> </span></span><span style="font-size:11.0pt;color:#212121"><a href="https://math.mit.edu/crib/" title="https://math.mit.edu/crib/"><span style="font-size:12.0pt;color:#96607D">https://math.mit.edu/crib/</span></a></span><o:p></o:p></p>
<p class="MsoNormal" style="font-variant-caps:normal;orphans:auto;text-align:start;widows:auto;word-spacing:0px">
<span style="font-size:11.0pt;color:#212121"> </span><o:p></o:p></p>
<p class="MsoNormal" style="font-variant-caps:normal;orphans:auto;text-align:start;widows:auto;word-spacing:0px">
<span style="font-size:11.0pt;color:#212121">Best regards,</span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Ian </span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"> </span></p>
<div>
<div>
<p class="MsoNormal"><span style="font-size:11.0pt">--------</span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Ian Wang, he/him</span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Faculty Support</span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;color:black">Department of Mathematics, MIT</span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Phone: (617) 258-6283</span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Office: 2-372</span></p>
<p class="MsoNormal"><span style="font-size:12.0pt"> </span></p>
</div>
</div>
<p class="MsoNormal"><span style="font-size:12.0pt"> </span></p>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</body>
</html>