Rate limiting Kerberos Requests

Jack Neely jjneely at ncsu.edu
Wed Sep 26 14:25:18 EDT 2012


On Tue, Sep 25, 2012 at 02:08:29PM -0700, Russ Allbery wrote:
> Jack Neely <jjneely at ncsu.edu> writes:
> 
> > Thanks for reading between the lines.  I don't have evidence that my
> > KDCs were overloaded, yet I got quite a few cannot reach KDC errors and
> > logins stopped working everywhere.
> 
> > The slaves are HP G7 blades with 12GB of RAM and a 6 core Intel Xeon.  2
> > servers in one DC and the other slave (and master) in the other DC.
> > Each DC has its own firewall/vlan for the kerberos servers.  RHEL 5
> > running kerb 1.6.1.
> 
> > My network engineers tell me that the firewall in one DC had 8000
> > concurrent connections from the offending IP address to the KDCs and
> > 4000 in the second DC.  (Oddly, the DC with only 1 slave.)  The KDCs
> > weren't able to handle other requests until the spike settled.
> 
> Is it possible that, rather than overwhelming the KDC, you instead
> overwhelmed the UDP session table on your firewall?  Sometimes firewalls
> have surprisingly small UDP session tables, which can cause serious
> problems for Kerberos and for DNS servers.
> 
> You're right that there are ill-behaved Kerberos applications that will
> spam authentication requests, but I tend to think of this as similar to the
> problem with DNS, where there are ill-behaved resolvers that do the same
> thing.  Fixing them tends to be really hard, but answering Kerberos
> requests should normally be extremely fast.  It's usually easier to just
> ensure you can handle the load spikes than worry too much about fixing all
> the broken clients.  (Of course, the rate limiting path that you're going
> down is one way to do that.)
> 
> We were quite concerned when we first looked at putting Kerberos KDCs
> behind a hardware firewall because of that session limit.  Our firewalls
> have a 100,000 UDP session limit and a fairly quick timeout.  One tuning
> that you can do on the hardware firewall, if that is the problem, is to
> reduce the UDP session length for Kerberos KDC traffic.  You're either
> going to get a reply and complete the transaction in under a minute (in
> practice, under 10 seconds) or it's never going to work anyway, so if, for
> example, your firewall is trying to remember sessions for an hour, you're
> just wasting memory and possibly DoSing your firewall.

After spending some quality time with my logs, I see about 1.3 million
Kerberos requests a day, or 960/min on average.  The incident that took
out the Kerberos servers added 600 hits/min (from the krb logs), which
doesn't even register as a spike on my graphs.  My normal late-morning
usage is higher.
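
For a rough sense of scale, the following Python sketch turns those
rates into firewall session-table numbers.  It is only a
back-of-the-envelope estimate: it assumes every request creates its own
UDP session entry and that each entry is held for the full timeout.  The
timeout values are hypothetical (the firewalls' real settings are not
given in this thread), and the function names are made up for
illustration.

```python
# Back-of-the-envelope estimate only. Rates come from this thread;
# the timeouts below are hypothetical, not measured firewall settings.

def per_second(per_minute):
    """Convert a per-minute rate to a per-second rate."""
    return per_minute / 60.0

def steady_state_sessions(requests_per_second, timeout_seconds):
    """Little's law: entries held = arrival rate * time each entry is kept.

    Assumes each request opens a new session entry (for example, a new
    source port) and that the entry lives for the whole timeout.
    """
    return requests_per_second * timeout_seconds

def implied_hold_time(sessions, requests_per_second):
    """Inverse of the above: how long entries would have to be kept for
    the given rate to accumulate the given number of sessions."""
    return sessions / requests_per_second

baseline_per_min = 960  # average rate from the KDC logs
spike_per_min = 600     # extra rate during the incident, from the krb logs

for timeout in (10, 60, 3600):  # hypothetical UDP session timeouts, seconds
    base = steady_state_sessions(per_second(baseline_per_min), timeout)
    extra = steady_state_sessions(per_second(spike_per_min), timeout)
    print(f"timeout {timeout:>4}s: baseline ~{base:,.0f} entries, "
          f"spike adds ~{extra:,.0f}")

# If (and only if) all 600 hits/min crossed the firewall that reported
# 8000 concurrent connections, entries were being held for about:
print(f"implied hold time: ~{implied_hold_time(8000, per_second(spike_per_min)):,.0f}s")
```

Under those assumptions, the 16/s baseline would occupy about 160
entries with a 10-second timeout and about 57,600 with a one-hour
timeout, so the session timeout matters far more than the request rate
itself.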

So there's another piece to the puzzle.

Jack
-- 
Jack Neely <jjneely at ncsu.edu>
Linux Czar, OIT Campus Linux Services
Office of Information Technology, NC State University
GPG Fingerprint: 1917 5AC1 E828 9337 7AA4  EA6B 213B 765F 3B6A 5B89

