Rate limiting Kerberos Requests
Russ Allbery
rra at stanford.edu
Tue Sep 25 17:08:29 EDT 2012
Jack Neely <jjneely at ncsu.edu> writes:
> Thanks for reading between the lines. I don't have evidence that my
> KDCs were overloaded, yet I got quite a few cannot reach KDC errors and
> logins stopped working everywhere.
> The slaves are HP G7 blades with 12GB of RAM and a 6 core Intel Xeon. 2
> servers in one DC and the other slave (and master) in the other DC.
> Each DC has its own firewall/vlan for the kerberos servers. RHEL 5
> running kerb 1.6.1.
> My network engineers tell me that the firewall in one DC had 8000
> concurrent connections from the offending IP address to the KDCs and
> 4000 in the second DC. (Oddly, the DC with only 1 slave.) The KDCs
> weren't able to handle other requests until the spike settled.
Is it possible that, rather than overwhelming the KDC, you instead
overwhelmed the UDP session table on your firewall? Sometimes firewalls
have surprisingly small UDP session tables, which can cause serious
problems for Kerberos and for DNS servers.
You're right that there are ill-behaved Kerberos applications that will
spam authentication requests, but I tend to think of this as similar to the
problem with DNS, where there are ill-behaved resolvers that do the same
thing. Fixing them tends to be really hard, but answering Kerberos
requests should normally be extremely fast. It's usually easier to just
ensure you can handle the load spikes than worry too much about fixing all
the broken clients. (Of course, the rate limiting path that you're going
down is one way to do that.)
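
For reference, one common way to rate limit per client is a token bucket
keyed on source address. The following is only a minimal sketch of that
general technique, not anything a KDC or firewall provides out of the box.
The names TokenBucket and allow_request and the rate and burst values are
hypothetical and chosen purely for illustration.

```python
import time


class TokenBucket:
    """Token bucket: refills at `rate` tokens/second, holds at most `burst`."""

    def __init__(self, rate, burst, now=None):
        self.rate = float(rate)
        self.burst = float(burst)
        self.tokens = float(burst)  # start full so a normal client is never delayed
        self.last = time.monotonic() if now is None else now

    def allow(self, now=None):
        """Return True if one request may pass, consuming a token."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.last)
        # Refill for the time that has passed, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# One bucket per source address (hypothetical in-memory table).
buckets = {}


def allow_request(client_ip, rate=50, burst=100, now=None):
    """Decide whether a request from client_ip should be passed or dropped."""
    bucket = buckets.get(client_ip)
    if bucket is None:
        bucket = buckets[client_ip] = TokenBucket(rate, burst, now)
    return bucket.allow(now)
```

A well-behaved client never notices the limit, while a client that spams
requests is held to the refill rate once its burst allowance is used up.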
We were quite concerned when we first looked at putting Kerberos KDCs
behind a hardware firewall because of that session limit. Our firewalls
have a 100,000 UDP session limit and a fairly quick timeout. One tuning
change you can make on the hardware firewall, if that is the problem, is to
reduce the UDP session timeout for Kerberos KDC traffic. You're either
going to get a reply and complete the transaction in under a minute (in
practice, under 10 seconds) or it's never going to work anyway, so if, for
example, your firewall is trying to remember sessions for an hour, you're
just wasting memory and possibly DoSing your firewall.
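
To put rough numbers on the session timeout argument above, here is a small
back-of-the-envelope sketch. It assumes, as a simplification, that every
request arrives from a new source port and so creates a new session entry,
and that each entry is held for the full timeout. Under those assumptions
the table holds roughly (new sessions per second) x (timeout in seconds)
entries. The function names are made up for this illustration. The 100,000
limit and the three timeouts are the figures mentioned above.

```python
def steady_state_sessions(new_sessions_per_sec, timeout_sec):
    """Approximate number of table entries held at a constant arrival rate."""
    return new_sessions_per_sec * timeout_sec


def max_sustainable_rate(table_size, timeout_sec):
    """Highest new-session rate that still fits in a table of table_size."""
    return table_size / timeout_sec


TABLE_SIZE = 100_000  # UDP session limit quoted in this message

# One hour, one minute, and ten seconds, the timeouts discussed above.
for timeout in (3600, 60, 10):
    rate = max_sustainable_rate(TABLE_SIZE, timeout)
    print(f"timeout {timeout:>4}s: table fills at about {rate:.0f} new sessions/sec")
```

With a one hour timeout the table fills at about 28 new sessions per
second. At 60 seconds it takes about 1,667 per second, and at 10 seconds
it takes 10,000 per second.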
--
Russ Allbery (rra at stanford.edu) <http://www.eyrie.org/~eagle/>