Kerberos, DNS and AAAA records

Thu May 21 15:28:27 EDT 2009

On May 21, 2009, at 13:25, Ravi Channavajhala wrote:
> I maintain a rather large site, where there are more than a dozen KDCs
> across different locations.  Recently, I configured Windows 2003-R2/AD
> as the central source of authentication for a lot of Linux and Unix
> servers.  The issue I'm facing here is the user logons are really
> slow.  Capturing network traffic and looking at it, reveals the above
> behavior.  Now, can you please help me understand what you mean by
> "going through list repeatedly"?  Does this mean the querying is done
> simultaneously to several KDCs in parallel?

The simple version is, we fire off a UDP message to one KDC, and after  
one second if we haven't heard back we assume the KDC is probably  
unreachable or offline and send a message to the next KDC, and so on.   
However, in case the KDC is just being slow, we keep listening for  
responses even after we've moved on to the next KDC, unless we get  
back some kind of "port unreachable" indication.  In case we're having  
connectivity or packet-loss problems, we make a total of three passes  
through the list, with a little delay in between passes, resending UDP  
messages each time through.  For TCP, it's a little different -- we  
tell the kernel to start a connection in non-blocking mode, and if it  
connects, we start sending data, but the "quit" condition for getting  
out of the loop is successfully sending all of the data and getting a  
complete response back.  In the passes after the first, we don't do  
anything new for TCP, just keep trying to send and receive data.
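
A rough sketch of the UDP side of that loop, for the case where no KDC ever answers. This is a simulation, not the library's code: the function name and its parameters are made up for illustration, and the delays after each pass (2s, 4s, 8s) are an assumption chosen to add up to the 14 seconds of backoff that three passes are said to cost. TCP handling is not modeled.

```python
def udp_send_schedule(kdcs, passes=3, per_server_wait=1.0):
    """Return (schedule, total_seconds) when no KDC ever responds.

    schedule is a list of (time_offset_seconds, pass_number, kdc) tuples,
    one per UDP message sent.
    """
    schedule = []
    now = 0.0
    for pass_number in range(1, passes + 1):
        for kdc in kdcs:
            schedule.append((now, pass_number, kdc))
            # Wait one second for a reply before moving to the next KDC.
            now += per_server_wait
        # Assumed backoff after each pass: 2s, 4s, 8s, ...
        now += 2 ** pass_number
    return schedule, now


if __name__ == "__main__":
    schedule, total = udp_send_schedule(["kdc1", "kdc2"])
    for offset, pass_number, kdc in schedule:
        print(f"t={offset:4.0f}s  pass {pass_number}  send to {kdc}")
    print(f"give up after {total:.0f}s")
```

With two unreachable KDCs, the sends happen at 0, 1, 4, 5, 10 and 11 seconds, and the whole attempt is abandoned after 20 seconds.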

Once the name resolution is done, the worst-case timeout for the
overall operation (if there are no responses at all, not even
host-unreachable errors) should be as follows, according to the
comment in src/lib/krb5/os/sendto_kdc.c:

  * Per UDP server, 1s per pass.
  * Per TCP server, 1s.
  * Backoff delay, 2**(P+1) - 2, where P is total number of passes.
  *
  * Total = 2**(P+1) + U*P + T - 2.
  *
  * If P=3, Total = 3*U + T + 14.
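
As a worked example of the formula above, here is a small helper that evaluates it. It reads U as the number of UDP server entries and T as the number of TCP server entries, which is an interpretation of the comment rather than something it spells out; the function name is hypothetical.

```python
def worst_case_timeout(udp_servers, tcp_servers, passes=3):
    """Worst-case seconds before giving up, per the quoted comment.

    Total = 2**(P+1) + U*P + T - 2
    """
    backoff = 2 ** (passes + 1) - 2          # 14s when passes == 3
    return backoff + udp_servers * passes + tcp_servers


if __name__ == "__main__":
    # A dozen KDCs, each reachable over both UDP and TCP, none answering.
    print(worst_case_timeout(12, 12))        # 62
```

So with a dozen KDCs that are all silent, a single request could take about a minute to fail, before counting any time spent on DNS.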

Of course, if getting the DNS data takes us a long time, that may  
dominate, and we haven't done much to improve that, though I've  
thought about some of these issues before.

It's a somewhat clunky mechanism that hasn't really been tuned using  
real network data, but most of the time it seems to do okay.  We have  
had complaints that we should be able to impose an overall total  
timeout.  And doing the serialized DNS queries before any of the
connection attempts is occasionally a problem at some sites.  Like I said
earlier, we can look at integrating the DNS lookups and the contact  
loops so each host address is looked up when we first want to contact  
it, but only if the extra complexity is really going to help.  Doing  
DNS queries asynchronously does not seem to be something we can do  
portably at the moment.

We don't want to actually fire off all the queries at the same time,
because usually a nearby KDC *will* respond quickly.  Sending off all
the queries at once would make all the KDCs do the same work each time,
even though only one response is needed, eliminating any
load-sharing benefit.

If SRV records are used, hosts listed with equal priority are used in  
random order, as per the spec, since the Kerberos library has no  
additional information for sorting them by proximity.  We also have no  
hooks at the moment for figuring out and recording how responsive any  
given KDC is, to optimize later queries.  (A patch to allow some  
simple optimizations might be acceptable to MIT.  Some possible  
heuristics: Scan the local network interfaces and put anything on a  
directly-attached network ahead of anything further away.  Check an  
entry in the config file for network blocks listed in priority order,  
e.g., "18.0.0.0/8 2001:4830:2446::/48", so the local site can be  
described.  Or, you can try using the service-location plugin  
interface in the library to provide code to order things however you  
like, and maybe experiment with some heuristics without having to  
recompile the krb5 libraries.)
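
To make the ordering ideas above concrete, here is a minimal sketch of two of them: shuffling SRV targets within each priority level, and ranking addresses by a site-supplied list of network blocks. Nothing here is part of the krb5 library; the function names are hypothetical, SRV weights are ignored for simplicity, and the network blocks are the ones used as an example above.

```python
import ipaddress
import random


def order_srv_targets(records, rng=random):
    """Order (priority, host) pairs: lowest priority value first,
    hosts of equal priority in random order."""
    by_priority = {}
    for priority, host in records:
        by_priority.setdefault(priority, []).append(host)
    ordered = []
    for priority in sorted(by_priority):
        group = list(by_priority[priority])
        rng.shuffle(group)
        ordered.extend(group)
    return ordered


def order_by_network_preference(addresses, preferred_blocks):
    """Put addresses in earlier-listed blocks first; addresses that
    match no block go last. Ties keep their original order."""
    networks = [ipaddress.ip_network(block) for block in preferred_blocks]

    def rank(addr_text):
        addr = ipaddress.ip_address(addr_text)
        for index, net in enumerate(networks):
            if addr.version == net.version and addr in net:
                return index
        return len(networks)

    return sorted(addresses, key=rank)


if __name__ == "__main__":
    blocks = ["18.0.0.0/8", "2001:4830:2446::/48"]
    addrs = ["192.0.2.10", "18.7.7.7", "2001:4830:2446::1", "18.1.2.3"]
    print(order_by_network_preference(addrs, blocks))
```

Because the sort is stable, a heuristic like this can be layered on top of whatever order the SRV lookup or config file already produced.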

You could set up the config files differently at each location,  
putting the nearby KDCs at the top of the list, and maybe only listing  
some of the others.  You could define a DNS name that maps to multiple  
addresses for several KDCs and list that as a lower-priority KDC to  
use as a fallback; that'll reduce the number of DNS queries needed,  
but if you've got local name servers at each site, caching should make  
the name lookups reasonably efficient.  You could also play games with  
anycast addresses to find the nearest KDC out of a set (either all  
your KDCs, or broken into two or three subsets if there are a lot),  
though that's probably serious overkill.
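
As an illustration of the per-location config idea, a krb5.conf fragment for one site might look something like the following. The realm and host names are placeholders, not taken from any real deployment: the first two entries stand for the nearby KDCs, and the last is a hypothetical DNS name that resolves to the addresses of several remote KDCs, listed last as a fallback.

```
[realms]
    EXAMPLE.COM = {
        kdc = kdc1.site-a.example.com
        kdc = kdc2.site-a.example.com
        kdc = kdc-fallback.example.com
    }
```

Each location would get its own copy with its own local KDCs at the top.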

> Also, we don't use SRV/TXT for kdc/realm identification in DNS and I
> don't explicitly specify the dns_lookup in the krb5.conf.  In this
> context the dns_fallback automatically gets enabled, I'm thinking.
> What is the consequence of dns_fallback defaulting to yes?

If you don't explicitly specify KDCs for a realm, then DNS SRV records  
will be looked up.  If you do specify the KDCs, then SRV records won't  
be used; only those KDCs will be used, and they'll be tried in the  
order you indicate in the file.

-- 
Ken Raeburn / raeburn at mit.edu / no longer at MIT Kerberos Consortium