Kerberos, DNS and AAAA records

Thu May 21 10:43:05 EDT 2009

On May 21, 2009, at 10:11, james bardin wrote:
> Doing a kinit on my box, just ran 73 dns queries! If there's a problem
> affecting dns, this severely impacts some systems. Also, a large bulk
> of these are AAAA queries, with the domain name appended twice. The
> first AAAA query is sent with the trailing '.', so I'm not sure why
> there is a second attempt for domain.domain.

This is probably a result of specifying KDC names in krb5.conf without  
the trailing ".", the standard notation for indicating a fully- 
qualified name.  If the trailing dot isn't included, typically the DNS  
library software will follow the DNS search path (which in the typical  
case just involves appending the local domain, but more than one  
domain can be given) trying to "complete" the name and find the first  
one that exists.  So sometimes you can see queries for  
"host.foo.com.foo.com" resulting from having "host.foo.com" (no  
trailing dot) given.  If you add the trailing dot in the config files,  
that should fix that problem.  If you use DNS SRV records instead of  
config file entries, the hostnames there are defined to be fully- 
qualified, so again this problem should go away -- though you've added  
back some DNS queries for the SRV records.
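
As a rough illustration of the search-path behavior described above,
here is a minimal Python sketch.  The function name candidate_names
and the simplified ordering (name as given first, then each search
domain) are assumptions made for illustration; real resolver
libraries differ in the details.

```python
def candidate_names(name, search_domains):
    """Return the names a stub resolver might try, in order.

    Simplified model (an assumption, not any particular library):
    a trailing dot marks the name as fully qualified, so it is tried
    as-is and the search list is skipped.  Otherwise the name is tried
    as given, then with each search domain appended.
    """
    if name.endswith("."):
        return [name]
    return [name + "."] + [f"{name}.{d}." for d in search_domains]


# Without the trailing dot, the doubled-domain name shows up.
print(candidate_names("host.foo.com", ["foo.com"]))
# With the trailing dot, only one name is tried.
print(candidate_names("host.foo.com.", ["foo.com"]))
```

In this model the first call yields both "host.foo.com." and
"host.foo.com.foo.com.", while the second yields only the name as
written.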

There have also been bugs in some of the getaddrinfo() library  
implementations where the searches for A and AAAA records are done  
completely independently, and each search is done until some name  
returns data.  That means if "host.foo.com" has an A record, the
search for A records stops there, but if "host.foo.com" doesn't have
an AAAA record, "host.foo.com.foo.com" (and maybe
"host.foo.com.searchdomain2.com" and "host.foo.com.searchdomain3.com")
may also get looked up.  It *should* stop at the first name that
returns *either* an A or AAAA record (in the cases where both address  
types are wanted, which is the case in most but not all of the MIT  
krb5 code).  If that's not the behavior you're seeing, you might want  
to file a bug report, in addition to adding the trailing dot in the  
config file or switching to SRV records.
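
To make the difference between the two search strategies concrete,
here is a small Python simulation.  It does not call any real
resolver: RECORDS, query, independent_search and joint_search are all
hypothetical names, and the zone data (a single A record at a
documentation address) is made up for the example.

```python
# Hypothetical zone data: only an A record exists for host.foo.com.
RECORDS = {("host.foo.com.", "A"): ["192.0.2.10"]}


def query(name, rtype, log):
    """Pretend DNS query; records each lookup in log."""
    log.append((name, rtype))
    return RECORDS.get((name, rtype), [])


def independent_search(candidates, log):
    """Buggy pattern: A and AAAA each walk the list until they hit data."""
    found = []
    for rtype in ("A", "AAAA"):
        for name in candidates:
            answers = query(name, rtype, log)
            if answers:
                found += answers
                break
    return found


def joint_search(candidates, log):
    """Expected pattern: stop at the first name with either record type."""
    for name in candidates:
        answers = query(name, "A", log) + query(name, "AAAA", log)
        if answers:
            return answers
    return []


candidates = ["host.foo.com.", "host.foo.com.foo.com."]
buggy_log, joint_log = [], []
independent_search(candidates, buggy_log)
joint_search(candidates, joint_log)
print(len(buggy_log), len(joint_log))
```

With one search domain the independent version issues three queries,
including an AAAA query for "host.foo.com.foo.com.", while the joint
version issues two.  Each extra search domain adds another wasted
AAAA query in the independent case.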

There are probably also cases where we look up names more times than  
is necessary, because of the current structure of the code.  In many  
system configurations a local name service cache makes this reasonably  
efficient anyway.  But we should try to eliminate some of that
redundancy.  We do have an in-library cache for some DNS data in case  
there's no local cache, but I think it's currently only enabled for  
some platforms, and I don't think it caches negative responses.
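
To show why negative responses matter here, the following Python
sketch caches empty results as well as successful ones.  It is not a
description of the cache in the MIT krb5 library; the LookupCache
class, its fixed TTL and the fake resolver are all assumptions made
for illustration.

```python
import time


class LookupCache:
    """Tiny name cache that also remembers 'no data' answers."""

    def __init__(self, resolver, ttl=60.0, clock=time.monotonic):
        self._resolver = resolver
        self._ttl = ttl
        self._clock = clock
        self._entries = {}  # name -> (expiry time, tuple of addresses)

    def lookup(self, name):
        now = self._clock()
        hit = self._entries.get(name)
        if hit is not None and hit[0] > now:
            return hit[1]
        addresses = tuple(self._resolver(name))
        # An empty tuple is stored too: that is the negative entry.
        self._entries[name] = (now + self._ttl, addresses)
        return addresses


calls = []


def fake_resolver(name):
    """Hypothetical resolver: only host.foo.com. has an address."""
    calls.append(name)
    return ["192.0.2.10"] if name == "host.foo.com." else []


cache = LookupCache(fake_resolver)
for _ in range(3):
    cache.lookup("host.foo.com.")
    cache.lookup("host.foo.com.foo.com.")
print(calls)
```

Each name reaches the resolver only once, even though the second name
never returns data.  Without the negative entry, the failing name
would be queried again on every pass.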

> Why does every kerberos call need to lookup every kdc in the config
> file, and not just the server which is going to be queried, and is
> this configurable?

It's not going to talk to only one of them; it'll go through the list
repeatedly, trying each until it gets an answer, or times out.  Again,
it's a matter of the structure of the code -- we get a list of  
addresses and then loop over the list.  We could restructure it to  
look up the address when first needed, i.e., the first time we try to  
reach each server, but that'll add complexity to already complicated  
routines.  (We juggle multiple file descriptors, so if we don't get a  
response back promptly from the first server address and decide to go  
try the next one, we still keep listening for a response from the  
first in case it's just slow and not actually unreachable.  But that  
means we're eventually managing up to one file descriptor per server  
address, some UDP, some TCP.)  There are also asynchronous name-lookup
techniques, but I think the most portable versions require
multithreading support and creation of threads, which we don't
currently require of the OS and application.
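
The "look up the address when first needed" restructuring could look
roughly like the Python sketch below.  It is deliberately simplified:
one sequential pass, with no timeouts and no listening on earlier
sockets while trying later ones, so it leaves out exactly the
complexity described above.  The try_servers function and the resolve
and send callables are hypothetical, not part of the krb5 code.

```python
def try_servers(servers, resolve, send):
    """Contact servers in order, resolving each name only when reached.

    resolve(name) returns a list of addresses; send(addr) returns a
    reply or None.  Both are hypothetical callables from the caller.
    """
    for name in servers:
        for addr in resolve(name):
            reply = send(addr)
            if reply is not None:
                return reply
    return None


resolved = []


def resolve(name):
    """Fake resolver that records which names were looked up."""
    resolved.append(name)
    return ["192.0.2.1"] if name == "kdc1.foo.com." else ["192.0.2.2"]


def send(addr):
    """Fake transport in which every address answers."""
    return "reply from " + addr


reply = try_servers(["kdc1.foo.com.", "kdc2.foo.com."], resolve, send)
print(reply, resolved)
```

Because the first server answers, the second name is never resolved.
The lookup cost for the remaining servers is only paid if the earlier
ones fail to respond.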

-- 
Ken Raeburn / raeburn at mit.edu / no longer at MIT Kerberos Consortium