Slow response with multiple KDCs
Ken Raeburn
raeburn at MIT.EDU
Mon Sep 18 18:04:48 EDT 2006
On Sep 18, 2006, at 17:15, petesea at bigfoot.com wrote:
> My Kerberos admins recently changed all the KDCs in our realm and
> started distributing a new standard krb5.conf file. Now... instead of
> taking < 1 sec to get a password prompt from "kinit", it takes 40-50
> secs.
>
> The old file lists 6 KDCs using IP addresses instead of hostnames.
> The new file lists 10 KDCs using hostnames... so obviously it has
> something to do with DNS.
>
> Using krb5-1.4.4, and running strace on kinit, it appears to be doing
> multiple DNS requests for EVERY KDC listed in the krb5.conf file.
> This seems to be why it takes so long. In fact... it looks like for
> 10 KDCs, "kinit" ends up making 316 DNS requests.
That's... fascinating....
> Why does it make so many requests? Why does it make DNS requests for
> ALL the KDCs even if the first one returned results? Is this a
> function of the kerberos library itself or something else?
You (understandably) didn't include the 316 requests in your message,
but I'll take a shot:
It's probably doubling queries, once in the "look for UDP-accessible
KDCs" path and once in the "look for TCP-accessible KDCs" path (which
don't have to return the same results if the data is coming from DNS
SRV records, but SRV-vs-krb5.conf isn't known at that level).
It may be doubling the address queries per name, one A for IPv4
addresses and one AAAA for IPv6 addresses. (An interesting, um,
quirk, that I've seen in one implementation: If you don't have an
AAAA record for foo.example.com, and you don't put a trailing dot on
the end of the name you look up, it may then look up
foo.example.com.example.com, even though you have other data -- like
an A record -- for foo.example.com. So we could be looking at a
multiplier of three here, or more if you use a domain search list.)
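To make the multipliers above concrete, here is a rough worst-case count, sketched in Python. It assumes a resolver that tries every search-list candidate for both A and AAAA whenever the name has no trailing dot; real resolvers differ (ndots settings, stopping at the first answer, caching), so treat the result as an upper bound for illustration rather than a prediction. The function names and the example.com hosts are made up for the example.

```python
def candidate_queries(name, search_domains, record_types=("A", "AAAA")):
    """List the (qname, qtype) pairs a stub resolver might try, worst case.

    Assumption: a name ending in "." is treated as fully qualified and is
    never combined with the search list; any other name is tried as-is and
    then with each search domain appended.
    """
    if name.endswith("."):
        qnames = [name]
    else:
        qnames = [name + "."] + [f"{name}.{d}." for d in search_domains]
    return [(q, t) for q in qnames for t in record_types]


def worst_case_total(hosts, search_domains, passes=2):
    """Total queries if every host is resolved once per pass (UDP, TCP)."""
    return passes * sum(len(candidate_queries(h, search_domains)) for h in hosts)


if __name__ == "__main__":
    search = ["example.com"]  # hypothetical search list
    plain = [f"kdc{i}.example.com" for i in range(1, 11)]
    dotted = [h + "." for h in plain]
    print(worst_case_total(plain, search))   # 80
    print(worst_case_total(dotted, search))  # 40
```

With ten KDCs this gives 80 possible queries without trailing dots and 40 with them; a longer search list pushes the first number up further.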
At one point, the library may try to look up the "master KDC" (so if
you get an "incorrect password" type result but were talking to a
slave KDC that may not have your password change from 30 seconds ago,
it then tries a KDC that would have it); offhand, I'm not sure how
many DNS queries that's likely to generate. Here at MIT, we've got a
SRV record for _kerberos-master._udp.athena.mit.edu listing one host,
so we do get one additional lookup for that name. (Oddly, we don't
get two, for A and AAAA; I should look at why that is.)
Try adding "." to the end of your hostnames in krb5.conf, assuming
they're all FQDNs, and see if that helps.
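For illustration, a [realms] stanza with trailing dots might look something like the following. The realm and host names are placeholders, not taken from your file.

```
[realms]
    EXAMPLE.COM = {
        # Trailing dot marks each name as fully qualified, so the
        # resolver should not append the domain search list.
        kdc = kdc1.example.com.
        kdc = kdc2.example.com.
    }
```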
Our library is currently set up to collect all the addresses, and
then start trying to contact servers. Integrating these loops better
has been on the to-do list for a while, but will be kind of ugly.
(We don't want to try one, time out, give up, look up the next
address, try it, etc. We want to fail over to the next KDC fairly
quickly, but we also want to keep listening for responses to earlier
attempts, in case it's just a slow KDC or slow network.)
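The trade-off in that parenthetical can be shown with a toy model in Python. This is not the library's code, only a simulation under simple assumptions: each server has a fixed reply delay in seconds (or None if it never answers), the "staggered" client contacts the next server every `stagger` seconds while still listening to earlier ones, and the "sequential" client abandons a server once `timeout` expires. All names and numbers are hypothetical.

```python
def staggered_first_reply(reply_delays, stagger):
    """Contact server i at time i*stagger, but keep listening to all
    earlier servers. Return (server_index, time) of the first reply,
    or None if nobody answers."""
    best = None
    for i, delay in enumerate(reply_delays):
        start = i * stagger
        if best is not None and best[1] <= start:
            break  # a reply already arrived before we would contact server i
        if delay is not None:
            arrival = start + delay
            if best is None or arrival < best[1]:
                best = (i, arrival)
    return best


def sequential_first_reply(reply_delays, timeout):
    """Try one server, give up after `timeout`, then move to the next.
    Replies arriving after the timeout are ignored."""
    start = 0.0
    for i, delay in enumerate(reply_delays):
        if delay is not None and delay <= timeout:
            return (i, start + delay)
        start += timeout
    return None


if __name__ == "__main__":
    delays = [1.2, 0.5]  # first KDC is slow but alive
    print(staggered_first_reply(delays, stagger=1.0))   # (0, 1.2)
    print(sequential_first_reply(delays, timeout=1.0))  # (1, 1.5)
```

In this example the slow first KDC still wins under the staggered scheme because its late reply is not thrown away, while the sequential scheme has already given up on it.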
We do generally assume that local caching will make the occasional
repeated DNS query fast. But 316? Ow.
If you want to send me your config file and strace output (off-list,
I suggest, and run with -s99 or something like that to record the
full query packet), I can take a closer look.
> I've tried setting the following in krb5.conf, but they don't seem
> to make
> a difference:
>
> dns_lookup_realm = false
That controls the use of TXT records for figuring out the realm of a
host; normally disabled.
> dns_lookup_kdc = false
That controls the use of SRV records for finding the KDCs given a
realm name; normally enabled, but having the data in krb5.conf should
prevent it from getting used for the regular KDC lookup. Lookup of
_kerberos-master._udp.REALM.NAME (and likewise for _tcp) would
probably still happen unless you use this option, or specify
"master_kdc=..." in krb5.conf.
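As a sketch of what that could look like in krb5.conf (the realm and host names are placeholders, and either of the two settings on its own should be enough to avoid the master-KDC SRV lookup described above):

```
[libdefaults]
    # Turn off SRV lookups for KDCs.
    dns_lookup_kdc = false

[realms]
    EXAMPLE.COM = {
        kdc = kdc1.example.com.
        kdc = kdc2.example.com.
        # Name the master explicitly so it need not be found via DNS SRV.
        master_kdc = kdc1.example.com.
    }
```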
> dns_fallback = false
This (poorly-named?) option controls both of the above explicit uses
of DNS, letting you switch both on or off with one option. It
doesn't control the implicit uses of DNS via getaddrinfo or
gethostbyname.
> I've also tried 1.4.3 compiled WITHOUT --disable-dns-for-realm and
> 1.4.4 compiled WITH --disable-dns-for-realm, but that didn't make a
> difference either.
That's the build-time (configure) option corresponding to the
dns_lookup_realm control above, i.e., the one for TXT queries.
> PS. The reason I'm concerned about this is because I need to build a
> new krb5-1.4.4 package to be distributed to all our developers that
> contains the new krb5.conf file. I don't want to get a bunch of users
> telling me how slow kinit has become.
Perfectly understandable....
Ken