KDC HA Failure with krb5-1.9.1 and pam-krb5 4.4

Greg Hudson ghudson at MIT.EDU
Fri Nov 18 17:13:16 EST 2011


On 11/18/2011 02:17 PM, Tom Parker wrote:
> The problem I have is that if I update my client from 1.8.3 to 1.9.1 my 
> High Availability breaks.  A 1.9.1 client will not successfully 
> authenticate if one of my KDCs is down.  My 1.8.3 clients work fine.

Unfortunately, I'm unable to reproduce this.  I created two SRV records,
one pointing to a test server and one pointing to an invalid port, and
1.9.x clients were able to make successful AS and TGS requests to that
zone regardless of whether they tried the incorrect or correct port first.
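
For anyone wanting to set up a similar test, the records would look
roughly like the sketch below.  The realm and host names are
hypothetical placeholders, not the ones I used; only the two port
numbers match the first trace further down.  Equal priority and weight
let the client try either entry first.

```
; Hypothetical zone fragment: two SRV records for one realm.
; Fields after SRV are: priority weight port target
_kerberos._udp.EXAMPLE.TEST.  IN SRV  0 0 50799  kdc.example.test.  ; nothing listening
_kerberos._udp.EXAMPLE.TEST.  IN SRV  0 0 50800  kdc.example.test.  ; test KDC
```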

There weren't very many changes to the KDC location or communication
code between 1.8 and 1.9 (although there were a few mostly innocuous
ones) and there haven't been any since the 1.9 branch.

1.9 supports a new tracing facility which might provide a little
additional information, although it's not possible to use it in concert
with pam_krb5.  But you can use it with kinit or kvno, e.g.:

  KRB5_TRACE=/dev/stdout kinit tparker@LS.CBN
  KRB5_TRACE=/dev/stdout kvno host/arudrdb.ls.cbn@LS.CBN

When I do this in my test scenario, I see stuff like:

[6252] 1321653654.244586: Sending initial UDP request to dgram 18.18.1.59:50799
[6252] 1321653654.244628: UDP error receiving from dgram 18.18.1.59:50799: 111/Connection refused
[6252] 1321653654.244641: Sending initial UDP request to dgram 18.18.1.59:50800
[6252] 1321653654.247219: Received answer from dgram 18.18.1.59:50800

Or, if the wrong entry goes to a black hole, I see (note the one-second
delay between the first and second entries):

[6652] 1321654279.235052: Sending initial UDP request to dgram 18.70.0.200:50799
[6652] 1321654280.236097: Sending initial UDP request to dgram 127.0.1.1:50800
[6652] 1321654280.236639: Received answer from dgram 127.0.1.1:50800
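
If the trace gets long, the gaps between entries can be computed rather
than eyeballed.  The following is only a rough sketch, not part of krb5:
trace_delays is a hypothetical helper name, and it assumes each entry
starts with "[pid] seconds.microseconds: " (six fractional digits) as in
the output above.  Wrapped continuation lines and any other output are
ignored.

```shell
# trace_delays: read KRB5_TRACE output on stdin and prefix each entry
# with the seconds elapsed since the previous entry (0 for the first).
# Hypothetical helper; assumes "[pid] sec.usec: message" entries.
trace_delays() {
    awk '
        /^\[[0-9]+\] [0-9]+\.[0-9]+: / {
            # Split "sec.usec:" into whole seconds and microseconds so
            # the subtraction is done on exact integers.
            split($2, t, /[.:]/)
            if (seen)
                delta = (t[1] - psec) * 1000000 + (t[2] - pusec)
            else
                delta = 0
            # Strip the "[pid] timestamp: " prefix to keep the message.
            msg = $0
            sub(/^\[[0-9]+\] [0-9]+\.[0-9]+: /, "", msg)
            printf "%.6f  %s\n", delta / 1000000, msg
            psec = t[1]; pusec = t[2]; seen = 1
        }
    '
}
```

Piping the kvno command above into trace_delays, or saving the trace to
a file and redirecting it in, should make a retransmit timeout stand out
as a gap of about one second in the first column.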



