Kerberos transport DNS record design

Fri May 27 14:08:24 EDT 2016

None of the backends guarantees that all slaves are updated by the time kadm5_create_principal() returns; in fact it is very likely (though not guaranteed) that *no* slave has been updated at that point, regardless of backend. LDAP and Heimdal's iprop would be the fastest to catch up afterward, depending on configuration, and LDAP can most probably be tuned to minimize the delay: unlike the KDC-based replication mechanisms, LDAP does not rely on relatively long delays (measured in minutes rather than seconds) to validate lock-free update semantics.

Otherwise, you do make good points about the continued utility of the fallback in edge cases.

-----Original Message-----
From: Roland C. Dowdeswell [mailto:elric at imrryr.org] 
Sent: Thursday, May 26, 2016 10:59 PM
To: Brandon Allbery <ballbery at sinenomine.net>
Cc: Greg Hudson <ghudson at mit.edu>; Nathaniel McCallum <npmccallum at redhat.com>; krbdev at mit.edu
Subject: Re: Kerberos transport DNS record design

On Thu, May 26, 2016 at 05:25:19PM +0000, Brandon Allbery wrote:
>

> - For either one, using LDAP backend with its replication facilities 
> makes fallback unnecessary.

Does the LDAP backend promise that all of the slaves are up to date before kadm5_create_principal() returns successfully?  If not, there will still be a race condition.  We are starting to see our developers deploy new services on demand and at large scale, using VMs and containers, and they expect to be able to hit those services the moment the master KDC reports that the principal has been created.

I also quite like the master failover because it provides additional resiliency against things like: a slave's clock going out of skew; an administrator truncating the Kerberos DB on a slave; a slave running out of disk space and somehow rubbishing the KDB, causing the KDC to respond inappropriately; or an administrator of DNS (often people in a different group in the organisation) messing up one of the SRV RRs so that it points at a different realm's KDC.  I suppose I don't think it makes sense for the client to simply give up at the first sign of errors when there is a decent chance that one of the other KDCs may very well have a better answer.  Remember that in many environments almost all requests to the KDCs should succeed.  After all, if I'm making a TGS_REQ for a service, then in general I've connected to the service in question and it has asked me to use GSSAPI/Kerberos authentication (at least in the case of protocols like HTTP/Negotiate, SSH, and SASL, where GSSAPI auth is negotiated).
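
To make the race and the fallback concrete, here is a minimal simulation in Python.  It is only a sketch under stated assumptions, not how any Kerberos library implements this: SimulatedKDC, PrincipalUnknown and request_with_master_fallback are hypothetical names invented for illustration, and the "ticket" is just a string.  It models a principal that exists on the master but has not yet propagated to a slave.

```python
from dataclasses import dataclass, field


class PrincipalUnknown(Exception):
    """Raised by a simulated KDC that has no entry for the principal."""


@dataclass
class SimulatedKDC:
    # Hypothetical stand-in for a KDC: just a name and a set of principals.
    name: str
    principals: set = field(default_factory=set)

    def get_ticket(self, principal: str) -> str:
        if principal not in self.principals:
            raise PrincipalUnknown(f"{self.name}: {principal} not found")
        return f"ticket for {principal} from {self.name}"


def request_with_master_fallback(principal, slaves, master):
    """Ask each slave in turn; if none can answer, retry once on the master."""
    for kdc in slaves:
        try:
            return kdc.get_ticket(principal)
        except PrincipalUnknown:
            # The slave may simply be stale, so do not give up yet.
            continue
    # Last resort: the master has the authoritative copy of the database.
    # If the master does not know the principal either, the error propagates.
    return master.get_ticket(principal)


# The principal was just created on the master; the slave is still stale.
master = SimulatedKDC("master", {"host/new-vm.example.com"})
slave = SimulatedKDC("slave")
ticket = request_with_master_fallback("host/new-vm.example.com", [slave], master)
```

Without the final retry against the master, the request above would fail until propagation completed, which is exactly the window I am worried about.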

--
    Roland Dowdeswell                      http://Imrryr.ORG/~elric/