Ticket 5338: Race conditions in key rotation

Roland Dowdeswell elric at imrryr.org
Thu Jun 19 01:59:45 EDT 2008


On 1213824185 seconds since the Beginning of the UNIX epoch
Ken Raeburn wrote:
>

>On Jun 18, 2008, at 17:01, Jeffrey Altman wrote:
>> In order to address this race condition I propose that krb5_send_tgs()
>> be modified to retry against the master KDC whenever there is an
>> error and the error is not one of KRB5_KDC_UNREACH or  
>> KRB5_REALM_CANT_RESOLVE.
>
>What's the actual error that would be returned in such a case?   
>Wouldn't it make sense to do the retry only for one specific error?
>
>A better approach may be to finish up and implement a design that's  
>been discussed before, which is to allow creation of a key in the  
>database without enabling it for use, and then separately enabling it  
>later.  This would also be desirable for services like AFS where you  
>need to distribute a new key to multiple servers before you can start  
>using it.  (This may also relate, at least a bit, to a project Will  
>Fiveash is helping us with, on-the-fly master key rollover.  There are  
>enough special things about the master key that it may not be all that  
>relevant though.)  I think there's been some public discussion in  
>archives somewhere, maybe in the context of set/change pw protocols.   
>I don't recall if server-side (keytab) changes would be needed as well.

This is going to be a bit of a long e-mail, so let me apologise
in advance.  I am going to structure my thoughts into a few
sections, as there are a few rather distinct topics that are all
intertwined.  Failing back to the master on failures is not simply
about key rotation.  I have been thinking about implementing the
failover to the master at work, but I have been holding off because
I would rather see it implemented in the main release than maintain
[yet another] independent patch.

It's a little late, so this e-mail may be a bit conversational.

Failing over to the Master on all Failures
------------------------------------------

Failing over to the master on failures is an easy and seamless way
to provide a consistent view of a completely up-to-date Kerberos
database whenever the master is reachable.  This is not merely an
issue of solving a race here or there.  It is about designing a
system that, with as little state as possible, presents a view to
its client libraries in which changes take effect immediately
[almost all the time], without requiring the master KDC to block
changes until each of the slaves has accepted them.

There is one case in which this is currently done: password changes.
It works.
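
To make the shape of this concrete, here is a minimal sketch in
Python (not the actual libkrb5 code, which is C); KdcError,
kdc_request() and the error name below are hypothetical stand-ins
for the real transport code and error tables:

    class KdcError(Exception):
        def __init__(self, code):
            super().__init__(code)
            self.code = code

    # Errors where a slave's answer is as final as the master's
    # would be.  ERR_NOT_RENEWABLE is a placeholder, not a real
    # krb5 error code.
    FINAL_ON_SLAVE = frozenset({"ERR_NOT_RENEWABLE"})

    def send_request(request, slaves, master, kdc_request):
        # Try each slave; on any unforeseen error, fall through
        # and give the master a crack at it.
        for kdc in slaves:
            try:
                return kdc_request(kdc, request)
            except KdcError as err:
                if err.code in FINAL_ON_SLAVE:
                    raise       # the slave's answer stands
        # The master has the canonical database, so its answer,
        # error or not, is final.
        return kdc_request(master, request)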

I proposed this in RT 5338 to solve the TGS key rotation race.  It
solves that race simply.

But there are other changes made to the Kerberos database, such as
the addition of service principals, which would benefit from
failing back to the master in every case.  If we implement this,
then the addition of service keys will appear to be instant instead
of suffering through propagation delays.

How about expired passwords?  IIRC, I have in the past seen
authentication fail right after a password change because the
change had not yet propagated to the slave that I queried.

In fact, I would go so far as to say that I would not need
incremental propagation if this feature were implemented.  It would
probably be sufficient to sync the KDCs every 5 or 10 minutes or
so, given that important changes would appear to be instant.

Or clock skew errors?  Why should a runaway clock on a single slave
cause potentially plant-wide semi-random outages if the rest of
the KDCs are properly in sync?

What happens when a slave gets a little corrupted? How about some
level of DB corruption?  Truncated database?  Configuration errors?
What if a misconfiguration causes the master to fail to propagate
to the slave for some period of time? What if kpropd crashes?  What
if we put a slave into krb5.conf before we bring it live?  What if
a KDC from the wrong realm gets mis-set in our configuration?  Etc.

Failures on slaves should not be considered permanent errors, given
that we know up front that there is a more canonical source of
truth just a few lines away in the configuration file...

Given that errors on the KDC are actually quite rare in comparison
to valid requests, I think that trying the master will not
significantly or noticeably increase the load on the master.  And
I do not think that we should try to enumerate the kinds of errors
that will cause the clients to fail back to the master.  If
anything, we should fail back by default and enumerate only those
errors where a slave's answer really is final, such as trying to
renew a non-renewable ticket.  Unforeseen errors, though, should
default to giving the master a crack at it.  In fact, the concept
of failing back to the master on incorrect passwords was a great
idea.  The main failure was that only that case was implemented;
that is, the default was to consider slave errors permanent rather
than to fail back to the most up-to-date DB on any error.
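
The inversion can be put in a couple of lines of, say, Python;
both error names here are placeholders rather than real krb5
error codes:

    # What was implemented: retry the master only for one
    # enumerated error, the incorrect-password case.
    def should_retry_master_old(code):
        return code in {"ERR_PREAUTH_FAILED"}

    # What I am arguing for: retry by default, and enumerate only
    # the errors that really are final on a slave.
    def should_retry_master_new(code):
        return code not in {"ERR_NOT_RENEWABLE"}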

TGS Key Rotation
----------------

So, the rotation of the TGS key is the specific case within key
rotation where clients failing back to the master solves the issue.

Although you could solve the problem by setting the key and then
at some later date enabling it once you know that every slave has
a new copy of the key, this is more complicated and less robust
than simply failing back to the master on errors.

E.g. let's say that I've got 20 slaves.  What does the process
actually look like?  Well, let's see.  I provision the new TGS key.
Then I confirm that each and every one of the 20 slaves has the
new key.  What if one of them is down?  Do I wait?  Do I continue?
What I would have to do is suspend my automated TGS key rotation
scheme and wait for human intervention.  Unless, that is, I build
in more complicated logic where a slave that reboots must somehow
determine that it is out of sync and refuse to start krb5kdc.  This
additional logic then adds its own issues, which I need to grok in
order to feel comfortable that I am providing a KDC infrastructure
that delivers the appropriate SLA to the plant.  Honestly, I'd
rather live with the current short race than start down this path.

The KDC plant must be simple enough that if I am woken up at 4.15am
with a production outage I can simply and efficiently reason about
how it works before I get an espresso.

``Normal'' Service Key Rotation
-------------------------------

Service key rotation is not related to failing back to the master
on all Kerberos failures.

For service key rotation, the answer is not to modify the KDC but
rather to change the behaviour of the key rotation to be as follows
(a rough sketch in code appears after the list):

	1.  client queries master KDC to determine the current
	    highest kvno, K,

	2.  client chooses a new key,

	3.  client installs the new key in all relevant keytabs as
	    kvno K+1,

	4.  client tells master KDC to update the principal to the
	    new key.
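
In, say, Python, that ordering might look like the sketch below;
get_highest_kvno(), add_keytab_entry() and set_principal_key()
are hypothetical stand-ins for whatever kadmin and keytab
interfaces one actually has:

    import secrets

    def get_highest_kvno(principal):
        raise NotImplementedError   # hypothetical kadmin query

    def add_keytab_entry(host, principal, kvno, key):
        raise NotImplementedError   # hypothetical keytab install

    def set_principal_key(principal, kvno, key):
        raise NotImplementedError   # hypothetical kadmin update

    def rotate_service_key(principal, keytab_hosts):
        # 1. ask the master KDC, not a slave, for the highest kvno
        kvno = get_highest_kvno(principal)

        # 2. the client, not the KDC, chooses the new key
        new_key = secrets.token_bytes(32)

        # 3. install the key everywhere first, as kvno K+1; the
        #    KDC is still issuing tickets against the old key,
        #    which every keytab also still holds, so nothing
        #    breaks part way through
        for host in keytab_hosts:
            add_keytab_entry(host, principal, kvno + 1, new_key)

        # 4. only now tell the master KDC to switch to the new key
        set_principal_key(principal, kvno + 1, new_key)

If step 3 dies part way through, nothing has changed from the
KDC's point of view and the rotation can simply be re-run; all of
the state lives on the client.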

I don't see what real benefit there is in introducing the state of
such a transaction to the KDC.  It's simply a matter of getting
the ordering correct.  If you have keys on multiple hosts, step 0
will be for the hosts to chat with each other and co-ordinate the
change.  I am going to be putting this together at some point this
year at work.

Basically, it is just a lot simpler if you don't require that the
KDC choose the keys.  Let the kadmin client do it and the problem
becomes much easier to solve.  The key to designing robust systems
is playing keep-away with state.  Keep the state on the client and
the problem is straightforward.

Incremental Propagation
-----------------------

I've been considering just using rsync instead.  Obviously, one
would need to lock the DB, copy it to a tmp file and then rsync it
into its final location on the slaves.  It seems a lot simpler than
maintaining our own code to do it and is probably optimal `enough'.
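
A rough sketch of that, assuming an MIT-style setup; the hosts
and paths are illustrative, and kdb5_util dump stands in for the
``lock the DB, copy it to a tmp file'' step since it writes a
consistent snapshot rather than copying the live database:

    import subprocess

    SLAVES = ["kdc1.example.net", "kdc2.example.net"]
    DUMP = "/var/krb5kdc/slave_datatrans"

    def propagate_once():
        # dump a consistent snapshot of the live database
        subprocess.run(["kdb5_util", "dump", DUMP], check=True)
        for slave in SLAVES:
            # rsync writes to a temporary file and renames it into
            # place, so the slave never sees a partial dump
            subprocess.run(["rsync", "-z", DUMP, f"{slave}:{DUMP}"],
                           check=True)
            # load the dump into the slave's database
            subprocess.run(["ssh", slave, "kdb5_util", "load", DUMP],
                           check=True)

Run that from cron every 5 or 10 minutes and, combined with the
client-side failover above, it would cover much the same ground as
incremental propagation.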

--
    Roland Dowdeswell                      http://www.Imrryr.ORG/~elric/


