Ticket 5338: Race conditions in key rotation
Jeffrey Altman
jaltman at secure-endpoints.com
Wed Jun 25 13:06:53 EDT 2008
Jeffrey Hutzelman wrote:
>> 1. the client libraries versions are in sync with
>> the KDC version, or
>>
>> 2. that this decision should be made by the KDC
>> infrastructure.
>>
>> So, as a client here is your flow:
>>
>> 1. you got a TGT with kvno 7 from a slave,
>>
>> 2. there was mutual auth involved---so you know that you
>> got it from a _real_ slave,
>>
>> 3. you present it to another slave and it does not work,
>>
>> 4. now, you _know_ that there exists a slave for which this
>> request would actually work,
>
>
> No, you don't. The service principal may not exist. It may be disabled.
> There may be no mutually-supported enctype. _Your_ principal may be
> expired. Or any of a number of other things might be wrong, which the KDC
> may tell you about with an appropriate error code.
Slave KDC 1 from which I received a TGT with kvno N certainly knows the
key associated with kvno N.
If I went back to Slave KDC 1 and presented the TGT as part of a request,
it is not going to come back and tell me KRB5_GENERIC "see error text",
leaving me to then find "key version number not found" in that text.
> "No, you are not allowed to have a ticket" means you are not allowed to
> have a ticket, not "go ask mommy if she will give you one".
The worst part about this situation is that the client really has no
idea why the ticket wasn't issued. The client is not getting an error
that explains the reason at all. It is getting a KRB5_GENERIC error.
> The issue here is that you are proposing that the answer to a set of KDC's
> not presenting a consistent view of the realm is not to fix the KDC's, but
> to require clients to query multiple KDC's and compare their responses,
> which is not what the protocol specification says.
The protocol specification doesn't say anything about the subject of
master/slave, primary/secondary, or multi-master distributed KDC
implementations. It doesn't say that clients MUST, SHOULD, SHOULD NOT,
or MUST NOT contact the Master/Primary KDC after a client request fails.
The only reference to "fatal" errors is in conjunction with the TCP
support, which states that a client MUST NOT treat a connection break
as fatal. The only references to "retry" are related to pre-auth
methods that the client SHOULD retry and KRB_ERR_RESPONSE_TOO_BIG, which
is a hint to the client to retry but doesn't say so explicitly.
In fact, section 3.1.6, in discussing what a client should do in response
to a KRB_ERROR message, states that the client interprets it as an
error and performs "whatever application-specific tasks are necessary
for recovery." Perhaps you are interpreting this to mean that the krb5
library must pass the error to the calling application and it is the
application's responsibility to decide whether or not to retry.
However, that is not how I interpret this. The library is part of the
application and its job is to handle Kerberos errors internally so that
the application can succeed at the task it is attempting to accomplish.
If that means retrying, so be it.
The KDC deployment architecture selected by an implementation is an
implementation-specific detail that is independent of the protocol
specification. Inconsistencies in the deployment of the
implementation-specific database are addressed in an
implementation-specific manner.
The MIT clients already do contact the Master/Primary KDC if one is
defined for the realm when the client's AS request fails for any reason.
This is done explicitly because Slave/Secondary KDCs might not have an
up-to-date view of the world.
The Master/Primary KDC is specified either via the "master_kdc"
entry in the realm section of the krb5.conf (the profile) or in DNS SRV
records.
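
For illustration, a minimal sketch of such a profile entry. The realm
name and host names are hypothetical placeholders, not taken from any
real deployment. The DNS alternative is an SRV record named
_kerberos-master._udp.<REALM> pointing at the same host.

```ini
[realms]
    EXAMPLE.COM = {
        # Slave/Secondary KDCs (hypothetical host names)
        kdc = kdc1.example.com
        kdc = kdc2.example.com
        # KDC the client falls back to when a request fails
        master_kdc = kdc-master.example.com
    }
```
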
Nico has commented that failover is OK for AS requests but not for TGS
requests, because the volume of AS requests is lower than that of TGS
requests. In a properly configured realm, the number of failures is
going to be small compared to the overall volume of requests to the
slaves. This assumption has been verified in real-world production
realms. If you have data that disputes this, please provide it.
The problem here is that no matter what you say a KDC-based solution
should be, there is no transactional mechanism that you can put in place
on an error-prone network that will result in all copies of the database
being consistent all of the time. By adding additional transactional
complexity you can reduce the opportunities for inconsistency, but you do
so at the cost of the increase in complexity and the exposure to failure
that the complexity introduces.
I certainly do not believe that as a practical matter a client should
ask all of the known KDCs in turn for an answer if it doesn't get a
successful response. However, turning to the known authoritative
database to answer the question is perfectly reasonable when it is known
that such an authoritative database exists.
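
The policy described above can be sketched roughly as follows. This is
only an illustration under stated assumptions, not the MIT library's
code: KdcError, request_with_master_fallback and the send callback are
hypothetical names, and the KDC names in the usage note are placeholders.

```python
from typing import Callable, Optional, Sequence


class KdcError(Exception):
    """Hypothetical stand-in for a KRB_ERROR reply from a KDC."""


def request_with_master_fallback(
    request: bytes,
    replicas: Sequence[str],
    master: Optional[str],
    send: Callable[[str, bytes], bytes],
) -> bytes:
    """Send to a replica; if it answers with an error, retry once at the master.

    `send` is an assumed transport callback that returns the reply bytes,
    raises ConnectionError when the KDC is unreachable, and raises
    KdcError when the KDC answers with an error.
    """
    answered_by: Optional[str] = None
    error: Optional[KdcError] = None
    for kdc in replicas:
        try:
            return send(kdc, request)
        except ConnectionError:
            continue  # unreachable replica: move on to the next one
        except KdcError as exc:
            # A replica answered with an error: do not poll every other
            # replica, go straight to the authoritative database.
            answered_by, error = kdc, exc
            break
    if master is not None and answered_by != master:
        # The master's answer, success or error, is treated as final.
        return send(master, request)
    if error is not None:
        raise error
    raise ConnectionError("no KDC reachable")
```

With replicas ["slave1", "slave2"] and master "master", an error from
slave1 leads to exactly one further request, to the master; slave2 is
never asked. When no master is defined, the replica's error is returned
to the caller unchanged.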
Jeffrey Altman