krb5.conf implementation question
ghudson at mit.edu
Fri Apr 10 22:10:55 EDT 2020
On 4/10/20 5:40 PM, O'Loughlin, Kieran wrote:
> I'm not developing an application using the MIT Kerberos libraries, but am implementing a third-party application that uses the libraries. So, I hope it's ok to be emailing this list.
kerberos at mit.edu would be more appropriate, but it's not a big deal.
> The problem happens when the first KDC in the list is rebooted. There is a short window where the KDC will respond with KDC_ERR_C_PRINCIPAL_UNKNOWN or KDC_ERR_S_PRINCIPAL_UNKNOWN (probably depending on type of request) as it is shutting down. Microsoft has an article about this type of behavior for 2008 SP2 https://support.microsoft.com/en-us/help/982801/a-domain-controller-returns-the-no-such-user-0xc0000064-status-code-or, but we're seeing the same symptoms with 2008 R2, 2012 R2 and 2019.
That's interesting; I haven't heard of this particular issue with
Microsoft's KDCs before, and it sounds like Microsoft thinks they fixed
the problem ten years ago. Unfortunately I don't think you'll be able
to resolve the issue on the MIT client side without code changes.
> When we enable KRB5_TRACE we see up to 4 request attempts being made. I don't know if the multiple tries are initiated by the MIT code or by the application code.
Based on the trace logs, I think the second try is initiated by the MIT
code, and the other pair is the result of a second attempt to get
credentials by the application.
The second try in the MIT code is a fallback in case the KDC doesn't
> * If the MIT code is making the 4 request attempts, is there any way (krb5.conf configuration, env variables, etc.) that we could force each retry to use a different KDC entry in the krb5.conf? This would move the retries away from the server that is rebooting and the first of those should get a good response.
We only walk down the KDC list when the first KDC fails to respond.
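To illustrate why an error reply from a rebooting KDC is not retried elsewhere, here is a minimal sketch of the behavior described above. It is a simplified model, not the library's code: the function names, the `send` callback, and the example hostnames are all hypothetical, and it ignores timeouts, transports, and any special-cased error codes.

```python
from typing import Callable, Optional, Sequence

# Hypothetical stand-in for one network exchange: returns the KDC's reply
# (a success or a KDC_ERR_* error), or None if the KDC never answered.
SendFn = Callable[[str], Optional[str]]


def send_to_kdcs(kdcs: Sequence[str], send: SendFn) -> Optional[str]:
    """Return the first reply received, moving down the list only on silence."""
    for kdc in kdcs:
        reply = send(kdc)
        if reply is not None:
            # Any reply ends the walk, even an error such as
            # KDC_ERR_S_PRINCIPAL_UNKNOWN from a KDC that is shutting down.
            return reply
    return None


def fake_send(kdc: str) -> Optional[str]:
    """Hypothetical KDCs: kdc0 is down, kdc1 is shutting down, kdc2 is healthy."""
    replies = {
        "kdc1.example.com": "KDC_ERR_S_PRINCIPAL_UNKNOWN",
        "kdc2.example.com": "TGS_REP",
    }
    return replies.get(kdc)  # None models a KDC that does not respond


if __name__ == "__main__":
    print(send_to_kdcs(["kdc1.example.com", "kdc2.example.com"], fake_send))
    print(send_to_kdcs(["kdc0.example.com", "kdc2.example.com"], fake_send))
```

In this model a silent first KDC leads to the second one being tried, while a first KDC that answers with an error yields that error to the caller without contacting the second.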
> * As mentioned above we are listing out the individual KDC machines, would it be better to set up the krb5.conf in a different way, perhaps using DNS to find the KDCs?
I don't think that would help reliably. Randomization of DNS response
order could make the right thing happen some of the time, but not all of
> * I saw a mention in one email about setting master_kdc, that suggested if there is an error a subsequent request might be sent to the master_kdc. However the documentation says this only happens on an invalid password. Is that the case or is it worth setting master_kdc? We don't do that currently.
master_kdc currently only applies to AS requests, and this is a TGS
request, so that wouldn't help. Also, it obviously wouldn't help if the
KDC listed in master_kdc was the one shutting down.