krb5.conf implementation question
koloughlin at informatica.com
Mon Apr 13 10:02:59 EDT 2020
That's helpful, thanks for taking the time to get back to me.
> Based on the trace logs, I think the second try is initiated by the MIT code, and the other pair is the result of a second attempt to get credentials by the application.
This is really interesting. I have a support ticket open with the software vendor. Knowing that they are retrying on their side is helpful; I can encourage them to add a delay before the retry and/or to add more retries. We've seen AD respond like this for a couple of seconds at a time. If they delay their retry, that would give the AD server that is shutting down time to complete its work and shut down, and the retry would then hit the next KDC in the list. Of course, this wouldn't be foolproof, but if we could reduce the frequency of failures that would be great. Currently we get failures each time the KDC at the top of our list gets rebooted.
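To make the suggestion concrete, here is a rough sketch of the kind of delayed retry I would ask the vendor to consider. It is only an illustration, not their code: `acquire_with_retry`, `try_kdc_list`, and the default values are hypothetical names and numbers of mine, and I'm assuming a failed attempt surfaces as an `OSError`.

```python
import time


def acquire_with_retry(try_kdc_list, attempts=3, delay_seconds=5.0, sleep=time.sleep):
    """Call try_kdc_list() up to `attempts` times, pausing between failures.

    try_kdc_list is a hypothetical callable standing in for the application's
    credential request; it is assumed to raise OSError when the request fails.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return try_kdc_list()
        except OSError as exc:
            last_error = exc
            # Pause so a KDC that is shutting down has time to go away
            # before the next attempt walks the KDC list again.
            if attempt < attempts:
                sleep(delay_seconds)
    raise last_error
```

The delay only needs to be longer than the couple of seconds during which the AD server misbehaves; the value above is a placeholder.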
> I don't think that would help reliably. Randomization of DNS response order could make the right thing happen some of the time, but not all of the time.
I understand what you're saying. Again, even if it's not a complete solution, anything that reduces the failure frequency is helpful to us. I was thinking we could keep our existing krb5.conf KDC configuration, but add the realm name at the top as the first KDC entry. That would cause the requests to go to different AD servers each time. The client trace indicates that the client resolves the machine name before each of the 4 requests. In the customer's DNS configuration, resolving their Kerberos realm name returns the IP addresses of 75 AD servers. From what I can tell, each time a request is made to the DNS server for the Kerberos realm name, it returns the 75-address list with the top IP address moved to the bottom, probably as a load-balancing mechanism. If 4 DNS requests are made for our initial TGS-REQ and the 3 retries, then the odds are very good that most of the time those 4 requests will go to 4 different KDCs.
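As a quick check of that reasoning, here is a small simulation of the rotation I think I'm seeing. It assumes the DNS server moves the top address to the bottom on every query and that nothing on the client caches the answer; the `kdcNN` names and the function `first_kdc_per_lookup` are placeholders of mine, not anything from the customer's environment.

```python
from collections import deque


def first_kdc_per_lookup(addresses, lookups):
    """Return the address at the top of the list for each successive lookup,
    assuming the server rotates the list by one entry per query."""
    ring = deque(addresses)
    chosen = []
    for _ in range(lookups):
        chosen.append(ring[0])
        ring.rotate(-1)  # move the top address to the bottom
    return chosen


addresses = [f"kdc{i:02d}" for i in range(75)]  # placeholder names
print(first_kdc_per_lookup(addresses, 4))
# prints ['kdc00', 'kdc01', 'kdc02', 'kdc03']
```

Queries from other clients would advance the rotation in between ours, so the 4 answers would not be consecutive in practice. Under the same assumption they would still differ, unless the rotation happened to wrap all the way around the list between two of our lookups.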
Any feedback you might have on these ideas would be greatly appreciated.
More information about the krbdev