krb5-1.12 - New iprop features - tree propagation & slave notify
Richard.Basch at gs.com
Tue Jan 21 08:06:20 EST 2014
I think I just figured out the issue... My code does not queue the client if the server is responding with UPDATE_BUSY. Originally, I thought there was no code path which still returned UPDATE_BUSY, but it appears the client gets dropped when this condition is hit under high load.
I will work on a patch revision to account for this condition. This should resolve the situation without any other changes being required. My logs indicate that UPDATE_BUSY is still returned (though I need to re-examine the code base to see how, since I did not immediately see a code path that returns it in 1.11+ after Nico's patches were adopted).
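To illustrate the condition described above, here is a toy model in Python (not krb5 code). It assumes one reading of the problem: the server keeps a list of clients to notify, and a client that receives UPDATE_BUSY should stay on that list. The `NotifyRegistry` class, its method and the `server_busy` flag are hypothetical names; only the UPDATE_BUSY status name comes from the thread.

```python
# Toy model only: status names are plain strings, not the real iprop types.
UPDATE_OK = "UPDATE_OK"
UPDATE_BUSY = "UPDATE_BUSY"


class NotifyRegistry:
    """Hypothetical registry of clients that should receive notifications."""

    def __init__(self):
        self.clients = set()

    def handle_get_updates(self, client_id, server_busy):
        # Register the client before deciding on the reply, so that a
        # busy reply does not cause the client to be forgotten.
        self.clients.add(client_id)
        if server_busy:
            return UPDATE_BUSY
        return UPDATE_OK


registry = NotifyRegistry()
status = registry.handle_get_updates("slave1.example.com", server_busy=True)
print(status, sorted(registry.clients))
```

The point of the sketch is only the ordering: registration happens regardless of which status is returned.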
From: Nico Williams [mailto:nico at cryptonector.com]
Sent: Wednesday, January 15, 2014 2:55 PM
To: Basch, Richard [Tech]
Cc: basch at alum.mit.edu; krbdev at mit.edu; ghudson at mit.edu; tlyu at mit.edu
Subject: Re: krb5-1.12 - New iprop features - tree propagation & slave notify
On Wed, Jan 15, 2014 at 6:19 AM, Basch, Richard <Richard.Basch at gs.com> wrote:
> I just figured I would post an update since I managed to see slaves lag behind on updates under high load (and then wait for the iprop interval). Currently, I post the initial notification, but it appears high-change environments still need more.
> My ideal would be to leave the notification async, but if the notification fails, implement a parameterized retry count before dropping the client.
It's only a ping, right? So keep track of whether a slave was heard from since pinging it and ping those not heard from again up to N times more. (Also, are you using RPC functions for the ping? Are you using the async interfaces?)
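A minimal sketch of that bookkeeping, in Python rather than the krb5 C code base: ping every slave once, then re-ping only those not heard from, up to N more times. The function name and the `send_ping` / `heard_since_ping` callbacks are hypothetical stand-ins for whatever transport and tracking the real patch uses; a real implementation would also wait between sending a ping and checking for a response.

```python
def ping_until_heard(slaves, send_ping, heard_since_ping, max_extra_pings):
    """Ping all slaves, then re-ping the silent ones up to max_extra_pings times.

    Returns (unheard, pings_sent): the slaves still silent after all
    attempts, and how many pings each slave was sent.
    """
    pending = list(slaves)
    pings_sent = {}
    for _ in range(1 + max_extra_pings):
        if not pending:
            break
        for slave in pending:
            send_ping(slave)
            pings_sent[slave] = pings_sent.get(slave, 0) + 1
        # Keep only the slaves that have not been heard from since the ping.
        pending = [s for s in pending if not heard_since_ping(s)]
    return pending, pings_sent
```

Slaves left in the first return value after the last attempt are the candidates for being dropped.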
If there's a max latency tolerance you have to meet, this isn't going to do it. You'll need something with closer to synchronous semantics (e.g., go check all the slaves when you need to know if an update has replicated) or actual sync semantics (the kadm5 operation does not complete till all the slaves are updated).
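The "go check all the slaves" variant could look roughly like the Python sketch below. It assumes each slave can report the serial number of the last update it applied and that serial numbers only increase; `get_slave_snos` is a hypothetical callback, not an existing krb5 interface.

```python
import time


def wait_for_replication(target_sno, get_slave_snos, timeout=5.0,
                         interval=0.1, sleep=time.sleep):
    """Poll until every slave has reached target_sno or the timeout expires.

    get_slave_snos() must return a dict mapping slave name to the last
    serial number that slave has applied. Returns the sorted list of
    slaves still lagging (empty if all have caught up).
    """
    deadline = time.monotonic() + timeout
    while True:
        lagging = sorted(name for name, sno in get_slave_snos().items()
                         if sno < target_sno)
        if not lagging or time.monotonic() >= deadline:
            return lagging
        sleep(interval)
```

A caller needing actual sync semantics would block the operation on this check and treat a non-empty result as a failure; how to handle that failure is a policy choice the thread leaves open.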
> Personally, I am probably going to increase my ulog size since the 2500 restriction is gone (the kdc.conf man page was not updated to reflect this).
Of course :) Make sure you're running 64-bit libkadm5srv consumers *only*.
Also, IMO the ulog design needs to be replaced. It cannot handle very large entries, for example.