krb5-1.12 - New iprop features - tree propagation & slave notify
Richard.Basch at gs.com
Wed Jan 15 15:35:55 EST 2014
Actually, in thinking about it, given that I am doing a fork/exec of kprop to signal the slave using the kprop dump, but sending an error code, the logic should be in kprop to retry based on error conditions, then return. That way, kadmind/kpropd remain wholly async in their notification.
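To make the retry idea concrete, here is a minimal sketch of what "retry based on error conditions, then return" could look like. It is not kprop code: `notify_with_retry`, the `send` callback, and the `RETRYABLE` set are hypothetical names made up for illustration, and which errors actually deserve a retry is an assumption.

```python
import time

# Hypothetical classification of transient failures; not kprop's real error codes.
RETRYABLE = {"ECONNREFUSED", "ETIMEDOUT"}

def notify_with_retry(send, max_tries=3, delay=0.0, sleep=time.sleep):
    """Call send() until it succeeds, fails permanently, or tries run out.

    send() returns None on success or an error-name string on failure.
    Returns (succeeded, attempts_made).
    """
    attempt = 0
    for attempt in range(1, max_tries + 1):
        err = send()
        if err is None:
            return True, attempt
        if err not in RETRYABLE:
            break  # permanent failure: retrying will not help
        if attempt < max_tries:
            sleep(delay)  # back off before the next attempt
    return False, attempt
```

Because the loop lives in the forked child, the parent never waits on it, which is what keeps kadmind/kpropd async.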
The RPC only works from kpropd to a local kadmind (in proponly mode) and from kpropd to a remote kadmind; nothing else actually implements RPC (which made the implementation "trickier").
As a refresher, kadmind on the master learns about its slaves from who contacts it and when an UPDATE_OK/UPDATE_NIL is provided. On the next update kadmind processes, it spawns a kprop for each of the slaves to notify them via the kprop (full dump) protocol, but sends an error code just before sending the database (already supported by older versions of kpropd). Upon receipt of this error, kpropd sends a USR1 to the iprop portion of kpropd (the USR1 signal is already supported), telling it to expire the timer and poll immediately.
In tree mode, a kadmind exists on intermediate hosts (supporting only the iprop RPC), and kpropd also attempts to signal the local kadmind (iprop interface) of any changes it has processed locally by sending it a NULL RPC. This last piece therefore allows full tree replication in near real time, or at least that is the theory (the high-load condition is the only thing I am trying to optimize).
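As a rough illustration of the "expire the timer and poll immediately" step, the sketch below shows a poll wait that a USR1 handler can cut short. It is not how kpropd is written (kpropd is C); `poll_now`, `on_usr1`, `install_handler`, and `wait_for_next_poll` are hypothetical names, and the use of a `threading.Event` is just one convenient way to model the timer.

```python
import signal
import threading

# Set when a notification arrives and the poll timer should expire early.
poll_now = threading.Event()

def on_usr1(signum, frame):
    # Signal handler: only flag the event; the poll loop does the real work.
    poll_now.set()

def install_handler():
    # Register the handler for SIGUSR1 (Unix only).
    signal.signal(signal.SIGUSR1, on_usr1)

def wait_for_next_poll(interval):
    """Wait up to `interval` seconds, returning early if notified.

    Returns "notified" if USR1 cut the wait short, otherwise "timer".
    """
    notified = poll_now.wait(timeout=interval)
    poll_now.clear()  # re-arm for the next cycle
    return "notified" if notified else "timer"
```

If the notification never arrives, the wait simply runs to the full interval, which matches the fallback to the periodic iprop poll.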
As soon as the fork to spawn kprop succeeds, the client is forgotten. If kprop then fails for any reason other than the fork failing, the slave will not be notified and will receive no more notifications (but the periodic iprop poll will eventually kick in to resync the slave).
From: Nico Williams [mailto:nico at cryptonector.com]
Sent: Wednesday, January 15, 2014 2:55 PM
To: Basch, Richard [Tech]
Cc: basch at alum.mit.edu; krbdev at mit.edu; ghudson at mit.edu; tlyu at mit.edu
Subject: Re: krb5-1.12 - New iprop features - tree propagation & slave notify
On Wed, Jan 15, 2014 at 6:19 AM, Basch, Richard <Richard.Basch at gs.com> wrote:
> I just figured I would post an update since I managed to see slaves lagging in updates under high load (and waiting out the iprop interval). Currently, I post the initial notification, but it appears high-change environments still need more.
> My ideal would be to leave the notification async but, if the notification fails, implement a parameterized retry count before dropping the client.
It's only a ping, right? So keep track of whether a slave was heard from since pinging it and ping those not heard from again up to N times more. (Also, are you using RPC functions for the ping? Are you using the async interfaces?)
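A minimal sketch of that bookkeeping, assuming the master can tell when a slave has polled since it was last pinged. `PingTracker` and its methods are hypothetical names for illustration, not anything in kadmind:

```python
class PingTracker:
    """Track slaves that were pinged but have not been heard from since."""

    def __init__(self, max_retries):
        self.max_retries = max_retries  # N extra pings after the first
        self.pending = {}               # slave -> pings sent so far

    def notify(self, slave):
        # Record the initial ping sent for a new update.
        self.pending[slave] = 1

    def heard_from(self, slave):
        # The slave polled, so it no longer needs another ping.
        self.pending.pop(slave, None)

    def to_reping(self):
        """Return slaves to ping again; drop those out of retries."""
        again = []
        for slave, sent in list(self.pending.items()):
            if sent > self.max_retries:
                del self.pending[slave]  # give up; the periodic poll will resync it
            else:
                self.pending[slave] = sent + 1
                again.append(slave)
        return again
```

Calling `to_reping()` on a timer keeps the notification path async: nothing blocks waiting for a slave, and a slave that stays silent is dropped after N extra pings.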
If there's a max latency tolerance you have to meet, this isn't going to do it. You'll need something with closer to synchronous semantics (e.g., go check all the slaves when you need to know if an update has replicated) or actual sync semantics (the kadm5 operation does not complete till all the slaves are updated).
> Personally, I am probably going to increase my ulog size since the 2500 restriction is gone (the kdc.conf man page was not updated to reflect this).
Of course :) Make sure you're running 64-bit libkadm5srv consumers *only*.
Also, IMO the ulog design needs to be replaced. It cannot handle very large entries, for example.