kadmin incremental propagation full resync multiple processes spawned
Paul B. Henson
henson at acm.org
Fri Nov 4 18:39:41 EDT 2011
On 11/3/2011 8:26 PM, Greg Hudson wrote:
> On 11/03/2011 05:27 PM, Paul B. Henson wrote:
>> could it be that kdb5_util is failing to lock the database because
>> kadmind is monopolizing it resulting in the dump failing?
>
> That's a good theory.
So I wouldn't be having this problem if the propagation weren't falling
back to a full resync...
As I understand it, when a client requests incremental updates, if there
have been any changes made on the server within the last 10 seconds, it
tells the client to try again later. If a client requests incremental
updates, and its current serial number predates the update log, it's
told to do a full resync.
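
To make that understanding concrete, here is a minimal sketch of the
decision as described in the paragraph above. It is a model of this
message's description, not the kadmind source: the function name, the
parameter names, the order of the checks, and the "nothing to send" case
are all assumptions made for illustration.

```python
from enum import Enum


class Reply(Enum):
    UP_TO_DATE = "nothing to send"      # assumed case, not described above
    BUSY = "try again later"
    INCREMENTAL = "send incremental updates"
    FULL_RESYNC = "full resync"


BUSY_WINDOW = 10  # seconds; the figure quoted above


def master_reply(slave_sno, first_sno, last_sno, last_update_time, now):
    """Hypothetical model of the master's answer to an update request.

    first_sno and last_sno are the oldest and newest serial numbers still
    held in the update log; times are in seconds.
    """
    if slave_sno == last_sno:
        return Reply.UP_TO_DATE
    # Any change within the busy window defers the client.
    if now - last_update_time < BUSY_WINDOW:
        return Reply.BUSY
    # The slave's serial number predates the log: entries it needs are gone.
    if slave_sno < first_sno:
        return Reply.FULL_RESYNC
    return Reply.INCREMENTAL
```

Note that in this model a deferred client learns nothing about how close
it is to falling out of the log; it is simply told to come back later.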
About three times a quarter we do some batch processing that runs for a
couple of hours expiring or deleting principals. Generally that is when
it falls back to a full resync. We also create new accounts nightly, and
I suppose that might sometimes take a while too if there are a lot of
them.
It seems the current mechanism is somewhat deficient on a busy server.
Hypothetically, if you had such a busy server that some update always
happened within a 10 second window, you would *never* get to do
incremental propagation; it would always fall back to a full resync on
an ongoing basis. In my case, it would be nice if it could still do some
occasional incremental propagation during our batch jobs so the slaves
would keep up with changes and not trigger a full resync.
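
A toy simulation of the scenario above shows how a long enough batch job
pushes a slave out of the update log. Everything here is an assumption
chosen for illustration (one update every 5 seconds, a slave polling
every 120 seconds, a log holding 1000 entries); it models the behavior as
described in this message, not kadmind itself.

```python
def simulate_batch(update_interval, poll_interval, log_capacity,
                   batch_duration, busy_window=10):
    """Return (busy_replies, outcome) for one slave across one batch job.

    The slave starts in sync at serial 1. The batch makes one update every
    update_interval seconds until batch_duration; the slave polls every
    poll_interval seconds until it gets an answer other than "busy".
    """
    slave_sno = 1
    busy_replies = 0
    now = poll_interval
    while True:
        updates = min(now, batch_duration) // update_interval
        last_sno = 1 + updates
        last_update_time = updates * update_interval
        # The log keeps only the newest log_capacity entries.
        first_sno = max(1, last_sno - log_capacity + 1)

        if last_sno == slave_sno:
            return busy_replies, "up to date"
        if now - last_update_time < busy_window:
            busy_replies += 1          # told to try again later
            now += poll_interval
            continue
        if slave_sno < first_sno:
            return busy_replies, "full resync"
        return busy_replies, "incremental"


if __name__ == "__main__":
    # One-hour batch: 720 updates still fit in a 1000-entry log.
    print(simulate_batch(5, 120, 1000, 3600))
    # Two-hour batch: 1440 updates overflow the log while the slave waits.
    print(simulate_batch(5, 120, 1000, 7200))
```

In this model the slave is deferred on every poll for the whole batch, so
whether it ends with an incremental transfer or a full resync depends
only on whether the batch produced more updates than the log can hold.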
I guess the reason for the implementation is to minimize the number of
incremental transfers? Is that really an issue? What if, in addition to
the busy timeout (which ideally would be a configurable option), there
were also a "do it anyway if it has been X seconds since the last
transfer" rule? I'd probably set that to 5-10 minutes to prevent the
slaves from being too stale, and to make sure incremental propagation
continues to work in the face of a high update rate. Or possibly a "do it
anyway if the current slave serial number is within X entries of falling
out of the update log" rule? The latter might actually be more reliable
in terms of avoiding full resyncs.
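
As a rough sketch of what the two suggestions might look like combined,
under the same simplified model of the update log: none of these names or
defaults exist in krb5. should_defer, max_staleness, and min_headroom are
hypothetical, with 300 seconds picked from the 5-10 minute range above
and 100 entries picked arbitrarily.

```python
def should_defer(now, last_update_time, last_transfer_time,
                 slave_sno, first_sno,
                 busy_window=10, max_staleness=300, min_headroom=100):
    """Return True if the master should tell the slave to try again later.

    Hypothetical policy: defer while the server is busy, unless the slave
    has gone too long without a transfer or is about to fall out of the
    update log.
    """
    recently_updated = (now - last_update_time) < busy_window
    if not recently_updated:
        return False  # server is quiet; transfer now

    # "Do it anyway if it has been X seconds since the last transfer."
    if now - last_transfer_time >= max_staleness:
        return False

    # "Do it anyway if the slave is close to falling out of the log."
    if slave_sno - first_sno < min_headroom:
        return False

    return True
```

The headroom check is the one that directly prevents a full resync, since
it triggers on the condition that causes one; the staleness check only
bounds how far behind the slaves can get.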
--
Paul B. Henson | (909) 979-6361 | http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst | henson at csupomona.edu
California State Polytechnic University | Pomona CA 91768