Database locking during kprops, MIT 1.8

Jeremy Hunt jeremyh at optimation.com.au
Thu Jun 23 20:40:33 EDT 2011


That is good news, Greg.

From my experience I can add two observations. I have hesitated to post them so far because they do not come from fully analyzed tests, but I have checked the logs thoroughly and these are my conclusions.

1. My clients have been running a Kerberos master and slave in geographically isolated locations (over a WAN) for three months now. Their network has occasional glitches that could cause the sort of dropouts mentioned in the proviso, yet no slave hangs have occurred. Incremental propagation has worked flawlessly under a continual churn of password changes, which tens of thousands of users must make every month. There has been no full propagation since the new servers went live.
2. Before going live, during the initial setup phase, the system administrator had mismatched keytabs on the master and slave, so the slave's kiprop process could not authenticate itself when trying to get a full update from the kadmin process on the master. It appears that the kiprop process requested updates continually (in a loop where it tried, failed and tried again) for the roughly 8 hours it took for the problem to be noticed, diagnosed and solved. Though this was not a deliberate test, the observed symptoms match the behaviour of the fix that Greg outlined.

Kind Regards,

Jeremy

Greg Hudson wrote:
> On Wed, 2011-06-22 at 18:01 -0400, John Devitofranceschi wrote:
>> I was wondering if there was any more clarity to be had around 'issue
>> 3' here.
> Issue 3 is about the install guide's proviso that:
>
>     * The "call out to `kprop'" mechanism is a bit fragile; if the
>       `kprop' propagation fails to connect for some reason, the process
>       on the slave may hang waiting for it, and will need to be
>       restarted.
>
> After investigation, I believe this proviso is outdated.  In December
> 2008, Ken committed some changes from Shawn Emery to add a timer to the
> loop which waits for a kprop connection.  So the slave should eventually
> return to the iprop main loop if the master fails to connect for a full
> dump.  (The slave will then probably try to initiate a new full dump.)
>
> Those changes should have landed in 1.7.  I will update our install
> guide to remove the proviso.
>
>
>
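For readers unfamiliar with the pattern Greg describes (a timer on the loop that waits for a kprop connection), here is a minimal sketch in Python. It is not the MIT Kerberos code, which is written in C; the function names, the localhost listener and the timeout values are all assumptions made for illustration. It only shows the general idea: wait for an incoming connection for a bounded time, and hand control back to the caller if nothing arrives.

```python
import select
import socket


def wait_for_full_dump(listener, timeout_seconds):
    """Wait up to timeout_seconds for one incoming connection.

    Hypothetical stand-in for the slave waiting on a full dump.
    Returns the accepted socket, or None if nothing connected in time.
    """
    readable, _, _ = select.select([listener], [], [], timeout_seconds)
    if not readable:
        # Timed out: the caller can return to its main loop
        # instead of blocking forever.
        return None
    conn, _addr = listener.accept()
    return conn


def make_listener():
    """Hypothetical helper: listen on an ephemeral localhost port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    return listener
```

Without the timeout argument, select() would block indefinitely, which corresponds to the hang described in the old proviso. With it, a caller that gets None back is free to loop and request a new full dump.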