Database locking during kprops, MIT 1.8

Jeremy Hunt jeremyh at optimation.com.au
Sun Oct 10 19:46:32 EDT 2010


Hi Dominic,

Thanks for your feedback. You make a good point about reporting a bug, though my memory is that the Kerberos team knew about them all.

The second issue is as designed, and given that kprop is so efficient, it isn't as bad as I first thought when I read about it. Of course your experience does show its downside.

The third issue is reported in the Kerberos team's own documentation. Like I said, I haven't seen it ... yet. I reported it because they have, and you might otherwise miss the warning.

The first issue I assumed the Kerberos development team already knew of. I will have to go back and look at my notes, but I thought I saw something in the code or the documentation that made me think they knew about it and that it wasn't an easy fix. It certainly behoves me to revisit this and to report it as a bug if my memory is wrong. As I say, it is an observation that does not affect us operationally; by the way, a full dump/restore a la kprop will carry profile changes across.
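For what it's worth, the scheduled full propagation I suggested is just a sketch along these lines; the dump file path, and the slave hostnames are illustrative placeholders, not our actual setup:

```shell
#!/bin/sh
# Sketch: run a full propagation from cron at a quiet time.
# DUMPFILE and SLAVES are placeholders for illustration only.

DUMPFILE=/var/krb5kdc/slave_datatrans
SLAVES="kdc2.example.com kdc3.example.com"

# Dump the whole KDC database on the master ...
kdb5_util dump "$DUMPFILE" || exit 1

# ... and push it to each slave's kpropd in turn.
for slave in $SLAVES; do
    kprop -f "$DUMPFILE" "$slave" || echo "kprop to $slave failed" >&2
done
```

Scheduled nightly or weekly, this covers anything iprop missed (profile changes, a lost slave) without relying on the incremental mechanism.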

Regards,

Jeremy

On 8/10/2010 10:42 PM, Dominic Hargreaves wrote:
> On Fri, Oct 08, 2010 at 09:47:50AM +1100, Jeremy Hunt wrote:
>> Hi Dominic,
>>
>> I would recommend having a look at iprop. The update mechanism has less impact than yours.
>>
It checks the log timestamps and, when it notices a difference between two systems, propagates only the differences in the logs rather than the whole database. This should take a lot less than 12 secs for you, so your timeout problem should disappear.
> Yup. I think we have a quick workaround now and upgrading our slaves
> is already on our todo list :)
>
>> The provisos I see are:
>> 1. profile changes do not appear to be logged and propagated via iprop.
>> 2. occasionally iprop gets lost and decides to do a full propagation, for that scenario you will get your timeouts, but it will be a lot less frequent than what you are currently getting.
>> 3. it is documented as losing connectivity occasionally and so may need to be restarted.
>>
>> I am currently in the process of putting this into production and my testing has had pleasing results. I have only been able to force full propagation by generating much heavier loads than we see in our environment. I have not seen 3 at all and 1 is not really an issue for us, it is just something I noticed in testing.
>>
>> You could probably mitigate 1 and 2 by running a full propagation independently at a quiet time once a night, week or whatever.
>>
>> You could mitigate 3 by monitoring the logs and restarting occasionally, probably with a script using the new utility kproplog.
> This sounds like useful information for planning such an upgrade;
> thanks! Out of interest, have you filed bugs for these issues? (1 and 3
> anyway; 2 is as designed and there probably isn't any way round that,
> although increasing the size of the iprop buffer may, AIUI, help.)
>
> Dominic.



