Performance issues with krb5-1.9.1

Chris Hecker checker at d6.com
Tue Aug 9 15:39:46 EDT 2011


Ah, yeah, my tests had krb5kdc at about 50% of one core (slapd was an 
additional 15%), so they weren't completely saturating the machine.

Glad the patch fixed it!

Chris

On 2011/08/09 07:13, Jonathan Reams wrote:
> Chris,
>
> We didn't actually see any problems either until the KDC was under heavy load. The unpatched version of 1.9.1 was and still is running on our secondary KDC without issue, and we had been using 1.9.1 in testing and development for months without issue as well. During the period when we saw the performance degradation, the primary KDC handled 467000 distinct AS/TGS requests, which means it was handling roughly 43 requests per second (not counting lots of retransmits). That is typical of our primary production KDC's workload throughout the day, but we don't have any other KDC that gets that amount of traffic; by contrast, our secondary KDC gets a request once or twice a minute. So it would seem the performance problem only really comes into play when the KDC is under heavy load.
>
> Jonathan
>
> On Aug 9, 2011, at 4:23 AM, Chris Hecker wrote:
>
>>
>> Just another data point:  I'm not seeing this on my locally built 1.9.1
>> (without the attached patch):
>>
>> real    0m41.409s
>> user    0m3.358s
>> sys     0m3.683s
>> finished round 1
>>
>> real    0m35.036s
>> user    0m3.441s
>> sys     0m3.658s
>> finished round 2
>>
>> real    0m44.344s
>> user    0m3.363s
>> sys     0m3.728s
>> finished round 3
>>
>> real    0m40.930s
>> user    0m3.465s
>> sys     0m3.973s
>> finished round 4
>>
>> I had to reduce the number of inner iterations to 300 because my machine
>> is slow.  The variance in the above numbers is because there's a bunch
>> of stuff running on this machine.
>>
>> Chris
>>
>> On 2011/08/08 11:21, Greg Hudson wrote:
>>> On Mon, 2011-08-08 at 11:22 -0400, Jonathan Reams wrote:
>>>> I did some performance testing on our test KDC and was able to
>>>> reproduce the performance issue with 1.9.1.
>>>
>>> I found a regression which would affect these tests, but I'm not sure it
>>> accounts for your global performance issues.
>>>
>>> The KDC in krb5 1.9 isn't supposed to be using an on-disk replay cache,
>>> but due to a bug, it is actually opening and reading a replay cache for
>>> every TGS request, which is significantly less efficient than the 1.8
>>> behavior (using a replay cache which stays open for the lifetime of the
>>> KDC).
>>>
>>> In a test which runs in under five minutes, this regression produces
>>> visible O(n^2) performance characteristics.  This would not necessarily
>>> account for performance degradation over hours, as the performance drag
>>> of the replay cache should become stable after five minutes.  It's
>>> possible that the constant drag was enough to cause the KDC to fall
>>> behind on the request load, but it's also possible that there's a second
>>> problem which isn't so easily reproduced.
>>>
>>> I've attached a patch.  Note that there is a second, in-memory
>>> "lookaside" cache with O(n^2) performance characteristics in the short
>>> term, which holds queries for up to two minutes.  You may see a slight
>>> degradation in performance in test cases due to this.  You can
>>> temporarily rebuild the kdc directory with "make clean;
>>> CPPFLAGS=-DNOCACHE" if you want to remove this variable from your
>>> performance tests.
>>>
>>>
>>>
>>>
>>> ________________________________________________
>>> Kerberos mailing list           Kerberos at mit.edu
>>> https://mailman.mit.edu/mailman/listinfo/kerberos
>>
>
>
>

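Jonathan's figures (467000 requests at roughly 43 requests per second, against a secondary KDC that sees a request once or twice a minute) imply a rough length for the degraded period and a size for the load gap between the two KDCs. A quick back-of-the-envelope check, using only the numbers quoted in his message; the variable names are mine, and reading "once or twice a minute" as 1 to 2 requests per 60 seconds is an assumption:

```python
# Figures quoted in Jonathan's message.
total_requests = 467_000      # distinct AS/TGS requests during the slow period
primary_rate = 43.0           # requests per second on the primary KDC

# Implied length of the degraded period.
window_seconds = total_requests / primary_rate
window_hours = window_seconds / 3600

# Assumed reading of "once or twice a minute" for the secondary KDC.
secondary_low = 1 / 60        # requests per second
secondary_high = 2 / 60

# How many times busier the primary is than the secondary.
ratio_low = primary_rate / secondary_high
ratio_high = primary_rate / secondary_low

print(f"degraded period: about {window_hours:.1f} hours")
print(f"primary load: roughly {ratio_low:.0f}x to {ratio_high:.0f}x the secondary")
```

That works out to about three hours of degraded service, with the primary carrying on the order of a thousand times the secondary's load, which fits the observation that the secondary never showed the problem.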

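Chris's four timing rounds can be compared more easily as plain numbers. The sketch below assumes the figures are in the "real XmY.YYYs" format shown in his message; the helper name real_seconds is hypothetical, and only the "real" lines from his output are reproduced:

```python
import re
from statistics import mean

# Matches lines such as "real    0m41.409s".
REAL_RE = re.compile(r"real\s+(\d+)m([\d.]+)s")


def real_seconds(text):
    """Return every 'real' wall-clock time found in text, in seconds."""
    return [int(m) * 60 + float(s) for m, s in REAL_RE.findall(text)]


rounds = real_seconds("""
real    0m41.409s
real    0m35.036s
real    0m44.344s
real    0m40.930s
""")

print(f"mean: {mean(rounds):.2f}s")
print(f"spread: {max(rounds) - min(rounds):.2f}s")
print(f"round 4 minus round 1: {rounds[-1] - rounds[0]:+.2f}s")
```

The rounds average a little over 40 seconds and the last round is no slower than the first, so there is no round-over-round growth in these numbers, consistent with Chris not seeing the slowdown on his build.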

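Greg's point about the replay cache (reread on every TGS request, with the drag stabilizing after five minutes) can be illustrated with a toy model. This is not the krb5 code: it simply assumes that each request adds one entry, that each request scans every unexpired entry, and that requests arrive at a steady rate. The function name scan_costs and its parameters are made up for the illustration; the 43 requests per second and the five-minute lifetime come from the messages above.

```python
from collections import deque


def scan_costs(n_requests, rate_per_sec, lifetime_sec):
    """Entries read per request when the whole cache is rescanned each time.

    Toy model only: request i arrives at time i / rate_per_sec, every
    request adds one entry, and an entry is dropped once it is older
    than lifetime_sec.
    """
    window = rate_per_sec * lifetime_sec  # most entries alive at once
    live = deque()                        # indices of unexpired entries
    costs = []
    for i in range(n_requests):
        while live and i - live[0] > window:
            live.popleft()                # entry has aged out
        costs.append(len(live))           # one full scan of the cache
        live.append(i)
    return costs


# Short test (well under the lifetime): total work is n * (n - 1) / 2.
short = scan_costs(1_000, rate_per_sec=43, lifetime_sec=300)
print(sum(short))

# Long run: per-request cost is capped at rate * lifetime entries.
long_run = scan_costs(30_000, rate_per_sec=43, lifetime_sec=300)
print(long_run[-1])
```

In a test shorter than the entry lifetime nothing expires, so total work grows quadratically with the number of requests. Once the run outlasts the lifetime, the per-request cost levels off at a constant, which is the "stable drag" Greg describes. Under the same assumptions, lifetime_sec=120 gives the analogous picture for the in-memory lookaside cache he mentions.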