Performance issues with krb5-1.9.1

Jonathan Reams jr3074 at columbia.edu
Mon Aug 8 11:22:38 EDT 2011


We recently upgraded our primary KDC from 1.8.3 to 1.9.1, and within a few hours performance was so bad that we had to roll back. We're running a plain vanilla instance of Kerberos, supporting a variety of clients (versions spanning 1.4-1.8). From the perspective of the KDC there was no indication of a problem: it was issuing tickets just fine. But the response time for a single request got progressively worse, until clients started to fail over to the secondary KDC (also running 1.9.1), and finally major applications that depend on Kerberos, such as our single sign-on web authentication, campus email, and RADIUS for wireless access, started to fail outright.

I did some performance testing on our test KDC and was able to reproduce the performance issue with 1.9.1. The following test runs 4 rounds of 1000 iterations, where each iteration gets a TGT and makes a TGS request, with a 5-second sleep between rounds:
[root@naiad ~]# for i in `seq 1 4`; do time for j in `seq 1 1000`; do kinit -k; kvno iprop_monitor | grep -v kvno; done; echo finished round $i; sleep 5; done

real	0m29.597s
user	0m4.631s
sys	0m14.094s
finished round 1

real	0m37.726s
user	0m4.669s
sys	0m14.063s
finished round 2

real	0m46.962s
user	0m4.570s
sys	0m14.250s
finished round 3

real	0m56.288s
user	0m4.665s
sys	0m14.083s
finished round 4

The real (wall-clock) time grows by roughly 8 to 9 seconds with each round, while the user and sys times stay flat. This does not occur when running 1.8.3:
[root@naiad ~]# for i in `seq 1 4`; do time for j in `seq 1 1000`; do kinit -k; kvno iprop_monitor | grep -v kvno; done; echo finished round $i; sleep 5; done

real	0m23.648s
user	0m4.747s
sys	0m13.969s
finished round 1

real	0m23.089s
user	0m4.688s
sys	0m13.902s
finished round 2

real	0m23.327s
user	0m4.654s
sys	0m13.855s
finished round 3

real	0m23.683s
user	0m4.724s
sys	0m14.005s
finished round 4
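
For anyone who wants to rerun the comparison, the one-liner above could be unrolled into something like the sketch below. It assumes the same host keytab and the same iprop_monitor service principal as the test above. The names ROUNDS, ITERATIONS, PAUSE, run_once and run_rounds are made up here for illustration. It measures with `date +%s`, so the per-round times are whole seconds only, which is coarser than `time` but enough to show growth of several seconds per round.

```shell
# Sketch of the benchmark loop in function form (illustrative names, not from the original test).
ROUNDS=${ROUNDS:-4}            # number of rounds
ITERATIONS=${ITERATIONS:-1000} # kinit + kvno pairs per round
PAUSE=${PAUSE:-5}              # seconds to sleep between rounds

# One iteration: get a TGT from the keytab, then make a TGS request.
# The grep drops kvno's normal "kvno = N" output, as in the original command.
run_once() {
    kinit -k
    kvno iprop_monitor | grep -v kvno
}

# Run all rounds and print the elapsed wall-clock seconds for each.
run_rounds() {
    local i j start end
    for i in $(seq 1 "$ROUNDS"); do
        start=$(date +%s)
        for j in $(seq 1 "$ITERATIONS"); do
            run_once
        done
        end=$(date +%s)
        echo "round $i: $((end - start))s"
        sleep "$PAUSE"
    done
}
```

Sourcing this file and calling `run_rounds` against each KDC version should give one line per round, which makes it easier to see whether the elapsed time stays flat or keeps climbing.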

Has anyone else seen this kind of performance problem with 1.9.1? Is there anything about the KDC configuration that needs to change between 1.8 and 1.9.1 to prevent catastrophic performance degradation? The secondary KDC is still running 1.9.1 without issue. Is there something that changed between 1.8 and 1.9 that affects how the KDC responds under heavy load?

Jonathan Reams
CUIT Systems Engineer
Columbia University




