daily latency spike

Paul B. Henson henson at acm.org
Sun Nov 1 21:15:09 EST 2015


We currently have two Kerberos realms, each consisting of three systems
(1 physical box and 2 virtual machines). For a while now we've been
having an issue where once a day, almost exactly every 24 hours, the two
physical boxes have a latency spike and don't respond to authentication
requests, to the point where our monitoring system alarms that they're
down. Interestingly, the timing of this daily outage seems to be tied to
when the services were last restarted: it recurs at 24-hour intervals
from that point.
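
One way to check that correlation is to compare each alarm timestamp
against the last restart time, modulo 24 hours. The following is only a
minimal sketch: the function names and the five-minute tolerance are
assumptions made up for illustration, not anything provided by the KDC
or by our monitoring system, and the timestamps would have to come from
the actual service and alarm logs.

```python
from datetime import datetime, timedelta

DAY = timedelta(hours=24)


def offset_from_restart(restart: datetime, spike: datetime) -> timedelta:
    """Distance from a spike to the nearest 24-hour multiple after restart."""
    remainder = (spike - restart) % DAY  # timedelta % timedelta yields a timedelta
    # A spike slightly before the 24h mark leaves a remainder just under 24h,
    # so take whichever side of the boundary is closer.
    return min(remainder, DAY - remainder)


def aligned_with_restart(restart, spikes, tolerance=timedelta(minutes=5)):
    """True if every spike lands within `tolerance` of a 24h multiple."""
    return all(offset_from_restart(restart, s) <= tolerance for s in spikes)
```

If the alarms all fall within the tolerance for one restart time, and
shift along with it after the next restart, that would support the idea
that the spike is driven by a timer started at service startup rather
than by something scheduled at a fixed time of day.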

We see a small spike in I/O requests that seems to correspond with the
delayed authentication processing. The two VMs in each realm also see
similar spikes, but the SAN they are on has plenty of IOPS and they
don't seem to have any trouble continuing to respond to requests during
the spike.

We're using OpenLDAP with the LDAP backend. Is there some type of
once-daily housekeeping the KDC does that would cause this I/O spike and
response time issue? If so, is there some way to tune it so as not to be
so disruptive, or would we need physical boxes with better I/O capacity
to avoid the daily delay?

Thanks...


