Dynamic skew (Re: review of Projects/replay_cache_collision_avoidance, ending Jan. 12)
Nicolas.Williams at sun.com
Tue Dec 30 15:44:23 EST 2008
On Sun, Dec 28, 2008 at 05:04:25PM -0500, Tom Yu wrote:
I've long wondered if we couldn't play with making the skew dynamic to
make the replay cache go faster by avoiding the need to commit the
rcache to stable storage.
On rcache recovery start with a skew of zero and grow it up to 5 minutes
as time passes.
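
As a rough illustration of the growth rule above, here is a minimal sketch. It assumes the skew grows by one second per elapsed second (a rate this message does not specify) and is capped at 300 seconds. The names `dynamic_skew` and `accepts` are made up for this example and are not part of any Kerberos API.

```python
MAX_SKEW = 300.0  # the 5-minute limit mentioned above, in seconds


def dynamic_skew(elapsed, initial_skew=0.0, max_skew=MAX_SKEW):
    """Skew allowed `elapsed` seconds after the rcache was recreated.

    Assumption: linear growth of one second of skew per second elapsed.
    """
    if elapsed < 0:
        raise ValueError("elapsed must be non-negative")
    return min(max_skew, initial_skew + elapsed)


def accepts(auth_time, now, recovered_at, initial_skew=0.0):
    """True if an authenticator timestamp falls inside the current window."""
    skew = dynamic_skew(now - recovered_at, initial_skew)
    return abs(now - auth_time) <= skew
```

With a recovery at t = 1000 and an initial skew of zero, the window at t = 1010 is plus or minus 10 seconds, so its lower edge is still t = 1000. `accepts(999, 1010, 1000)` is false, while `accepts(1005, 1010, 1000)` is true.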
This means that as long as clients and servers have reasonably
synchronized clocks then most outages due to replay cache loss will be
Even better: at boot time you can set the initial skew to half the time
it took to reach that point from the beginning of the boot.
If you do this then you don't need to commit the replay cache to stable
storage: by the time the first AP-REQ can arrive you'll be past the time
window during which replays could occur, and typical server
time-to-boot/2 will be a reasonable time skew for client and server
Time to boot is difficult to pin down, unfortunately. There's time
spent on hardware POST, BIOS, ... measurements of which may be
unavailable. To estimate POST and such time one needs to know whether a
boot was cold, warm or fast ("fast reboot" == old kernel loads new
kernel and transfers control without taking a trip through the BIOS).
So the conservative thing to do is to ignore time spent in POST, BIOS,
... and focus only on kernel start->rcache creation time. Unfortunately
there are no portable APIs that I know of for obtaining time to boot
from the point the OS kernel started keeping time.
To implement a dynamic time skew all you need is an entry in the rcache
that tells you: a) the time it was created, b) the skew at the time
the rcache was created. Then the current skew is trivial to calculate.
(a) is also trivial to determine (whoever creates the rcache knows).
(b) can trivially default to zero, and if you have a way to estimate
time-to-boot conservatively, then you can have an rc script (or SMF
service) to create the rcache at boot time with skew == half the time it
took to reach that point from boot.
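
The bookkeeping described above could look something like the following sketch. `SkewRecord` and `record_at_boot` are hypothetical names, the one-second-per-second growth rate is an assumption, and the time since kernel start is taken as a parameter because, as noted, there is no portable way to obtain it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkewRecord:
    created: float       # (a) time the rcache was created, seconds since epoch
    initial_skew: float  # (b) skew in force at creation time, seconds

    def current_skew(self, now, max_skew=300.0):
        # Assumption: skew grows one second per second since creation.
        elapsed = max(0.0, now - self.created)
        return min(max_skew, self.initial_skew + elapsed)


def record_at_boot(now, seconds_since_kernel_start):
    """Build the record an rc script might write when it creates the rcache."""
    return SkewRecord(created=now,
                      initial_skew=seconds_since_kernel_start / 2.0)
```

For example, an rcache created 40 seconds after kernel start gets an initial skew of 20 seconds, which has grown to 120 seconds after a further 100 seconds of uptime.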
If you ignore time-to-boot and always default the initial skew to zero
then this is just so trivial to implement that you might as well do it
while implementing the replay_cache_collision_avoidance project, with an
option to control whether to fsync() the rcache that defaults to yes.
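
A sketch of what such an option might look like follows. The function name `append_rcache_entry` and the `sync` parameter are invented for illustration and do not reflect the actual rcache code or its file format.

```python
import os


def append_rcache_entry(path, entry, sync=True):
    """Append one entry (bytes) to the rcache file at `path`.

    `sync` is the hypothetical knob: True (the default) forces the data
    to stable storage with fsync(); False leaves it to the OS cache.
    """
    with open(path, "ab") as f:
        f.write(entry)
        f.flush()                 # push Python's buffer to the OS
        if sync:
            os.fsync(f.fileno())  # block until the OS reports it written
```

With `sync=False` a crash can lose recent entries, which is the case the dynamic skew is meant to cover.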