k5start -K and ticket renewals

Nico Williams nico at cryptonector.com
Tue Jan 28 15:00:03 EST 2014


On Tue, Jan 28, 2014 at 12:42 PM, Russ Allbery <eagle at eyrie.org> wrote:
> Nico Williams <nico at cryptonector.com> writes:
>> As to atomicity... the FILE ccache currently depends on POSIX file
>> locking at least for additions of tickets, and this is a disaster
>> because POSIX file locking is a disaster (because of its
>> drop-locks-on-first-close semantics).  But yes, *renewal* and refresh
>> should always result in a rename(2) into place, which should be atomic.
>
> By that do you mean that the Kerberos libraries do that, or that I need to
> do that in k5start?

I mean that rename-into-place is what the krb5 libraries ought to do.
It's not what MIT does: MIT truncates the file to initialize it.
Heimdal does rename in the krb5_cc_move() case, but only in that case;
otherwise Heimdal unlink()s the old ccache and then creates a new one
to initialize.
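
For illustration, here is a minimal sketch of rename-into-place, written
in Python rather than the C of the krb5 libraries.  It is not code from
MIT or Heimdal; the function name replace_file_atomically and the
".tmp-cc-" prefix are made up for this example.  The idea is to write the
complete new contents to a temporary file in the same directory and then
rename(2) it over the old file, so a reader sees either the old file or
the new one, never a truncated or half-written one.

```python
import os
import tempfile


def replace_file_atomically(path: str, data: bytes) -> None:
    """Write data to a temp file beside path, then rename(2) it into place.

    Hypothetical helper for illustration; not a krb5 library function.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem as the target,
    # otherwise rename(2) cannot be atomic. mkstemp creates it mode 0600.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-cc-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # get the data on disk before publishing it
        os.rename(tmp_path, path)   # atomically replaces any existing file
    except BaseException:
        # Do not leave the temp file behind if anything failed.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Note that no file lock is needed on the replacement path: a process that
already has the old file open keeps reading the old, intact contents.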

> I assume krenew is fine (to the extent that POSIX locking is fine).

Sort of.

POSIX file locking within a single-threaded program can be sufficient,
and it is when that program is truncating and writing a new ccache
under lock.  But you might be racing against threaded programs, and
since the libraries (MIT and Heimdal both) misuse POSIX file locks...

POSIX file locking is utterly useless in a threaded program unless one
also applies other synchronization within that process with great care
to never close(2) a file descriptor for a regular file that another
thread might also have independently opened and locked (see what
SQLite3 does), and even then there are situations where that's
insufficient (namely, when multiple libraries in the same process
might be accessing the same file).
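
The drop-locks-on-first-close behavior described above can be observed
with a small Python sketch (Unix only).  It assumes that Python's
fcntl.lockf() is implemented with POSIX fcntl(2) record locks, which is
what its documentation describes; the helper names are made up for this
example.  A forked child stands in for "some other process" and probes
whether the parent's lock is still in effect.

```python
import fcntl
import os
import tempfile


def other_process_can_lock(path: str) -> bool:
    """Fork a child and report whether it can take an exclusive POSIX lock."""
    pid = os.fork()
    if pid == 0:
        # Child: POSIX locks are not inherited across fork(), so this is
        # a genuinely separate lock owner.
        status = 1
        try:
            fd = os.open(path, os.O_RDWR)
            try:
                fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
                status = 0  # lock acquired: nobody else holds it
            except OSError:
                status = 1  # lock is held by another process
        finally:
            os._exit(status)
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status) == 0


def demo_lock_lost_on_close() -> tuple:
    """Return (lock held before the stray close, lock held after it)."""
    fd, path = tempfile.mkstemp()
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX)           # take a write lock via fd
        held_before = not other_process_can_lock(path)
        second = os.open(path, os.O_RDONLY)      # unrelated open of same file
        os.close(second)                         # ...and this close drops the lock
        held_after = not other_process_can_lock(path)
        return held_before, held_after
    finally:
        os.close(fd)
        os.unlink(path)
```

On a system with POSIX lock semantics, demo_lock_lost_on_close() should
return (True, False): the lock taken through the first descriptor is gone
as soon as any other descriptor for the same file is closed in the same
process, even though the first descriptor was never touched.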

In particular it's possible for a threaded app and krenew to race.
Imagine a thread A in an app acquiring a read lock to read the ccache
and another thread B blocking to get a write lock to add a service
ticket (it's already read the ccache and not found that ticket)...

Thread A finishes its search and drops its lock, B gets its write
lock, and then A closes its file descriptor, which also drops B's
lock.  Now krenew can get its write lock and start writing while B
still believes it holds the lock.  End result: B can write over
krenew's entries.

Most likely B won't write over krenew's entries because most likely B
will have found the end of the ccache farther off than where krenew
would leave it.  Instead B will likely cause a hole between the end of
krenew's entries and B's, but if krenew were to obtain a *larger*
ticket than before, then B could write over krenew's writes.
Regardless, threaded ccache reader-writers can step on each other, so
krenew really plays no special role here.
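
The two outcomes above (a hole versus an overwrite) can be replayed
deterministically.  The sketch below is a simplification: it runs the
file operations one after another in a single process instead of racing
real threads, and it uses short byte strings as stand-ins for ccache
entries rather than the real ccache format.  The function name
stale_append is made up for this example.

```python
import os
import tempfile


def stale_append(initial: bytes, renewed: bytes, new_entry: bytes) -> bytes:
    """Replay: B finds the end, krenew rewrites, B writes at the stale end."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, initial)
        # Thread B scans the file and remembers where it ended.
        stale_end = os.lseek(fd, 0, os.SEEK_END)
        # krenew (wrongly able to lock) truncates and writes renewed tickets.
        os.ftruncate(fd, 0)
        os.pwrite(fd, renewed, 0)
        # B, believing it still holds the lock, writes at its stale offset.
        os.pwrite(fd, new_entry, stale_end)
        os.lseek(fd, 0, os.SEEK_SET)
        return os.read(fd, 4096)
    finally:
        os.close(fd)
        os.unlink(path)
```

If the renewed contents are shorter than before,
stale_append(b"TGT-old-old", b"TGT-new", b"SVC") returns
b"TGT-new\x00\x00\x00\x00SVC", with a zero-filled hole between the two
writes.  If they are longer, stale_append(b"TGT-old", b"TGT-new-larger",
b"SVC") returns b"TGT-newSVCrger": B's entry lands in the middle of what
krenew wrote.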

We've observed ccache corruption that can only (I think) be explained
by POSIX file locking issues in threaded programs.

The only solution is to always rename into place.  See my separate
posts to the list about this.

Nico
--

