Concurrency issues with FILE ccache
Greg Hudson
ghudson at mit.edu
Fri Apr 9 14:24:04 EDT 2021
On 4/9/21 11:35 AM, Osipov, Michael (LDA IT PLM) wrote:
> I am quite sure that this is a race condition where stat() is performed,
> file does not exist, open() with write is performed, in parallel it is
> already created and the later call returns in EEXIST.

I agree, except I think it's just unlink() and open(O_CREAT|O_EXCL)
calls with no stat(). I had erroneously assumed that the unexpected
error was happening inside fcc_store() because of "Failed to store
credentials" in the message, but that string turns out to be from
get_in_tkt.c in a block of code that also calls krb5_cc_initialize().
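
The interleaving behind that EEXIST can be replayed deterministically in a single thread. This is only a minimal sketch, assuming POSIX; create_exclusive() and replay_self_race() are hypothetical names for illustration, not libkrb5 functions, and the real fcc_initialize() does more than this.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: the exclusive-create step on its own.
 * Returns 0 on success or the errno value from open(). */
static int create_exclusive(const char *path)
{
    int fd = open(path, O_CREAT | O_EXCL | O_RDWR, 0600);

    if (fd < 0)
        return errno;
    close(fd);
    return 0;
}

/* Replay the losing interleaving of two callers, A and B, that each
 * do unlink() followed by open(O_CREAT|O_EXCL) on the same path:
 *   A: unlink   B: unlink   B: create   A: create
 * Returns A's result, or -1 if B's create unexpectedly failed. */
static int replay_self_race(const char *path)
{
    int ret_a, ret_b;

    unlink(path);                     /* A removes the old cache file */
    unlink(path);                     /* B does the same; ENOENT is ignored */
    ret_b = create_exclusive(path);   /* B creates the file first */
    ret_a = create_exclusive(path);   /* A now finds the file present */
    unlink(path);                     /* clean up the demo file */
    return (ret_b == 0) ? ret_a : -1;
}
```

With a writable path, A's create comes back with EEXIST even though A had just removed the file itself.
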
The fcc_initialize() EEXIST self-race has existed since 1.0. I'd
speculate that the original developers' assumption was that lots of
processes might be competing to use a file ccache, but that creating
ccaches would be a rare and one-at-a-time affair (happening at login or
when a user runs "kinit"). With client keytab support, that is no
longer the case; it's easy to have multiple threads or processes
competing to create or refresh a cache as part of gss_acquire_cred() or
gss_init_sec_context().

Just fixing the fcc_initialize() race wouldn't really solve the problem;
there would still be a window between krb5_cc_initialize() and
krb5_cc_store_cred() where other threads (or processes) would see an
initialized cache with no TGT in it, and would fail the
gss_init_sec_context() call. This ticket describes that problem and
some possible solutions:

https://krbdev.mit.edu/rt/Ticket/Display.html?id=7707

Heimdal has implemented option 5. I'm not wild about it and it won't
work with other ccache types, but it's a working stopgap and it can
always be backed out in favor of a different solution later.
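
For illustration only, one general way to avoid exposing a file that exists but has no contents yet is to write the complete file under a temporary name and rename() it into place, since POSIX rename() replaces the target atomically. This is a generic technique, not necessarily any of the options in the ticket, and publish_cache() is a hypothetical helper, not part of libkrb5. Like any file-specific approach, it would not carry over to other ccache types.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes of data to a temporary file
 * next to path, then atomically rename it over path.  Readers see
 * either the previous file or the complete new one, never an empty
 * or partially written file.  Returns 0 or an errno value. */
static int publish_cache(const char *path, const void *data, size_t len)
{
    char tmp[4096];
    int fd, saved;
    ssize_t n;

    if (snprintf(tmp, sizeof(tmp), "%s.XXXXXX", path) >= (int)sizeof(tmp))
        return ENAMETOOLONG;

    fd = mkstemp(tmp);              /* unique name, created with mode 0600 */
    if (fd < 0)
        return errno;

    n = write(fd, data, len);
    if (n < 0 || (size_t)n != len) {
        saved = (n < 0) ? errno : EIO;   /* treat a short write as an error */
        close(fd);
        unlink(tmp);
        return saved;
    }
    if (close(fd) != 0) {
        saved = errno;
        unlink(tmp);
        return saved;
    }
    if (rename(tmp, path) != 0) {   /* atomic replacement of the target */
        saved = errno;
        unlink(tmp);
        return saved;
    }
    return 0;
}
```

Concurrent callers each work on their own temporary file, so none of them can fail with EEXIST; whichever rename() happens last determines the final contents.
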