From aglo at umich.edu Thu Feb 20 16:47:50 2025
From: aglo at umich.edu (Olga Kornievskaia)
Date: Thu, 20 Feb 2025 16:47:50 -0500
Subject: is krb5_cc_initialize() thread safe
Message-ID:

Hi folks,

I haven't been able to find in documentation whether or not
krb5_cc_initialize() is thread safe? From a practical point of view, I
believe it is not. But a call to "krb5_is_thread_safe() returns true".

In NFS's gssd implementation, while trying to refresh credentials, the
code tries to store creds and as part of that calls into
krb5_cc_initialize() to initialize the credential cache (of either
FILE or MEMORY type). We've observed that if two threads try to update
the same cache and thus call into krb5_cc_initialize(), one succeeds
and the other fails with "Internal credentials cache error". Is that
unexpected?

Thank you.

From kenh at cmf.nrl.navy.mil Thu Feb 20 18:21:44 2025
From: kenh at cmf.nrl.navy.mil (Ken Hornstein)
Date: Thu, 20 Feb 2025 18:21:44 -0500
Subject: is krb5_cc_initialize() thread safe
In-Reply-To:
References:
Message-ID: <202502202321.51KNLi7e002896@hedwig.cmf.nrl.navy.mil>

>I haven't been able to find in documentation whether or not
>krb5_cc_initialize() is thread safe? From a practical point of view, I
>believe it is not. But a call to "krb5_is_thread_safe() returns true".
>
>In NFS's gssd implementation, while trying to refresh credentials, the
>code tries to store creds and as part of that calls into
>krb5_cc_initialize() to initialize the credential cache (of either
>FILE or MEMORY type). We've observed that if two threads try to update
>the same cache and thus call into krb5_cc_initialize(), one succeeds
>and the other fails with "Internal credentials cache error". Is that
>unexpected?

It sure seems like it's supposed to be thread safe.
On a relatively modern MIT Kerberos, in krb5_mcc_initialize() (the
initialize function for the MEMORY cache):

    k5_cc_mutex_lock(context, &d->lock);
    ret = init_mcc_cache(context, d, princ);
    k5_cc_mutex_unlock(context, &d->lock);

But digging into this further, the only way init_mcc_cache() can fail
is if krb5_copy_principal() fails, and there's no way that can return
"Internal credentials cache error". So if you're really getting that
from a MEMORY cache, I don't see how that's possible (it doesn't seem
like there is any code in the memory cache that returns that error).

In fcc_initialize() (the equivalent function for the FILE cache) the
whole function has a k5_cc_mutex_lock()/k5_cc_mutex_unlock() around
the guts of the function, and the actual cache file is locked as well.
The FILE cache code can return KRB5_FCC_INTERNAL, but only if a system
call returns one of a set of specific error codes, which right now are:

    case EINVAL:
    case EEXIST:
    case EFAULT:
    case EBADF:
#ifdef EWOULDBLOCK
    case EWOULDBLOCK:
#endif
        ret = KRB5_FCC_INTERNAL;

--Ken

From ghudson at mit.edu Thu Feb 20 18:25:15 2025
From: ghudson at mit.edu (Greg Hudson)
Date: Thu, 20 Feb 2025 18:25:15 -0500
Subject: is krb5_cc_initialize() thread safe
In-Reply-To:
References:
Message-ID: <87a5fae5-4933-430c-b8d2-ed62f0530386@mit.edu>

On 2/20/25 16:47, Olga Kornievskaia wrote:
> I haven't been able to find in documentation whether or not
> krb5_cc_initialize() is thread safe? From a practical point of view, I
> believe it is not. But a call to "krb5_is_thread_safe() returns true".

krb5_is_thread_safe() just reports whether the library was compiled with
mutex calls.

krb5_cc_initialize() is thread-safe to the extent that it shouldn't
exhibit memory errors if called in multiple threads with the same
krb5_ccache (and different krb5_contexts, as krb5_context is not a
thread-protected object).
It is inherently not very concurrency-safe, whether the concurrency is
due to threads or processes, because at the end of the call the cache is
empty (no TGT) and therefore not usable to obtain service tickets.
(Well, "inherently" according to the traditional semantics. Heimdal has
adopted a behavior change to krb5_cc_initialize() where initializing a
cache handle isn't externally visible until the first non-config
credential is stored. We didn't go that route.)

We did add a way in 1.20 to atomically replace a credentials cache of
most types: krb5_cc_move() a MEMORY cache onto the target. This method
is used automatically by gss_store_cred()/gss_store_cred_into(), and by
krb5_get_init_creds_*() when an output cache is specified using
krb5_get_init_creds_opt_set_out_ccache(). The commit that added this
behavior is:

https://github.com/krb5/krb5/commit/371f09d4bf4ca0c7ba15c5ef909bc35307ed9cc3

> In NFS's gssd implementation, while trying to refresh credentials, the
> code tries to store creds and as part of that calls into
> krb5_cc_initialize() to initialize the credential cache (of either
> FILE or MEMORY type). We've observed that if two threads try to update
> the same cache and thus call into krb5_cc_initialize(), one succeeds
> and the other fails with "Internal credentials cache error". Is that
> unexpected?

In this scenario the two threads are probably using separate krb5_ccache
handles to perform the initialize and store operations, so libkrb5
locking the handles doesn't prevent the operations from trying to do
their work at the same time--the same as if two different processes
running two different programs were trying to initialize and store to
FILE caches with the same filename. I can see one order of
sub-operations which would generate that error message, right off the
bat: we start by unlink()ing the file and then opening it with O_EXCL.
If these operations are interleaved (T1 unlink, T2 unlink, T1 open, T2
open), the second open() will return an EEXIST error, which is
translated to KRB5_FCC_INTERNAL.

From aglo at umich.edu Thu Feb 20 19:31:49 2025
From: aglo at umich.edu (Olga Kornievskaia)
Date: Thu, 20 Feb 2025 19:31:49 -0500
Subject: is krb5_cc_initialize() thread safe
In-Reply-To: <202502202321.51KNLi7e002896@hedwig.cmf.nrl.navy.mil>
References: <202502202321.51KNLi7e002896@hedwig.cmf.nrl.navy.mil>
Message-ID:

On Thu, Feb 20, 2025 at 6:21 PM Ken Hornstein wrote:
>
> >I haven't been able to find in documentation whether or not
> >krb5_cc_initialize() is thread safe? From a practical point of view, I
> >believe it is not. But a call to "krb5_is_thread_safe() returns true".
> >
> >In NFS's gssd implementation, while trying to refresh credentials, the
> >code tries to store creds and as part of that calls into
> >krb5_cc_initialize() to initialize the credential cache (of either
> >FILE or MEMORY type). We've observed that if two threads try to update
> >the same cache and thus call into krb5_cc_initialize(), one succeeds
> >and the other fails with "Internal credentials cache error". Is that
> >unexpected?
>
> It sure seems like it's supposed to be thread safe. On a relatively
> modern MIT Kerberos, in krb5_mcc_initialize() (the initialize function
> for the MEMORY cache):
>
>     k5_cc_mutex_lock(context, &d->lock);
>     ret = init_mcc_cache(context, d, princ);
>     k5_cc_mutex_unlock(context, &d->lock);
>
> But digging into this further, the only way init_mcc_cache() can fail
> is if krb5_copy_principal() fails, and there's no way that can return
> "Internal credentials cache error". So if you're really getting that
> from a MEMORY cache,

In my testing I've had gssd set up to use the "default" ccache type,
which is FILE. I haven't tried whether setting it to use_memory
(switching to MEMORY) works better.
But regardless, gssd needs to do something "better" for the case of
FILE credential type and I'm trying to figure out what that should be.

> I don't see how that's possible (it doesn't seem
> like there is any code in the memory cache that returns that error).
>
> In fcc_initialize() (the equivalent function for the FILE cache) the
> whole function has a k5_cc_mutex_lock()/k5_cc_mutex_unlock() around
> the guts of the function, and the actual cache file is locked as well.
> The FILE cache code can return KRB5_FCC_INTERNAL, but only if a system
> call returns one of a set of specific error codes, which right now are:
>
>     case EINVAL:
>     case EEXIST:
>     case EFAULT:
>     case EBADF:
> #ifdef EWOULDBLOCK
>     case EWOULDBLOCK:
> #endif
>         ret = KRB5_FCC_INTERNAL;
>
> --Ken

From aglo at umich.edu Thu Feb 20 19:41:14 2025
From: aglo at umich.edu (Olga Kornievskaia)
Date: Thu, 20 Feb 2025 19:41:14 -0500
Subject: is krb5_cc_initialize() thread safe
In-Reply-To: <87a5fae5-4933-430c-b8d2-ed62f0530386@mit.edu>
References: <87a5fae5-4933-430c-b8d2-ed62f0530386@mit.edu>
Message-ID:

On Thu, Feb 20, 2025 at 6:25 PM Greg Hudson wrote:
>
> On 2/20/25 16:47, Olga Kornievskaia wrote:
> > I haven't been able to find in documentation whether or not
> > krb5_cc_initialize() is thread safe? From a practical point of view, I
> > believe it is not. But a call to "krb5_is_thread_safe() returns true".
>
> krb5_is_thread_safe() just reports whether the library was compiled with
> mutex calls.
>
> krb5_cc_initialize() is thread-safe to the extent that it shouldn't
> exhibit memory errors if called in multiple threads with the same
> krb5_ccache (and different krb5_contexts, as krb5_context is not a
> thread-protected object). It is inherently not very concurrency-safe,
> whether the concurrency is due to threads or processes, because at the
> end of the call the cache is empty (no TGT) and therefore not usable to
> obtain service tickets. (Well, "inherently" according to the
> traditional semantics.
> Heimdal has adopted a behavior change to
> krb5_cc_initialize() where initializing a cache handle isn't externally
> visible until the first non-config credential is stored. We didn't go
> that route.)
>
> We did add a way in 1.20 to atomically replace a credentials cache of
> most types: krb5_cc_move() a MEMORY cache onto the target. This method
> is used automatically by gss_store_cred()/gss_store_cred_into(), and by
> krb5_get_init_creds_*() when an output cache is specified using
> krb5_get_init_creds_opt_set_out_ccache(). The commit that added this
> behavior is:
>
> https://github.com/krb5/krb5/commit/371f09d4bf4ca0c7ba15c5ef909bc35307ed9cc3
>
> > In NFS's gssd implementation, while trying to refresh credentials, the
> > code tries to store creds and as part of that calls into
> > krb5_cc_initialize() to initialize the credential cache (of either
> > FILE or MEMORY type). We've observed that if two threads try to update
> > the same cache and thus call into krb5_cc_initialize(), one succeeds
> > and the other fails with "Internal credentials cache error". Is that
> > unexpected?
>
> In this scenario the two threads are probably using separate krb5_ccache
> handles to perform the initialize and store operations, so libkrb5
> locking the handles doesn't prevent the operations from trying to do
> their work at the same time--the same as if two different processes
> running two different programs were trying to initialize and store to
> FILE caches with the same filename. I can see one order of
> sub-operations which would generate that error message, right off the
> bat: we start by unlink()ing the file and then opening it with O_EXCL.
> If these operations are interleaved (T1 unlink, T2 unlink, T1 open, T2
> open), the second open() will return an EEXIST error, which is
> translated to KRB5_FCC_INTERNAL.
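
The interleaving quoted above (T1 unlink, T2 unlink, T1 open, T2 open)
can be replayed deterministically with plain POSIX calls. The following
is only a sketch: it is not libkrb5 code, the demo_* names and the
scratch path are made up for illustration, and the two "threads" are
simulated by issuing their sub-steps in order from one thread.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Sub-step 1 of a hypothetical "initialize": remove any old file.
 * A missing file (ENOENT) is deliberately not treated as an error. */
static void demo_unlink(const char *path)
{
    (void)unlink(path);
}

/* Sub-step 2: create the file exclusively.  Returns 0 on success, or
 * the errno value reported by open() on failure. */
static int demo_open_excl(const char *path)
{
    int fd = open(path, O_CREAT | O_EXCL | O_RDWR, 0600);

    if (fd < 0)
        return errno;
    close(fd);
    return 0;
}

/* Replay "T1 unlink, T2 unlink, T1 open, T2 open" from a single thread
 * and report what each exclusive open saw. */
void demo_replay_interleaving(const char *path, int *t1_err, int *t2_err)
{
    demo_unlink(path);                 /* T1 unlink */
    demo_unlink(path);                 /* T2 unlink */
    *t1_err = demo_open_excl(path);    /* T1 open: creates the file */
    *t2_err = demo_open_excl(path);    /* T2 open: file already exists */
    (void)unlink(path);                /* remove the scratch file */
}
```

Called on a scratch path in a writable directory that nothing else is
touching, the first open should report 0 and the second EEXIST, which
is one of the errno values in Ken's list that the FILE cache code maps
to KRB5_FCC_INTERNAL.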

Does this sound like there might be a (solvable) problem in
krb5_cc_initialize(), so that it could guarantee that multiple threads
can call it simultaneously (on, as you noted, their own pointer to the
krb5_ccache structure that they have gotten from krb5_cc_resolve()) and
both threads would get a non-error return from the function? Or is it
the caller's (like gssd's) responsibility not to call it concurrently?

And I'm also wondering about other krb5_* functions: are they "safe"
under concurrency, or should gssd be doing any krb5_* API call under a
lock?

From kenh at cmf.nrl.navy.mil Thu Feb 20 20:52:48 2025
From: kenh at cmf.nrl.navy.mil (Ken Hornstein)
Date: Thu, 20 Feb 2025 20:52:48 -0500
Subject: is krb5_cc_initialize() thread safe
In-Reply-To:
References: <202502202321.51KNLi7e002896@hedwig.cmf.nrl.navy.mil>
Message-ID: <202502210152.51L1qn3g003960@hedwig.cmf.nrl.navy.mil>

>In my testing I've had gssd set up to use the "default" ccache type,
>which is FILE. I haven't tried whether setting it to use_memory
>(switching to MEMORY) works better. But regardless, gssd needs to do
>something "better" for the case of FILE credential type and I'm trying
>to figure out what that should be.

Greg does bring up the larger meta-issue that you're apparently trying
to have two threads call krb5_cc_initialize() on the same FILE
credential cache; what, exactly, are you trying to accomplish there?

--Ken

From kenh at cmf.nrl.navy.mil Thu Feb 20 22:25:39 2025
From: kenh at cmf.nrl.navy.mil (Ken Hornstein)
Date: Thu, 20 Feb 2025 22:25:39 -0500
Subject: is krb5_cc_initialize() thread safe
In-Reply-To:
References: <202502202321.51KNLi7e002896@hedwig.cmf.nrl.navy.mil> <202502210152.51L1qn3g003960@hedwig.cmf.nrl.navy.mil>
Message-ID: <202502210325.51L3PdlO004564@hedwig.cmf.nrl.navy.mil>

>> Greg does bring up the larger meta-issue that you're apparently trying
>> to have two threads call krb5_cc_initialize() on the same FILE
>> credential cache; what, exactly, are you trying to accomplish there?
>
>NFS gssd service is multithreaded (has been for a while now). And at
>some point we've allowed multiple upcalls for the same UID (leading to
>the upcalls looking/working on the same credential cache) and thus the
>problem that krb5_cc_initialize() is called by 2 threads. It was
>assumed that Kerberos libraries are "thread-safe".

I think you're missing Greg's point; krb5_cc_initialize() wipes out the
credential cache completely and makes it non-usable. That's what he
meant by it being thread safe but not concurrency safe. If one upcall
stored credentials another thread would wipe those out with a call to
krb5_cc_initialize(). I'm unclear what exactly you expect to happen
in this situation.

--Ken

From aglo at umich.edu Thu Feb 20 21:08:13 2025
From: aglo at umich.edu (Olga Kornievskaia)
Date: Thu, 20 Feb 2025 21:08:13 -0500
Subject: is krb5_cc_initialize() thread safe
In-Reply-To: <202502210152.51L1qn3g003960@hedwig.cmf.nrl.navy.mil>
References: <202502202321.51KNLi7e002896@hedwig.cmf.nrl.navy.mil> <202502210152.51L1qn3g003960@hedwig.cmf.nrl.navy.mil>
Message-ID:

On Thu, Feb 20, 2025 at 8:52 PM Ken Hornstein wrote:
>
> >In my testing I've had gssd set up to use the "default" ccache type,
> >which is FILE. I haven't tried whether setting it to use_memory
> >(switching to MEMORY) works better. But regardless, gssd needs to do
> >something "better" for the case of FILE credential type and I'm trying
> >to figure out what that should be.
>
> Greg does bring up the larger meta-issue that you're apparently trying
> to have two threads call krb5_cc_initialize() on the same FILE
> credential cache; what, exactly, are you trying to accomplish there?

NFS gssd service is multithreaded (has been for a while now). And at
some point we've allowed multiple upcalls for the same UID (leading to
the upcalls looking/working on the same credential cache) and thus the
problem that krb5_cc_initialize() is called by 2 threads. It was
assumed that Kerberos libraries are "thread-safe".
>
> --Ken

From aglo at umich.edu Fri Feb 21 09:47:26 2025
From: aglo at umich.edu (Olga Kornievskaia)
Date: Fri, 21 Feb 2025 09:47:26 -0500
Subject: is krb5_cc_initialize() thread safe
In-Reply-To: <202502210325.51L3PdlO004564@hedwig.cmf.nrl.navy.mil>
References: <202502202321.51KNLi7e002896@hedwig.cmf.nrl.navy.mil> <202502210152.51L1qn3g003960@hedwig.cmf.nrl.navy.mil> <202502210325.51L3PdlO004564@hedwig.cmf.nrl.navy.mil>
Message-ID:

On Thu, Feb 20, 2025 at 10:25 PM Ken Hornstein wrote:
>
> >> Greg does bring up the larger meta-issue that you're apparently trying
> >> to have two threads call krb5_cc_initialize() on the same FILE
> >> credential cache; what, exactly, are you trying to accomplish there?
> >
> >NFS gssd service is multithreaded (has been for a while now). And at
> >some point we've allowed multiple upcalls for the same UID (leading to
> >the upcalls looking/working on the same credential cache) and thus the
> >problem that krb5_cc_initialize() is called by 2 threads. It was
> >assumed that Kerberos libraries are "thread-safe".
>
> I think you're missing Greg's point; krb5_cc_initialize() wipes out the
> credential cache completely and makes it non-usable. That's what he
> meant by it being thread safe but not concurrency safe. If one upcall
> stored credentials another thread would wipe those out with a call to
> krb5_cc_initialize(). I'm unclear what exactly you expect to happen
> in this situation.

Imagine if there are 2 parallel kinit's. It doesn't matter that the 2nd
one will wipe out the creds, there will be a set of creds in the end. I
expect krb5_cc_initialize() to not fail. Yes, I can see that some
conditions (such as memory allocation failure) can lead to one thread
successfully completing while the other fails. But memory issues are
not in play here.
>
> --Ken

From ghudson at mit.edu Sat Feb 22 03:04:24 2025
From: ghudson at mit.edu (Greg Hudson)
Date: Sat, 22 Feb 2025 03:04:24 -0500
Subject: is krb5_cc_initialize() thread safe
In-Reply-To:
References: <87a5fae5-4933-430c-b8d2-ed62f0530386@mit.edu>
Message-ID: <9d7a1726-4f53-4d2f-9b4e-5b859f9d1a82@mit.edu>

On 2/20/25 19:41, Olga Kornievskaia wrote:
> Does this sound like there might be a (solvable) problem in
> krb5_cc_initialize(), so that it could guarantee that multiple threads
> can call it simultaneously (on, as you noted, their own pointer to the
> krb5_ccache structure that they have gotten from krb5_cc_resolve()) and
> both threads would get a non-error return from the function?

Maybe, but (1) it would be awkward, and (2) there are better solutions
on the gssd end.

Elaborating on (1): POSIX allows you to advisory-lock a file but not a
pathname. krb5_cc_initialize() wants to replace the cache file, not
truncate it, so that it doesn't invalidate iterations over the previous
generation of the cache. Without adding a second persistent filesystem
artifact to hang advisory locks onto, you can't lock around a file
replacement. So we would be trying to work around races like:

    T1: unlink
    T1: open-exclusive (success)
    T1: lock new file
    T2: unlink
    T1: writes header (to a no-longer-linked file), unlocks, and returns
    T1 caller: thinks initialization is done, but the file doesn't exist!
    T2: locks file, writes header, unlocks file, and returns

Elaborating on (2): gssd does krb5_get_init_creds_keytab() to get a TGT
into memory, then krb5_cc_resolve(), krb5_cc_initialize(), and
krb5_cc_store_cred(), similar to how kinit worked about fifteen years
ago. According to the historical semantics of the ccache interface,
there is inherently a window between the last two steps of this sequence
where the cache exists but is useless because it contains no TGT.

The sequence used by kinit -k since krb5-1.8 is: krb5_cc_resolve(),
krb5_get_init_creds_opt_set_out_ccache(), krb5_get_init_creds_keytab().
(No krb5_cc_initialize() or krb5_cc_store_cred() calls; that is done by
krb5_get_init_creds_keytab().) In krb5-1.20 or later, that will
automatically use the atomic cache replacement mechanism I described in
my first reply. This method also correctly saves some FAST-related
cache config entries that otherwise get lost.

It might also be possible to use gss_acquire_cred_from() and client
keytab support to get rid of the whole process of explicitly acquiring
credentials. But that would be a more expansive change.

> And I'm also wondering about other krb5_* functions: are they "safe"
> under concurrency, or should gssd be doing any krb5_* API call under a
> lock?

I don't think you need to embark on a project of adding mutex locking
around libkrb5 API calls. This is a very specific concurrency issue
arising from the historic semantics of the libkrb5 credential cache API.
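
For reference, the kinit -k style sequence described above
(krb5_cc_resolve(), krb5_get_init_creds_opt_set_out_ccache(),
krb5_get_init_creds_keytab()) might look roughly like the following
against the public libkrb5 API. This is a sketch only, not gssd or
kinit code: the helper name refresh_ccache_from_keytab() and its
parameters are invented for illustration, error reporting is reduced to
returning the krb5_error_code, and it assumes the caller gives each
thread its own krb5_context.

```c
#include <string.h>
#include <krb5.h>

/*
 * Hypothetical helper: obtain a TGT from a keytab and let libkrb5 write
 * it to the named ccache.  There is no explicit krb5_cc_initialize() or
 * krb5_cc_store_cred() call; the output-ccache option asks
 * krb5_get_init_creds_keytab() to populate the cache itself.
 */
krb5_error_code
refresh_ccache_from_keytab(krb5_context ctx, const char *princ_name,
                           const char *keytab_name, const char *ccache_name)
{
    krb5_error_code ret;
    krb5_principal princ = NULL;
    krb5_keytab kt = NULL;
    krb5_ccache cc = NULL;
    krb5_get_init_creds_opt *opt = NULL;
    krb5_creds creds;

    memset(&creds, 0, sizeof(creds));

    ret = krb5_parse_name(ctx, princ_name, &princ);
    if (ret)
        goto cleanup;
    ret = krb5_kt_resolve(ctx, keytab_name, &kt);
    if (ret)
        goto cleanup;
    ret = krb5_cc_resolve(ctx, ccache_name, &cc);
    if (ret)
        goto cleanup;
    ret = krb5_get_init_creds_opt_alloc(ctx, &opt);
    if (ret)
        goto cleanup;

    /* Name the cache that the get_init_creds call should fill in. */
    ret = krb5_get_init_creds_opt_set_out_ccache(ctx, opt, cc);
    if (ret)
        goto cleanup;

    /* Start time 0 and a NULL service name request an ordinary TGT. */
    ret = krb5_get_init_creds_keytab(ctx, &creds, princ, kt, 0, NULL, opt);

cleanup:
    krb5_free_cred_contents(ctx, &creds);
    if (opt != NULL)
        krb5_get_init_creds_opt_free(ctx, opt);
    if (cc != NULL)
        krb5_cc_close(ctx, cc);
    if (kt != NULL)
        krb5_kt_close(ctx, kt);
    krb5_free_principal(ctx, princ);
    return ret;
}
```

A caller would pass something like a principal name, a "FILE:" keytab
name, and a "FILE:" ccache name; those values are site-specific and are
not taken from this thread. Whether the cache is replaced atomically
depends on the libkrb5 version, per the 1.20 note above.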