multithreaded access to the same ccache

Sorin Manolache sorinm at gmail.com
Tue Jun 16 19:33:33 EDT 2015


Hello,

I'm using GSSAPI + Kerberos in a multithreaded application.

I create a ccache of type MEMORY and pass its name to GSSAPI with
gss_krb5_ccache_name.

I'm seeing some segfaults in my application. After a lot of debugging 
I've come to the following hypothesis:

Two threads execute gss_acquire_cred_with_password in parallel. 
gss_acquire_cred_with_password will create a krb5_context in each 
thread. Both contexts will have the same default ccache name, picked 
from the thread-local variable set by gss_krb5_ccache_name.

As the ccache is empty, both threads will make a KDC request in order to 
fetch a credential.

Suppose the first thread finishes first and proceeds to call
gss_export_cred. While the first thread is executing gss_export_cred,
the second thread receives its cred from the KDC and tries to store it
in the shared ccache.

gss_export_cred iterates over the ccache (json_ccache_contents in
lib/gssapi/krb5/export_cred.c). The loop is:

krb5_cc_start_seq_get(context, ccache, &cursor);
while ((ret = krb5_cc_next_cred(...)) == 0) {
     ...
}
krb5_cc_end_seq_get(context, ccache, &cursor);

At the same time, the second thread executes init_creds_step_reply
(lib/krb5/krb/get_in_tkt.c), which in turn calls
krb5_cc_initialize(context, out_ccache, ...). The latter destroys the
contents of the ccache. This may cause a segfault in the first thread,
as there is no mutual exclusion between destroying the cache contents
and iterating over them.

What I'm observing in core files is that 'cursor' in the first thread
has a value that differs from ccache->data->link,
ccache->data->link->next, ccache->data->link->next->next, etc. My only
explanation is that the cache contents were destroyed between the
krb5_cc_start_seq_get and krb5_cc_next_cred calls.
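
To make that check concrete, here is a toy model of it in C. It is not
the krb5 code: toy_cache, toy_link, toy_store, toy_initialize,
toy_start_seq and cursor_reachable are names I made up, and the list
only imitates the link/next layout described above. It shows that once
the list has been torn down, a cursor saved earlier no longer matches
any link reachable from the head.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the memory ccache's linked list. */
typedef struct toy_link {
    struct toy_link *next;
    int cred;                 /* placeholder for a credential */
} toy_link;

typedef struct {
    toy_link *link;           /* head of the list */
} toy_cache;

/* Prepend one entry; returns 0 on success, -1 on allocation failure. */
int toy_store(toy_cache *c, int cred)
{
    toy_link *l;

    assert(c != NULL);
    l = malloc(sizeof(*l));
    if (l == NULL)
        return -1;
    l->cred = cred;
    l->next = c->link;
    c->link = l;
    return 0;
}

/* Models the effect of reinitializing the cache: every entry is freed. */
void toy_initialize(toy_cache *c)
{
    assert(c != NULL);
    while (c->link != NULL) {
        toy_link *next = c->link->next;
        free(c->link);
        c->link = next;
    }
}

/* Models starting an iteration: remember the current head. The value is
 * kept as an integer so it can still be compared after the node it
 * referred to has been freed. */
uintptr_t toy_start_seq(const toy_cache *c)
{
    assert(c != NULL);
    return (uintptr_t)c->link;
}

/* The core-file check: does the saved cursor match any live link? */
int cursor_reachable(const toy_cache *c, uintptr_t cursor)
{
    assert(c != NULL);
    for (const toy_link *l = c->link; l != NULL; l = l->next) {
        if ((uintptr_t)l == cursor)
            return 1;
    }
    return 0;
}
```

Saving the cursor with toy_start_seq, calling toy_initialize, and then
calling cursor_reachable reproduces the state I see in the core files:
the cursor is non-null but matches nothing in the list.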

I've patched the Kerberos code to call krb5_cc_lock before and
krb5_cc_unlock after the loop in json_ccache_contents in export_cred.c.
I have not tested this sufficiently, but the segfault no longer seems
to occur in json_ccache_contents. However, I now get a segfault in
another unlocked ccache traversal, namely in
acquire_init_cred->scan_ccache.
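
The pattern behind that change, shown on a toy list rather than on the
real ccache, is roughly the following. This is only a sketch:
toy_cache, toy_cache_init, toy_reset_and_store and toy_count_locked are
made-up names, and a plain pthread mutex stands in for the ccache lock.
It also assumes that the writer takes the same lock as the reader;
without that, locking the traversal alone would not help.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a memory ccache: a list plus one mutex. */
typedef struct toy_link {
    struct toy_link *next;
} toy_link;

typedef struct {
    pthread_mutex_t lock;   /* guards 'link' and every node behind it */
    toy_link *link;
} toy_cache;

/* Abort in debug builds if a pthread call reports an error. */
static void must(int rc)
{
    assert(rc == 0);
    (void)rc;
}

void toy_cache_init(toy_cache *c)
{
    must(pthread_mutex_init(&c->lock, NULL));
    c->link = NULL;
}

/* Writer side: drop all entries, then store n new ones, all under the
 * lock. Returns 0 on success, -1 if an allocation fails. */
int toy_reset_and_store(toy_cache *c, int n)
{
    int ret = 0;

    must(pthread_mutex_lock(&c->lock));
    while (c->link != NULL) {
        toy_link *next = c->link->next;
        free(c->link);
        c->link = next;
    }
    for (int i = 0; i < n; i++) {
        toy_link *l = malloc(sizeof(*l));
        if (l == NULL) {
            ret = -1;
            break;
        }
        l->next = c->link;
        c->link = l;
    }
    must(pthread_mutex_unlock(&c->lock));
    return ret;
}

/* Reader side: the lock is held from before the first step of the walk
 * until after the last one, so the list cannot change underneath it. */
int toy_count_locked(toy_cache *c)
{
    int count = 0;

    must(pthread_mutex_lock(&c->lock));
    for (toy_link *cursor = c->link; cursor != NULL; cursor = cursor->next)
        count++;
    must(pthread_mutex_unlock(&c->lock));
    return count;
}
```

Every traversal needs the same treatment, which is why fixing one loop
only moved the crash to the next unlocked one.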

Is this hypothesis correct? Is there something I could do at the
application level to avoid this situation? I'm thinking of either
protecting all calls to GSSAPI/Kerberos with mutexes or using
per-thread ccaches, though both approaches would reduce performance.
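
Roughly, the two application-level options would look like the sketch
below. The helper names (with_gss_lock, thread_ccache_name, demo_work)
and the "worker-N" naming scheme are my own invention for illustration.
The actual GSSAPI calls are left out and only indicated in comments.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Option 1: one process-wide mutex around every GSSAPI/Kerberos call. */
static pthread_mutex_t gss_big_lock = PTHREAD_MUTEX_INITIALIZER;

typedef int (*gss_work_fn)(void *arg);   /* hypothetical callback type */

/* Run fn(arg) while holding the global lock and return its result. */
int with_gss_lock(gss_work_fn fn, void *arg)
{
    int rc, result;

    rc = pthread_mutex_lock(&gss_big_lock);
    assert(rc == 0);
    result = fn(arg);   /* e.g. acquire the cred and export it */
    rc = pthread_mutex_unlock(&gss_big_lock);
    assert(rc == 0);
    (void)rc;
    return result;
}

/* Option 2: a distinct MEMORY ccache name per worker thread, so that no
 * two threads ever share a cache. Each thread would pass its own name
 * to gss_krb5_ccache_name before acquiring creds.
 * Returns 0 on success, -1 if buf is too small. */
int thread_ccache_name(char *buf, size_t len, unsigned worker_id)
{
    int n = snprintf(buf, len, "MEMORY:worker-%u", worker_id);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical unit of work; a real one would call into GSSAPI. */
int demo_work(void *arg)
{
    int *calls = arg;

    return ++*calls;
}
```

Option 1 serializes every thread on a single lock, and option 2 means
each thread has to fetch its own credential from the KDC, which is the
performance cost I mentioned.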

Regards,
Sorin



