kinit locking up

Ken Raeburn raeburn at MIT.EDU
Thu Dec 15 03:04:07 EST 2005


On Dec 14, 2005, at 22:39, Russ Allbery wrote:
> Below is the interesting bit.  valgrind at least thinks that the mutexes
> aren't fully initialized, it looks like, before they're passed into the
> pthread functions.  Note that I don't get these same errors when I run
> valgrind against the kinit in unstable, so I'm curious about what's
> different in your situation.
>
> One thing that I did notice was a bunch of SASL calls earlier on.  I think
> the invalid reads there are probably just the standard ld.so noise that
> doesn't appear to mean anything, but their presence indicates that you're
> using some nsswitch module (I'm guessing) that uses SASL.  Are you using
> LDAP to do UID lookups, by any chance?
>
> I don't know why that would affect things, but maybe it is.

Aha!  I think you may have put your finger on it, Russ.

Part of our approach to dealing with threads has been, when possible,  
to give the application writer the option of not linking in the  
pthread library, instead using weak references at run time to see if  
the library has been loaded.  We want programs to be able to  
dynamically load and unload the Kerberos and GSSAPI libraries without  
ill effects, and the general wisdom seems to be that dynamically  
loading a pthread library into a single-threaded program can do Bad  
Things, like cause different versions of malloc or other functions to  
be seen, when they may not be fully compatible with the versions  
already in use by the program.
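
A minimal sketch of the weak-reference check described above, assuming a GCC- or Clang-style toolchain that supports `#pragma weak`. The function name `demo_pthreads_loaded` is hypothetical and chosen for illustration; this is not the actual Kerberos source.

```c
#include <pthread.h>
#include <stddef.h>

/*
 * Mark the pthread entry points as weak references (a GCC/Clang
 * extension).  If no loaded library defines them, their addresses
 * resolve to NULL instead of producing a link error, so the program
 * does not need to link against the pthread library itself.
 */
#pragma weak pthread_mutex_init
#pragma weak pthread_mutex_lock
#pragma weak pthread_mutex_unlock

/*
 * Hypothetical helper: returns 1 if the pthread mutex functions are
 * available in this process right now, 0 otherwise.
 */
int demo_pthreads_loaded(void)
{
    return pthread_mutex_init != NULL
        && pthread_mutex_lock != NULL
        && pthread_mutex_unlock != NULL;
}
```

A caller would test `demo_pthreads_loaded()` before touching any pthread function. On systems where these functions live in the C library itself, the check simply always returns 1.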

So now we have another library being loaded somewhere along the way
which causes the pthread library to be loaded (libnss_ldap ->
libldap_r -> libpthread, on my system), and once the LDAP code is
loaded, the Kerberos library is calling pthread_mutex_lock on a
structure which it hasn't initialized.

The strange bit is, in 1.4.3, the Kerberos code should be checking  
for the thread support once, and saving away a flag; if it wasn't  
loaded at mutex initialization time, we shouldn't be calling  
pthread_mutex_lock.  If we call pthread_mutex_lock, we should've  
decided earlier to call pthread_mutex_init.  So I should do some  
poking around and see if I can figure out why we still seem to be  
behaving differently in the two cases.
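
The intended "check once and save a flag" behavior can be sketched roughly as follows. All names here (`demo_threads_enabled`, `demo_mutex`, and so on) are hypothetical, the weak references assume GCC/Clang `#pragma weak`, and the first call is assumed to happen while the process is still single-threaded. This is an illustration of the idea, not the 1.4.3 code.

```c
#include <pthread.h>
#include <stddef.h>

#pragma weak pthread_mutex_init
#pragma weak pthread_mutex_lock
#pragma weak pthread_mutex_unlock

/* Hypothetical state: the answer is computed once and then reused. */
static int demo_flag_known;    /* 0 until the first check has run */
static int demo_use_pthreads;  /* result saved by that first check */

/* Returns the saved decision, making it on the first call only. */
int demo_threads_enabled(void)
{
    if (!demo_flag_known) {
        demo_use_pthreads = pthread_mutex_init != NULL
            && pthread_mutex_lock != NULL
            && pthread_mutex_unlock != NULL;
        demo_flag_known = 1;
    }
    return demo_use_pthreads;
}

typedef struct {
    pthread_mutex_t m;
} demo_mutex;

/* Every operation consults the saved flag, never the live symbols. */
int demo_mutex_init(demo_mutex *dm)
{
    return demo_threads_enabled() ? pthread_mutex_init(&dm->m, NULL) : 0;
}

int demo_mutex_lock(demo_mutex *dm)
{
    return demo_threads_enabled() ? pthread_mutex_lock(&dm->m) : 0;
}

int demo_mutex_unlock(demo_mutex *dm)
{
    return demo_threads_enabled() ? pthread_mutex_unlock(&dm->m) : 0;
}
```

Because init and lock both read the same saved flag, a pthread library that shows up later (for example via a dynamically loaded module) cannot cause a lock on a mutex that was never initialized. The failure described in this thread would correspond to the two paths somehow disagreeing.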

In any case, I do think libnss_ldap may have a bug -- I don't think  
libpthread should be getting pulled into a single-threaded program  
dynamically.  Though, if it's guaranteed to be safe with glibc, and  
remain safe in future releases, we could consider a Linux- or
glibc-specific change to punt the weak-reference stuff and always pull in
the pthread library ourselves...

Ken

