Error while compiling krb 1.5

Fri Jul 7 02:56:00 EDT 2006

> From: Sam Hartman <hartmans at mit.edu>
> To: Marcus Watts <mdw at umich.edu>
> Cc: krbdev at mit.edu
> Subject: Re: Error while compiling krb 1.5
> References: <200607062217.SAA09149 at quince.ifs.umich.edu>
> Date: Thu, 06 Jul 2006 18:37:40 -0400
> 
> >>>>> "Marcus" == Marcus Watts <mdw at umich.edu> writes:
> 
>     Marcus> 	/3/ no "--enable-dynamic".  The user does not want .so
>     Marcus> files.  But he must still want/need module support else
>     Marcus> why 1.5?
> 
>     Marcus> 	So, build libkrb5_pic.a, and link modules using
>     Marcus> "-lkrb5_pic".
> 
>     Marcus> 	You're probably loading in redundant copies of
>     Marcus> routines at this point - but if they can't see each other
>     Marcus> that "should" be mostly harmless, except perhaps to the
>     Marcus> sanity of anybody unlucky enough to be looking for bugs.
> 
> No, mutex state and other globals in k5support need to be shared.

I never said the /3/ case of "two loaded library instances" wasn't at
least as ungainly as Dr. Dolittle's Pushmi-Pullyu.  It *might* work
anyways for kdb replacement modules - but it is ugly in the more
abstract sense.  I prefer /2/ over this, or /1/.

I believe your sharing problem is mostly not an issue if the two loaded
library instances can't "see" each other.  If neither instance exports
its names to the other (mutex state and globals included), there is
nothing for them to share.

For instance, there's a kt_type list in lib/krb5/keytab/ktbase.c that's
protected by a "global" mutex.  If you're loading in two separate
copies of those libraries, each with their own instance of a kt_type
list, there's no reason for them to share a mutex - and at that point
every reason for the two instances *not* to see each other.
You don't need to share state if you don't share state.
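
A minimal sketch of that point, assuming each loaded copy carries its
own list and its own lock.  The names here (struct registry,
registry_add, registry_has) are hypothetical stand-ins and are not the
code in ktbase.c; the only claim is that a lock guarding a private list
has no reason to be visible outside the copy that owns it.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one loaded copy of a library's type list. */
struct type_entry {
    const char *prefix;
    struct type_entry *next;
};

struct registry {
    pthread_mutex_t lock;        /* guards only this copy's list */
    struct type_entry *head;
};

/* Returns 0 on success, like pthread_mutex_init(). */
int registry_init(struct registry *r)
{
    r->head = NULL;
    return pthread_mutex_init(&r->lock, NULL);
}

/* Push a new entry onto this copy's list; returns 0 on success. */
int registry_add(struct registry *r, const char *prefix)
{
    struct type_entry *e = malloc(sizeof *e);
    if (e == NULL)
        return -1;
    e->prefix = prefix;
    pthread_mutex_lock(&r->lock);
    e->next = r->head;
    r->head = e;
    pthread_mutex_unlock(&r->lock);
    return 0;
}

/* Returns 1 if this copy's list holds the prefix, else 0. */
int registry_has(struct registry *r, const char *prefix)
{
    int found = 0;
    pthread_mutex_lock(&r->lock);
    for (struct type_entry *e = r->head; e != NULL; e = e->next) {
        if (strcmp(e->prefix, prefix) == 0) {
            found = 1;
            break;
        }
    }
    pthread_mutex_unlock(&r->lock);
    return found;
}
```

Two struct registry values behave like two loaded copies: an entry
added to one is never found in the other, and neither ever takes the
other's lock.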

Where this gets a bit more interesting is when you start looking at
things like keytabs, credentials caches, and error tables, because
there you do expect to see file contention or application visible
sharing.  Even there it's not always necessary.  The credentials cache,
for instance, also uses file locking to handle sharing from other
processes, and will probably handle inter-thread locking as well.
Using the filename and reference counting to handle sharing is kinda
bogus anyways (what if somebody renames or unlinks a live credentials
cache?).  As for the error table stuff, um, when you start trying to
link applications against k5 & afs, let's just say you learn to enjoy
pain.

I don't think option /3/ is actually a particularly good idea
for the kdc & kdb.  But I think the issues it raises are worth
exploring anyways, because this is likely to be something people
building perl modules (for instance) might want to do.
For an extreme case, consider perl module (a) linked against
Heimdal, vs. perl module (b) linked against MIT k5.
Or, there are library version issues.  There shouldn't be, but...

It's funny you should happen to mention mutex state in k5support.
We've got one project that wants to stay with kerberos 1.3.5 because in
1.4.3, that "sharing" code you mention in k5support causes a
very interesting hang.  This is for a perl module which actually does
something useful, but here's a simple-minded test program that
behaves the same:  [ I did this with glibc 2.3.2 and mit k5 1.4.3 built
*without* --with-system-et ]
	cc -o m1 /afs/umich.edu/user/m/d/mdw/source/phang/src/m1.c -ldl
	cc -shared -o t9.so /afs/umich.edu/user/m/d/mdw/source/phang/src/t9.c -Ipath-to-your-kerberos-includes -Lpath-to-your-kerberos-libs -Wl,-rpath,path-to-your-kerberos-libs -lcom_err -lpthread
(You can omit the -I, -L, and -Wl options if Kerberos is part of the system.)
	./m1 -n subr ./t9.so 
output should be:
	About to create a thread
	hi there!  I am thread.
	wait for the thread to die.
	the thread returned 19
	Done with threading...
<hangs here>
Substituting
	/afs/umich.edu/user/m/d/mdw/source/phang/src/t8.c
for t9.c above will abort instead of hanging - the unknown error
buffer key isn't allocated until the first error table is registered.
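
For illustration only, here is one common way to avoid depending on
registration order: create the thread-specific key lazily with
pthread_once() the first time the buffer is requested.  This is a
sketch under assumptions, not the com_err code; unknown_error_buffer,
UNKNOWN_BUF_SIZE and the other names are hypothetical.

```c
#include <pthread.h>
#include <stdlib.h>

#define UNKNOWN_BUF_SIZE 64      /* hypothetical size for the sketch */

static pthread_key_t unknown_buf_key;
static pthread_once_t unknown_buf_once = PTHREAD_ONCE_INIT;
static int unknown_buf_key_ok = 0;

/* Runs exactly once, whichever thread asks for the buffer first. */
static void make_unknown_buf_key(void)
{
    /* free() releases a thread's buffer when that thread exits. */
    unknown_buf_key_ok = (pthread_key_create(&unknown_buf_key, free) == 0);
}

/* Returns the calling thread's scratch buffer, or NULL on failure. */
char *unknown_error_buffer(void)
{
    char *buf;

    pthread_once(&unknown_buf_once, make_unknown_buf_key);
    if (!unknown_buf_key_ok)
        return NULL;

    buf = pthread_getspecific(unknown_buf_key);
    if (buf == NULL) {
        buf = calloc(1, UNKNOWN_BUF_SIZE);
        if (buf != NULL && pthread_setspecific(unknown_buf_key, buf) != 0) {
            free(buf);
            buf = NULL;
        }
    }
    return buf;
}
```

With this shape the key exists by the time anything looks it up, so a
lookup that happens before any error table is registered gets a buffer
rather than an unallocated key.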

The t9.c case is much more interesting however - turns out it hangs in
the k5support code trying to invoke the pthread manager which no longer
exists.  The problem seems to be that the atexit code installed by
libpthread gets run before the .fini function contained in k5support -
and the pthread library calls don't handle that.  Easy enough bug to
fix - in glibc.  Or we could build perl with -Dusethreads, and dodge
the bullet.  It would be nice to find a solution that doesn't involve
freaking out our production people however.
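
A small sketch of that ordering, assuming GCC or Clang (for the
destructor attribute) and a POSIX system.  The function names are
hypothetical: threading_teardown stands in for the cleanup a threading
library registers with atexit(), and support_fini stands in for a
library's .fini routine.  A forked child exits normally and reports,
through a pipe, the order in which the two hooks ran.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write end of the pipe; set only in the forked child. */
static int report_fd = -1;

/* Stands in for cleanup that a threading library registers via atexit(). */
static void threading_teardown(void)
{
    if (report_fd >= 0 && write(report_fd, "A", 1) != 1) {
        /* nothing useful to do about a failed write at exit */
    }
}

/* Stands in for a library's .fini routine. */
__attribute__((destructor))
static void support_fini(void)
{
    if (report_fd >= 0 && write(report_fd, "F", 1) != 1) {
        /* nothing useful to do about a failed write at exit */
    }
}

/*
 * Fork a child that exits normally, and collect the order in which the
 * two hooks ran as a NUL-terminated string in out (needs >= 3 bytes).
 * Returns 0 on success, -1 on failure.
 */
int exit_order(char *out, size_t outlen)
{
    int fds[2];
    pid_t pid;
    size_t n = 0;
    ssize_t r;

    if (outlen < 3 || pipe(fds) != 0)
        return -1;
    fflush(NULL);                /* don't let the child re-flush our stdio */
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        report_fd = fds[1];
        atexit(threading_teardown);
        exit(0);                 /* runs atexit handlers, then destructors */
    }
    close(fds[1]);
    while (n < outlen - 1 &&
           (r = read(fds[0], out + n, outlen - 1 - n)) > 0)
        n += (size_t)r;
    out[n] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return 0;
}
```

Calling exit_order(buf, sizeof buf) with a small char array should
leave "AF" in buf: the atexit handler runs first and the destructor
second.  A .fini routine that still needs the facility the atexit
handler tore down is then in the position described above.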

					-Marcus