Assertion failed for krb5kdc

Thu Oct 8 04:05:15 EDT 2009

On Oct 8, 2009, at 02:19, Mohammad, Meraj wrote:
> Kerberos 5 release 1.7". I am always getting assertion failure and
> program is aborted.
> I am not getting a stack trace and i have no idea, how to get stack
> trace.

Do you know how to use gdb?
Something like this sequence of commands should work:

At the shell prompt: gdb /path/to/krb5kdc
At the (gdb) prompt: run -n     # <- "-n" tells it not to fork into background

After it hits the assertion failure you should get another "(gdb)"  
prompt.  Run the gdb command "bt" (for "backtrace") and send the  
results.

If it doesn't hit the assertion failure when run this way, try "set  
args" (that tells gdb to forget about the "-n" command line argument,  
as "run" without any extra arguments will re-run the program with the  
arguments you used the time before) and then "run" and see if that  
triggers the problem.

Alternatively, if you got a file named "core" in the directory where  
you started the KDC, then "gdb /path/to/krb5kdc /path/to/core" will  
let you examine the memory image of the dead process, so you can run  
the "bt" command without having to start up a new KDC process under  
the debugger.

If you don't have gdb installed and don't know how to get it  
installed, you can try a debugger shipped with Solaris, but I don't  
remember offhand which one Solaris 9 might ship with...

Based on the message, my guess is a bug in the library initialization  
code, perhaps triggered by a broken version of pthread_once.  Our  
library jumps through some hoops to try to work with both single- 
threaded and multi-threaded programs (more precisely, programs built  
as multi-threaded programs, which get the thread support library  
linked in, and programs which are built as single-threaded programs,  
which don't) without requiring two different versions of the library.   
However, older versions of Solaris included some broken dummy versions  
of thread-support functions which sometimes made it hard to figure out  
which mode was in use.  It may be that some of the tests written to  
detect the Solaris stub versions got broken in 1.7 (or even earlier);  
since MIT's Kerberos group has been using Solaris 10 for testing for  
quite a while, this could've been overlooked for a long time.  (That's  
the sort of thing beta testing is supposed to help uncover, isn't  
it?)  If my guess is right, it probably wouldn't even be hard to fix,  
but the code has grown rather baroque and may take some time to  
understand first...

Ken