krb524 stops working (MIT 1.5.4 + 2007-005/6 patches + fakeka patch)
John Tang Boyland
boyland at cs.uwm.edu
Fri Nov 9 17:30:57 EST 2007
] On Nov 9, 2007, at 11:14, John Tang Boyland wrote:
] > However, every once in a while, the krb524d stops responding to
] > requests. "ps augx" says:
] >
] > USER PID %CPU %MEM SZ RSS TT S START TIME COMMAND
] > root 12025 1.2 63.5184632156496 ? S Oct 15 382:56 /opt/local/sbin/krb524d -m
] >
] > Last time this happened (apparently October 15th), I killed it and
] > started it up again, and it worked just fine. But now it stopped
] > working again.
]
] That's not something we've seen before... when it's in this state,
] could you try running strace on the process (for Linux, truss for
] Solaris, etc) while you're sending requests it's not responding to?
It's Solaris. I tried "truss -p 12025" and nothing happens when I run
the client (an old version of aklog) that tries to contact it.
aklog times out, with error:
...
Kerberos error code returned by get_cred: -1765328228
aklog: Couldn't get cs.uwm.edu AFS tickets:
aklog: Cannot contact any KDC for requested realm while getting AFS tickets
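For reference, the numeric code and the text message above appear to be the same error. A minimal sketch of the arithmetic, assuming the MIT krb5 error table starts at -1765328384 (the value of KRB5KDC_ERR_NONE in krb5.h); the function name is made up for illustration:

```python
# Assumed base of the MIT krb5 com_err error table (KRB5KDC_ERR_NONE).
KRB5_ERROR_TABLE_BASE = -1765328384


def krb5_error_offset(code: int) -> int:
    """Return the position of a krb5 error code within its error table."""
    return code - KRB5_ERROR_TABLE_BASE


# The code aklog printed from get_cred.
offset = krb5_error_offset(-1765328228)
print(offset)  # 156
```

As far as I can tell from krb5.h, offset 156 is KRB5_KDC_UNREACH, which matches the "Cannot contact any KDC for requested realm" text.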
On the other hand, if I "truss -p" the fakeka process, I get lots of
appropriate debugging information. (Here I tested using "klog".)
] If you've built it with debug info, attaching the process under gdb
] to get a stack trace may help too, if strace doesn't show any
] activity.
It wasn't built with debugging, but gdb did at least show:
...
Symbols already loaded for /usr/lib/libthread.so.1
Symbols already loaded for /usr/lib/nss_files.so.1
Symbols already loaded for /opt/local/lib/krb5/plugins/kdb/db2.so
0xfef9b75c in __lwp_sema_wait ()
(gdb) where
#0 0xfef9b75c in __lwp_sema_wait ()
#1 0xfee89830 in _park ()
#2 0xfee89508 in _swtch ()
#3 0xfee8d960 in _reap_wait ()
#4 0xfee8d6b8 in _reaper ()
This is an ancient version of gdb, I'm afraid.
] Monitoring traffic to krb524d with tcpdump (ideally, recording the
] full packets) might be handy if you catch it at some point while it's
] growing massively like this. (Though on the off chance that someone
] has come up with a krb524d deathgram packet, please report the info
] to krbcore-security at mit so we could try to get a fix out before the
] means of killing it becomes too widespread.)
I'll kill the process now and restart it.
I'll see if I can catch krb524d when it's on a growth spurt.
(and perhaps it is gradual.)
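One way to tell a sudden growth spurt from gradual growth would be to log the process size at intervals. This is only a rough sketch, assuming Python 3.7 or later is available on the host and that "ps -o pid,vsz,rss -p PID" prints sizes in kilobytes; the function names, the log file name, and the ten-minute interval are all hypothetical choices:

```python
import subprocess
import time


def parse_ps_output(text):
    """Parse `ps -o pid,vsz,rss -p PID` output into (pid, vsz_kb, rss_kb)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) < 2:
        # Only the header line (or nothing) came back.
        raise ValueError("process not found in ps output")
    pid, vsz, rss = lines[1].split()[:3]
    return int(pid), int(vsz), int(rss)


def sample(pid):
    """Run ps once for the given PID and return (pid, vsz_kb, rss_kb)."""
    out = subprocess.run(
        ["ps", "-o", "pid,vsz,rss", "-p", str(pid)],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_ps_output(out)


def watch(pid, interval_s=600, logfile="krb524d-mem.log"):
    """Append a timestamped size sample to logfile every interval_s seconds."""
    while True:
        _, vsz, rss = sample(pid)
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        with open(logfile, "a") as fh:
            fh.write(f"{stamp} vsz={vsz} rss={rss}\n")
        time.sleep(interval_s)
```

Calling watch(20757) against the restarted daemon would leave a log showing whether the size creeps up steadily or jumps, and the timestamps could then be lined up with a tcpdump capture.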
...
OK, it's sane again now that I restarted it.
USER PID %CPU %MEM SZ RSS TT S START TIME COMMAND
root 20757 0.0 0.8 3704 1792 ? S 16:25:07 0:00 /opt/local/sbin/krb524d -m
More in a week?
Thanks,
John