[kerberos-discuss] LDAP & Kerberos interaction & SIGPIPE
Tom Yu
tlyu at MIT.EDU
Fri Sep 25 12:50:38 EDT 2009
Peter Shoults <Peter.Shoults at Sun.COM> writes:
> I am sending this out again as the customer has tested a fix I provided
> and determined that it does resolve their issue. I would like to move
> forward with a fix, and the one I am using now is the one mentioned
> below with the addition of signal(). If I can get some comments, I
> would appreciate it. Otherwise - I guess I will just proceed to put
> this into Solaris code.
>
> Pete
Is there a reason to not use sigaction() here if possible? That
should make things more portable than calling signal().
> On 09/16/09 10:48, Peter Shoults wrote:
>> Hi,
>>
>> Customer has brought forward an issue they were having with Kerberos and
>> LDAP, where LDAP is being used to store the database information for
>> Kerberos. The issue is that if the LDAP server is restarted for any
>> reason, then Kerberos does not automatically resync back with the LDAP
>> server when the LDAP server is back up and running. Specifically, one
>> can run and log into kadmin, but any commands that are run will fail
>> with the error:
>>
>> "Communication failure with server while retrieving list."
>>
>> It turns out that if the user exits kadmin and logs back in a second
>> time, the commands work fine.
>>
>> I have determined that the cause of this problem is that when the LDAP
>> server is restarted, all the connections we have on port 636 to the LDAP
>> server go into a CLOSE_WAIT/FIN_WAIT_2 state. When we log into kadmin,
>> we attempt to contact the LDAP server on these connections, and we
>> receive SIGPIPE in response to our writes. Here is a snippet from truss:
>>
>> 3200/1: 57.2401 write(14, 0x0010B810, 23)
>> Err#32 EPIPE
>> 3200/1: 150301\012941A 60F Y P87A7BE9318B6
>> c8C |0F v
>> 3200/1: 57.2404 Received signal #13, SIGPIPE [caught]
>>
>> This is fine - the sig_pipe handler is invoked and we do print out the
>> syslog message. However, we never reset the signal disposition for
>> SIGPIPE. The kadmind process immediately proceeds to try the next
>> connection to the LDAP server, and again gets SIGPIPE. This time
>> though, the default handler is invoked, which terminates kadmind. At
>> this point, SMF realizes kadmind has died and restarts it, which
>> re-establishes all our connections to the LDAP server and that explains
>> why a subsequent login to kadmin will work.
>>
>> I have two questions about this. The first is why we have a handler for
>> SIGPIPE in the kadmin code, unlike the krb5kdc code, which sets the
>> SIGPIPE disposition to SIG_IGN. This handler in the kadmin code has not
>> changed in a long time. I tested setting SIGPIPE to SIG_IGN, and
>> this does allow a user to log into kadmin after the LDAP server
>> restarts and run commands without issue.
>>
>> Assuming we have the SIGPIPE handler specifically to output the syslog
>> message, I propose that the handler reset the signal disposition
>> to sig_pipe. I have also tested this fix and verified that it
>> resolves the problem and allows the user to enter kadmin commands
>> after the LDAP server restarts. Here is my change:
>>
>> file modified is ovsec_kadmd.c
>>
>> void
>> sig_pipe(int unused)
>> {
>> + signal(SIGPIPE, sig_pipe);
>> krb5_klog_syslog(LOG_NOTICE, gettext("Warning: Received a SIGPIPE; "
>> "probably a client aborted. Continuing."));
>> }
>>
>>
>> Pete
>>
>>