[krbdev.mit.edu #2426] SIGPIPE on AIX when sending to non-listening KDC TCP port

Bill Dodd via RT rt-comment at krbdev.mit.edu
Thu Mar 18 16:01:35 EST 2004



MIT build: 1.3.2
Client OS: AIX 4.3.3, 5.1 and 5.2

Problem description:

On AIX, if krb5_sendto_kdc() attempts to use TCP to send a request to
a KDC that is not listening on the TCP port, the process will get a
SIGPIPE and (in the absence of an explicit SIGPIPE handler) the
process will terminate.
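
For readers unfamiliar with the mechanism, here is a minimal sketch of
why the handler matters. It is not part of the patch and uses a pipe
rather than a TCP socket, on the assumption that the signal delivery
works the same way for both; write_to_closed_pipe() is a hypothetical
name used only for illustration.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Hypothetical demo (not from sendto_kdc.c): write one byte to a pipe
 * whose read end has been closed. With SIGPIPE ignored, write() fails
 * with EPIPE instead of the signal terminating the process.
 * Returns the errno from write(), 0 if the write unexpectedly
 * succeeded, or -1 if pipe() itself failed.
 */
static int write_to_closed_pipe(void)
{
    int fds[2];
    int err = 0;

    signal(SIGPIPE, SIG_IGN);   /* the default action would kill us */
    if (pipe(fds) != 0)
        return -1;
    close(fds[0]);              /* no reader is left */
    if (write(fds[1], "x", 1) < 0)
        err = errno;            /* save before close() can change it */
    close(fds[1]);
    return err;
}
```

Ignoring the signal in the application only hides the symptom; the
library would still be writing to a socket whose connect had failed.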

A relatively easy way to reproduce the problem:

Have a KDC configured to listen on UDP only
Configure a 1.3.2 AIX Kerberos client
On the AIX client, set udp_preference_limit = 1 in krb5.conf
Run kinit on the client
kinit should terminate without actually doing anything

This is timing related, so it is not guaranteed to reproduce; it may
depend on machine performance characteristics and network latency. But
it reproduced every time in my tests.

On to the problem:

The problem is that on AIX, with non-blocking TCP streams, after you
do the connect() to a non-listening host/port and then the select(),
there is a window of time in which the select() will indicate that the
fd is available for writing. In this case, service_tcp_fd() goes ahead
and attempts to write to the socket. This results in the SIGPIPE.

I'm not sure why the select() on AIX is indicating that the fd is
available for writing. And in fact, if a longer period of time elapses
between the connect() and the select(), the select() will indicate
that the fd is available for reading. This makes a bit more sense -
there is an error available for reading. Although I just noticed that
the comment in service_tcp_fd() just before the attempt to write says:
"UNIX sets can_write if failed." So, maybe AIX is not weird in this
case after all.

To fix, I did what Stevens does in his connect_nonb() in section 15.4
of UNP, 2nd Ed. If the select() indicates that the fd is available for
writing, call getsockopt(..., SO_ERROR, ...) to see if there is a
pending error.
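
As a standalone illustration of that check (a sketch, not the code in
the patch below), the helper might look like this.
pending_socket_error() is a hypothetical name; the only call it relies
on is the standard getsockopt() with SOL_SOCKET and SO_ERROR.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not part of sendto_kdc.c): after select()
 * reports that a non-blocking socket is writable, ask the kernel
 * whether the connect() actually succeeded.
 * Returns 0 if no error is pending, otherwise an errno value.
 */
static int pending_socket_error(int fd)
{
    int sockerr = 0;
    socklen_t len = sizeof(sockerr);

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &sockerr, &len) != 0)
        return errno;    /* getsockopt itself failed, e.g. EBADF */
    return sockerr;      /* 0 on success, else e.g. ECONNREFUSED */
}
```

A caller would run this once the fd shows up in the write set and only
move on to writing the request if it returns 0.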

My patch is appended below. 

Thanks,
-bill

*** sendto_kdc.c.orig	Fri Dec  5 19:30:42 2003
--- sendto_kdc.c	Thu Mar 18 14:48:25 2004
***************
*** 734,739 ****
  	 * Connect finished -- but did it succeed or fail?
  	 * UNIX sets can_write if failed.
! 	 * Try writing, I guess, and find out.
  	 */
  	conn->state = WRITING;
  	goto try_writing;
--- 734,756 ----
  	 * Connect finished -- but did it succeed or fail?
  	 * UNIX sets can_write if failed.
! 	 * Call getsockopt to see if error pending.
  	 */
+         {
+             int sockerr = 0;
+             socklen_t sockerrlen = sizeof(sockerr);
+             e = getsockopt(conn->fd, SOL_SOCKET, SO_ERROR,
+                            &sockerr, &sockerrlen);
+             if (e) {
+ 		e = SOCKET_ERRNO;
+ 		dprint("getsockopt(SO_ERROR) on write fd failed: %m\n", e);
+ 		goto kill_conn;
+             }
+             if (sockerr) {
+                 e = sockerr;
+ 		dprint("getsockopt(SO_ERROR) on write fd returned error: %m\n",
+                        e);
+ 		goto kill_conn;
+             }
+         }
  	conn->state = WRITING;
  	goto try_writing;

