Multithreading the KDC server - a design

Wed Apr 6 12:23:10 EDT 2005

On Apr 6, 2005, at 00:32, Rahul Srinivas wrote:
> The reason I am talking about version 1.3.5 is that at Novell we are
> working on the 1.3.5 codebase to integrate it with eDirectory. The
> development work started before the release of version 1.4. So, we have
> stuck to version 1.3.5. But, our goal is to port the changes to 
> version 1.4 before submitting the patch.

Okay.

> The thread interface used in the KDC would be an extension of the
> interface available in version 1.4. The following are the additional
> APIs that would be used.
> 1. Thread creation - pthread_create()
> 2. Recursive mutex - code would be added to KDC for implementing this.
>    (Or would it be better to add to the k5-threads.h ?)
> 3. The monitor used for managing the request queue can be implemented
>    using mutexes and condition variables. The thread interface in
>    version 1.4 does not include condition variable operations. This
>    would be added to the KDC code (Or should the thread interface be
>    extended).

I'm not familiar enough with the practical differences between mutexes 
and monitors in real-world apps to be sure, but I suspect monitors 
might be a good addition to the thread shim layer.  Thread creation, 
possibly, also.  Recursive mutexes, I think the jury's still out. :-)
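
For concreteness, a monitor for the request queue built from a plain 
mutex and condition variables might look roughly like the following.  
This is only a sketch against raw pthreads: the names (req_queue, 
rq_init, rq_put, rq_get, RQ_CAP) are invented for illustration and are 
not part of k5-threads.h or the KDC, and a real version would go 
through whatever the shim layer ends up providing.

```c
#include <pthread.h>

#define RQ_CAP 16               /* hypothetical fixed queue capacity */

/* Hypothetical request queue; all fields are protected by 'lock'. */
struct req_queue {
    pthread_mutex_t lock;
    pthread_cond_t nonempty;    /* signalled when an item is added */
    pthread_cond_t nonfull;     /* signalled when an item is removed */
    void *items[RQ_CAP];
    int head, count;
};

/* Returns 0 on success or an error number from pthreads. */
static int rq_init(struct req_queue *q)
{
    int err;

    q->head = q->count = 0;
    err = pthread_mutex_init(&q->lock, NULL);
    if (err)
        return err;
    err = pthread_cond_init(&q->nonempty, NULL);
    if (err) {
        pthread_mutex_destroy(&q->lock);
        return err;
    }
    err = pthread_cond_init(&q->nonfull, NULL);
    if (err) {
        pthread_cond_destroy(&q->nonempty);
        pthread_mutex_destroy(&q->lock);
    }
    return err;
}

/* Called by the listener thread; blocks while the queue is full. */
static void rq_put(struct req_queue *q, void *req)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == RQ_CAP)
        pthread_cond_wait(&q->nonfull, &q->lock);
    q->items[(q->head + q->count) % RQ_CAP] = req;
    q->count++;
    pthread_cond_signal(&q->nonempty);
    pthread_mutex_unlock(&q->lock);
}

/* Called by service threads; blocks while the queue is empty. */
static void *rq_get(struct req_queue *q)
{
    void *req;

    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->nonempty, &q->lock);
    req = q->items[q->head];
    q->head = (q->head + 1) % RQ_CAP;
    q->count--;
    pthread_cond_signal(&q->nonfull);
    pthread_mutex_unlock(&q->lock);
    return req;
}
```

The waits sit in while loops rather than if tests because 
pthread_cond_wait is allowed to wake up spuriously, so the condition 
has to be rechecked with the mutex held.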

> The following is how recursive mutexes would be implemented.
> 1. If PTHREAD_MUTEX_RECURSIVE_NP is defined (it is defined in
>    pthread.h on linux), the pthread_mutex_t member of the k5_mutex_t
>    would be re-initialized to be a recursive mutex. Since pthread.h is
>    present, I am assuming that mutex.os.p exists and is of type
>    pthread_mutex_t.

PTHREAD_MUTEX_RECURSIVE appears to be what's used on Solaris and 
NetBSD, and what's in POSIX.

> In the prototype that I have implemented, the main thread waits for 
> all service threads to exit before exiting itself. So, the problem 
> arising out of a database query hanging does exist. It has to be 
> addressed.

It may be that it's not terribly unsafe to simply exit.  It depends 
on what the database code is doing.  If it's just fetching data, it 
probably doesn't matter.

> 'gethostbyname()' is used in the function krb5_os_localaddr() (in
> lib/krb5/os/localaddr.c).

Only on Windows (where, I believe, it is thread-safe).  FWIW, other 
uses of gethostbyname should all be replaced by getaddrinfo.
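
As a rough illustration of that replacement: getaddrinfo returns its 
results in caller-owned storage instead of a static buffer, which is 
what makes it usable from multiple threads.  The helper name 
first_addr_text below is hypothetical and not anything in the tree; it 
just resolves a name and formats the first address found.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <string.h>

/*
 * Hypothetical helper: resolve 'host' and write its first address, in
 * numeric text form, into buf.  Returns 0 on success, or a nonzero
 * getaddrinfo/getnameinfo error code (see gai_strerror()).
 */
static int first_addr_text(const char *host, char *buf, size_t buflen)
{
    struct addrinfo hints, *res;
    int err;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;        /* accept IPv4 or IPv6 */
    hints.ai_socktype = SOCK_DGRAM;     /* avoid one entry per socket type */

    err = getaddrinfo(host, NULL, &hints, &res);
    if (err)
        return err;
    err = getnameinfo(res->ai_addr, res->ai_addrlen, buf,
                      (socklen_t)buflen, NULL, 0, NI_NUMERICHOST);
    freeaddrinfo(res);                  /* results are caller-owned */
    return err;
}
```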

> In the KDC code, such atomic operations on global variables would be 
> required in a few places if the server is made multithreaded. For 
> example, in accept_tcp_connection(), when tcp_data_counter > 
> max_tcp_data_connections, the following two actions have to be 
> performed atomically.
> 1. looking up 'connections' for the oldest connection (inside
>    accept_tcp_connection())
> 2. deleting it from 'connections' (inside kill_tcp_connection())
> Otherwise, if multiple threads choose the same connection for 
> deletion, one of them could end up with an invalid handler when the 
> other deletes it.
>
>   It is possible to do the same using non-recursive mutexes, but it 
> would need rewriting some of the code to group multiple accesses to 
> global variables into a single function. But I feel that for an 
> initial implementation, adding lock and unlock calls to existing code 
> would be simpler.

Mostly I prefer the rewriting, unless you can simply require that 
certain functions are always called with certain mutexes already held; 
I've done both in various places.  (E.g., kill_tcp_connection may 
require that a mutex already be held.  Or perhaps it locks a mutex and 
calls a second function that does the real work, and a new function 
delete_oldest_tcp_connection calls that same second function after 
locking the same mutex and scanning the timestamps.)  I've even got 
some debugging code which (in some configurations at least) will abort 
if the mutex is not already locked.
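
To make the second arrangement concrete, here is a sketch using a 
plain, non-recursive mutex.  Everything in it is a stand-in: struct 
conn, conn_lock, and the function names are simplified for illustration 
and are not the real KDC data structures or the k5_mutex interface.

```c
#include <pthread.h>
#include <stddef.h>
#include <time.h>

/* Stand-in for the KDC's per-connection state. */
struct conn {
    struct conn *next;
    time_t start_time;
    int fd;
};

static pthread_mutex_t conn_lock = PTHREAD_MUTEX_INITIALIZER;
static struct conn *connections;        /* protected by conn_lock */

/* Caller must already hold conn_lock.  Unlinks c from the list. */
static void kill_connection_locked(struct conn *c)
{
    struct conn **pp;

    for (pp = &connections; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == c) {
            *pp = c->next;
            c->next = NULL;
            return;
        }
    }
}

/* Entry point for callers that do not hold the lock. */
static void kill_connection(struct conn *c)
{
    pthread_mutex_lock(&conn_lock);
    kill_connection_locked(c);
    pthread_mutex_unlock(&conn_lock);
}

/*
 * Scan the timestamps and remove the oldest connection without
 * releasing the lock in between, so two threads cannot pick the same
 * victim.  Returns the removed connection, or NULL if the list is empty.
 */
static struct conn *delete_oldest_connection(void)
{
    struct conn *c, *oldest = NULL;

    pthread_mutex_lock(&conn_lock);
    for (c = connections; c != NULL; c = c->next) {
        if (oldest == NULL || c->start_time < oldest->start_time)
            oldest = c;
    }
    if (oldest != NULL)
        kill_connection_locked(oldest);
    pthread_mutex_unlock(&conn_lock);
    return oldest;
}
```

The lookup and the deletion happen under one acquisition of the lock, 
which is the atomicity the recursive mutex was meant to provide, at the 
cost of splitting one function in two.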

If you really prefer to do it with recursive mutexes, I think I'd 
prefer that the recursive mutex support be localized to the KDC in the 
initial implementation, and then removed once the code is suitably 
rewritten, even though that's uglier in the short term.  At least, 
that's the approach I like better at the moment, since I'm still not 
convinced we need recursive mutexes in the long run.

Ken