issue with MIT KDC and LDAP DS
Ken Raeburn
raeburn at MIT.EDU
Sat May 23 17:45:21 EDT 2009
(Apologies for the length of this message... as the saying goes, I
don't have time to write a shorter one.)
On May 23, 2009, at 10:17, Simo Sorce wrote:
>> Several ideas that have been kicked around involve
>> multiple threads or processes, and would allow, say, the LDAP
>> database
>> implementation to manage its own parallelism and background tasks
>> without requiring hooks in the main KDC code. If that sort of big
>> restructuring project can be done, great! I'd love to see it happen.
>
> What needs to happen to have it done?
Some requirements and design discussions, first. :-) Several ideas
have been kicked around, including:
* multithreaded KDC (e.g.,
http://mailman.mit.edu/pipermail/krbdev/2005-March/003237.html;
Novell has donated some code, but changing single-threaded
security-sensitive code to be multithreaded makes some people nervous)
* multiprocess KDC split by functionality (e.g., main KDC process
talks to a database-access process, each made smaller and easier to
analyze, and independently we can assess whether each can safely be
made multithreaded or should use async APIs or whatever)
* multiprocess KDC running several independent instances in parallel
(relatively simple and quick to code: fork after opening sockets,
(re)open database connections after forking, let multiple processes
all grab for any available network traffic detected; this leads to
contention and wasted cycles, but only when more than one process is
idle; it also may not balance load well in certain circumstances;
see the sketch just after this list)
* change the DAL to support issuing a query without blocking, with
the result delivered later via a callback (a hypothetical interface
shape is sketched below)
* make database back ends use async APIs for fetching data
* break KDC functionality into one or more library modules that can
be invoked in new ways (e.g., inetd-launched KDC) or integrated into
another process
* rewrite KDC as well as library to be friendlier to techniques such
as Apple's Grand Central and similar technologies, but without being
platform-dependent (okay, rewriting the KDC like this hasn't actually
gotten a lot of discussion that I've seen, but I expect it would
eventually come up)
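To make the fork-after-startup idea above concrete, here is a
minimal sketch; setup_network(), open_database(), and
kdc_event_loop() are invented stand-ins, not the real KDC entry
points:

/* Fork-after-startup: open the listening sockets once, fork the
 * extra workers, and let every process contend for incoming
 * packets.  Database handles are opened after forking, since
 * sharing one DB2 or LDAP connection across processes is unsafe. */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

extern int setup_network(void);    /* bind the UDP/TCP sockets */
extern int open_database(void);    /* (re)open DB connections */
extern void kdc_event_loop(void);  /* select() over the sockets */

int
run_kdc_workers(int nworkers)
{
    int i;
    pid_t pid;

    if (setup_network() != 0)
        return 1;
    for (i = 1; i < nworkers; i++) {
        pid = fork();
        if (pid == 0)
            break;            /* child: fall through to the loop */
        if (pid < 0)
            perror("fork");   /* carry on with fewer workers */
    }
    if (open_database() != 0)
        return 1;
    kdc_event_loop();
    return 0;
}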
Some of these changes would be relatively easy to implement by
themselves, in one shot or in a few steps, but they can interact in
bad ways, so we should have a plan for where we're going.
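Returning to the non-blocking-DAL idea for a moment, the interface
might look something like this; all of these names are invented for
illustration and are not part of the real DAL:

/* Hypothetical non-blocking DAL lookup: the call returns at once,
 * and the back end invokes the callback from the event loop when
 * the entry (or an error) is available. */
struct krb5_db_entry_st;                  /* opaque DB entry */

typedef void (*krb5_db_lookup_cb)(void *cb_arg, int status,
                                  struct krb5_db_entry_st *entry);

int krb5_db_get_principal_async(const char *princ_name,
                                krb5_db_lookup_cb cb, void *cb_arg);

Whether one callback registration per query scales, and how
cancellation works, would be part of the design discussion.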
So far as I've seen, no one has pulled together a requirements list
for restructuring the KDC. I could throw out some mostly obvious
ideas, but as of when I left the Consortium, I think we'd talked to
various sponsors and potential sponsors about some ideas but hadn't
really assessed which were the most crucial, and hadn't talked to
everyone who might have an interest. So ideas need to be collected
and evaluated. It also would help to better understand the needs of
the various different target environments where the KDC would be used
-- normal standalone realm KDCs; Samba and IPA environments;
eDirectory environments. Mobile/embedded KDCs? (Maybe a KDC running
in one of those little Linux server wall-warts: just configure it
and plug it in, in a locked closet. On a cell phone or PDA?)
Cable-system head ends? Tamper-proof cards in computers deployed in
insecure venues?
This list is a bit haphazard, but maybe it can help somebody get
started.
* more efficient use of multicore systems under heavy load (not the
top priority, as we rarely hear that CPU time is an issue)
* don't block one request because processing an earlier request is
blocking on some external factor; answer KDC requests out of order
(see the pending-request sketch after this list)
  * DB (LDAP) access is just one example
  * PKCROSS would involve having the KDC wait on additional
    external traffic
  * principal alias processing rules could theoretically do DNSSEC
    queries
  * recognize when a request duplicates one we already received but
    have not yet answered, and avoid duplicating work and traffic
    when possible (unimportant now, but more important if
    LDAP/PKCROSS/whatever delays increase the likelihood of such
    events; I think Novell's patch deals with this by extending the
    lookaside cache)
  * theoretically, any plugin hook could block, including preauth
    and authz-data mechanisms
  * network reconfiguration could result in a socket being shut
    down while we're still preparing to send out a response through
    it, if we're not careful
  * Should we cache DB results for performance? If so, at the KDC
    level, or buried in the DB layer and transparent to the KDC
    main code?
* minimal or zero CPU use in idle state (e.g., no frequent polling
at the application level to make sure the LDAP connection is still
up, though TCP keepalives may be okay, and one-shot timers such as
for shutting down extra worker processes are probably okay)
* fast response to clients when possible
* survive various DoS attack cases reasonably
  * don't crash; be available shortly after the attack stops, if
    not immediately after or even during the attack
  * reduce backscatter from forged-source traffic? TCP helps
  * don't chew up resources (memory, process slots) without bound?
* programmable configuration of some functionality
  * current plugins (make list)
  * extended 1.7 DAL operations and flags (make list), moving
    towards a general "infrastructure" interface
  * still support various extensions without requiring one massive
    integrated "database" back end?
  * pluggable logging? e.g., log to a database for later
    intrusion-detection analysis; if serious auditing requirements
    exist, such that the KDC is not permitted to issue unlogged
    credentials, errors in logging may also need to be able to
    force errors up the call chain
  * scriptable extensions (e.g., for describing rules for principal
    alias processing or PKINIT mapping)? easier for sysadmins who
    are not experienced C programmers; can one script define
    multiple plugin functions that share data?
  * notify-on-startup/shutdown hooks -- advertise via mDNS, DDNS,
    anycast routing, whatever
  * may want a way to feed data into a plugin through external
    means (e.g., an updated file with alias mappings, or a new DNS
    zone file to parse and derive mappings from); how to get its
    attention versus requiring it to call stat() a lot?
  * what else?
* code clarity, cleanliness, documentation (no, really),
maintainability
* security - probably most important; worth jumping through some
hoops for making buffer overruns impossible, sandboxing, whatever
* ease of security analysis - could be helped by breaking code into
well-separated modules, and not sharing address space; also by using
privilege separation, internal interfaces well suited for static
analysis tools, and other security practices
* don't require DB accessibility at startup, per Will F? (but if
the db2 database is missing, maybe we want to error out at startup?
maybe LDAP server unavailability should be a "temp fail" condition
and we continue? a server that is available but has no realm data
would probably be a hard fail)
* support multiple realms in one KDC process, possibly with
different configurations (e.g., LDAP vs DB2, different port numbers
advertised); OR, explicitly don't support it any more and reject the
command-line options
* flexibility in network setup
  * current code listens on all local addresses (separately or via
    PKTINFO options), reconfiguring itself at run time if needed
    when addresses are added or removed (a PKTINFO sketch appears
    below)
  * inetd "wait"-type service for UDP or TCP
  * forking for multiprocessing would mess with the
    auto-reconfiguration
  * have had a request to specify a subset of local IP addresses
  * KDC, kadmin, and kpasswd services should all use similar
    mechanism/code and parallel config-file/command-line specs (if
    not the same ones), as much as practical (e.g., kadmin is
    IPv4-only for now in MIT code, but that may be fixed later, so
    options should allow for that)
  * need we worry about kprop at the same time? probably not
* programming environment: Is it unreasonable to consider careful
use of C++ or Python for some pieces of the code, or are we stuck with
C and the dangers and annoyances that come with it? (For krb5 library
code, C++ still seems like a bad idea, but for non-library code like
the server programs, some of the issues don't apply.)
* integrate into another server process?
  * would probably want to supply plugins programmatically instead
    of requiring the loading of external shared objects
  * would probably want to control options programmatically rather
    than having to write out a config file
  * an embedded KDC might work this way, if not dealing with
    embedded Linux or *BSD, or even then, in an effort to reduce
    process count and memory footprint
  * might need to hand main-event-loop control over to other code
  * has security implications that should be well documented, but
    they don't automatically make it a bad idea
  * the "KDC library" idea probably shouldn't involve a
    public/stable API until we can shake out bugs
* record/report statistics for monitoring? multiprocess approaches
make this trickier; coordination needed
  * any hooks desired for intrusion detection?
* fast and easy restart, in case of problems/bugs or changes in
configuration options that are only checked at startup; this implies
clean and quick shutdown of the network side so we can rebind the
ports right away
* peak load requirements? e.g., at the start of the work day or
after a power outage
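For the answer-requests-out-of-order item above, the mechanics might
look roughly like this; every name here is invented, and the
reply-building details are elided:

/* Park a request that would block on the database, and resume it
 * from a completion callback instead of stalling the whole KDC. */
#include <sys/socket.h>

struct pending_req {
    struct pending_req *next;
    struct sockaddr_storage from;   /* where the reply goes */
    socklen_t fromlen;
    void *request;                  /* decoded AS/TGS request */
};

static struct pending_req *pending_head;

/* The event loop calls this instead of a blocking DB lookup. */
void
park_request(struct pending_req *req)
{
    req->next = pending_head;
    pending_head = req;
    /* ...issue the async DB query here, passing req as cb_arg... */
}

/* Completion callback: unlink the request and finish it. */
void
db_lookup_done(void *cb_arg, int status, void *entry)
{
    struct pending_req **p, *req = cb_arg;

    for (p = &pending_head; *p != NULL; p = &(*p)->next) {
        if (*p == req) {
            *p = req->next;
            break;
        }
    }
    /* ...build and send the reply (or an error), using status and
     * entry, then free req... */
}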
We've talked a little, and I've talked with Don Davis a little, but I
don't know in great detail what your requirements (and nice-to-have
list) look like. Nor am I all that sure about what other environments
with unusual requirements (resource, performance, security, legal) the
KDC might get used in. That data should be gathered.
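As an aside on the PKTINFO item in the network-setup section above:
on Linux the technique looks roughly like the following sketch (IPv6
needs IPV6_RECVPKTINFO instead, and other platforms differ).  Bind
one UDP socket to INADDR_ANY and recover each datagram's destination
address, so the reply can be sent from the address the client
actually used:

#define _GNU_SOURCE          /* struct in_pktinfo on older glibc */
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

int
enable_pktinfo(int fd)
{
    int on = 1;

    return setsockopt(fd, IPPROTO_IP, IP_PKTINFO, &on, sizeof(on));
}

ssize_t
recv_with_dest(int fd, void *buf, size_t len,
               struct sockaddr_in *from, struct in_addr *dest)
{
    char cbuf[CMSG_SPACE(sizeof(struct in_pktinfo))];
    struct iovec iov;
    struct msghdr msg;
    struct cmsghdr *cm;
    ssize_t n;

    iov.iov_base = buf;
    iov.iov_len = len;
    memset(&msg, 0, sizeof(msg));
    msg.msg_name = from;
    msg.msg_namelen = sizeof(*from);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = cbuf;
    msg.msg_controllen = sizeof(cbuf);

    n = recvmsg(fd, &msg, 0);
    if (n < 0)
        return n;
    /* Walk the control messages for the packet-info structure. */
    for (cm = CMSG_FIRSTHDR(&msg); cm != NULL;
         cm = CMSG_NXTHDR(&msg, cm)) {
        if (cm->cmsg_level == IPPROTO_IP &&
            cm->cmsg_type == IP_PKTINFO)
            *dest = ((struct in_pktinfo *)CMSG_DATA(cm))->ipi_addr;
    }
    return n;
}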
Once we can evaluate requirements, maybe some of the possible
approaches can be selected, or ruled out. E.g., my simple fork-after-
startup idea doesn't keep the load down as well under light usage,
doesn't minimize memory use or achieve privilege separation, and
doesn't allow for detecting that multiple identical requests are being
processed at once and could be optimized. So in the long term it's
probably not the way to go. But it could be a short path to taking
advantage of multiple CPU cores, if that's useful now. And if we move
to a clean "KDC library" interface, the fork-after-startup bit would
probably have become just a couple dozen lines of code that would be
easy to rip out when more efficient multi-process code was available.
Many of these items will probably become minor implementation notes
once a direction is chosen. But we need to keep some of them in mind
so we don't accidentally choose a direction that makes some of them
very hard.
Tom has also been talking about redesigning the database and its
interface, for various reasons. The existing arrangement operates in
old DBM mode -- here's a key and a blob of data, and the KDC code
knows how to parse and assemble the data. That doesn't let anyone run
independent tools over the database very effectively, nor can we ask
for only the subset of data we actually need. The LDAP back end makes
it a bit easier, because it knows how to store some of the KDC data in
special fields, but a lot of that is buried in the LDAP back end code
that just mirrors some of the code elsewhere. It would be nice to
take that further, and use, for example, sqlite3 or mysql (or one of
these under odbc) as a database interface, with multiple tables and
all that fun stuff, and only retrieve the data we really need, but
that means having a more clearly defined and stable set of fields. It
would also provide an opportunity to change how some things are
stored, so that, for example, the hardware preauth schemes to use
with a particular principal aren't indicated by the existence of a
secondary principal in the database that has no keys and no purpose
other than to exist as a flag. And Luke's extension of the DAL to
add things that are not necessarily Kerberos-database items makes it
clear that our current plugin hooks are not sufficient for some
sites' needs.
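To illustrate the kind of narrow query a relational back end would
allow, here is a sketch against sqlite3 with an invented schema (a
"keys" table with "principal" and "kvno" columns); it is not a
proposal for the real layout:

#include <sqlite3.h>

/* Fetch only the highest key version number for one principal,
 * rather than decoding the whole principal record. */
int
lookup_max_kvno(sqlite3 *db, const char *princ, int *kvno_out)
{
    static const char sql[] =
        "SELECT max(kvno) FROM keys WHERE principal = ?";
    sqlite3_stmt *stmt;
    int ret = -1;

    if (sqlite3_prepare_v2(db, sql, -1, &stmt, NULL) != SQLITE_OK)
        return -1;
    sqlite3_bind_text(stmt, 1, princ, -1, SQLITE_STATIC);
    /* max() over an empty set yields a NULL column, which we treat
     * as "no such principal". */
    if (sqlite3_step(stmt) == SQLITE_ROW &&
        sqlite3_column_type(stmt, 0) != SQLITE_NULL) {
        *kvno_out = sqlite3_column_int(stmt, 0);
        ret = 0;
    }
    sqlite3_finalize(stmt);
    return ret;
}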
It's largely a separate project from changing the KDC startup and
network handling and parallelism and such, but again, if assumptions
are made on one side and violated on the other, we could hurt
ourselves, especially in such areas as forking or multithreading, or
in this case, not coordinating changes in the DAL interface.
>> I'd add that it would be nice if a KDC connected to an LDAP server
>> could detect that the server went down, too, and automatically try
>> reconnecting, without needing to be prodded by Kerberos traffic to
>> notice. I'm not familiar enough with the LDAP APIs to know if
>> that's
>> practical.
>
> If you have a main loop you can get the same sort of notification
> you get when any socket is closed from the other side. (A
> reconnection thread could easily do that.)
I may have overlooked something, but as best I recall in the OpenLDAP
API there wasn't a good way to get a handle on all the sockets that
might be in use (maybe you could get one? but at least in theory more
than one could be in use), and so you really needed to be in OpenLDAP
library routines to notice, which is kind of at odds with having the
main loop be in the KDC code. We could cheat and only look at one
file descriptor (if I recall correctly and we can get one fd number
out), and if some other one is also in use and gets disconnected, oh
well, we don't notice that until we try to use it... but we'd still
need some sort of event notification callback added to the DAL
interface.
At the time, I was more interested in asynchronous LDAP queries, where
one thread could process both LDAP traffic and KDC database lookup
requests. I didn't see a way to guarantee that I knew about all the
file descriptors in use by the LDAP library (so I could select on all
of those plus one used to signal pending DB requests); you really
seemed to need to turn control over to the LDAP library.
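For what it's worth, the closest fit I can see in the documented
OpenLDAP calls: ldap_get_option() with LDAP_OPT_DESC retrieves the
connection's file descriptor (just the one), ldap_search_ext()
issues a query without waiting, and ldap_result() with a zero
timeout polls for the answer.  A sketch, with error handling pared
down; as noted above, a second library-internal connection would
still go unnoticed:

#include <ldap.h>
#include <sys/time.h>

/* Issue an asynchronous search and report the connection's fd so
 * the KDC's own select() loop can watch it. */
int
issue_search(LDAP *ld, const char *base, const char *filter,
             int *msgidp, int *fdp)
{
    if (ldap_get_option(ld, LDAP_OPT_DESC, fdp) != LDAP_OPT_SUCCESS)
        return -1;
    if (ldap_search_ext(ld, base, LDAP_SCOPE_SUBTREE, filter,
                        NULL, 0, NULL, NULL, NULL, LDAP_NO_LIMIT,
                        msgidp) != LDAP_SUCCESS)
        return -1;
    return 0;
}

/* Call when select() reports the fd readable.  Returns 1 when the
 * complete result is in *res, 0 if not ready yet, -1 on error. */
int
poll_search(LDAP *ld, int msgid, LDAPMessage **res)
{
    struct timeval zero = { 0, 0 };
    int rc;

    rc = ldap_result(ld, msgid, LDAP_MSG_ALL, &zero, res);
    if (rc < 0)
        return -1;
    return rc > 0;
}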
--
Ken Raeburn / raeburn at mit.edu / no longer at MIT Kerberos Consortium