issue with MIT KDC and LDAP DS
Ken Raeburn
raeburn at MIT.EDU
Sat May 23 17:45:21 EDT 2009
(Apologies for the length of this message... as the saying goes, I
don't have time to write a shorter one.)
On May 23, 2009, at 10:17, Simo Sorce wrote:
>> Several ideas that have been kicked around involve
>> multiple threads or processes, and would allow, say, the LDAP
>> database
>> implementation to manage its own parallelism and background tasks
>> without requiring hooks in the main KDC code. If that sort of big
>> restructuring project can be done, great! I'd love to see it happen.
>
> What needs to happen to have it done?
Some requirements and design discussions, first. :-) Several ideas
have been kicked around, including:
* multithreaded KDC (e.g.,
http://mailman.mit.edu/pipermail/krbdev/2005-March/003237.html;
Novell has donated some code, but changing single-threaded
security-sensitive code to be multithreaded makes some people nervous)
* multiprocess KDC split by functionality (e.g., main KDC process
talks to a database-access process, each made smaller and easier to
analyze, and independently we can assess whether each can safely be
made multithreaded or should use async APIs or whatever)
* multiprocess KDC running several independent instances in parallel
(relatively simple and quick to code: fork after opening sockets,
(re)open database connections after forking, let multiple processes
all grab for any available network traffic detected; this leads to
contention and wasted cycles, but only when more than one process is
idle; it also may not balance load well in certain circumstances;
see the sketch just after this list)
* change the DAL to support issuing a query without blocking, with
the result delivered later via a callback (a hypothetical interface
shape is sketched below)
* make database back ends use async APIs for fetching data
* break KDC functionality into one or more library modules that can
be invoked in new ways (e.g., inetd-launched KDC) or integrated into
another process
* rewrite KDC as well as library to be friendlier to techniques such
as Apple's Grand Central and similar technologies, but without being
platform-dependent (okay, rewriting the KDC like this hasn't actually
gotten a lot of discussion that I've seen, but I expect it would
eventually come up)
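To make the fork-after-startup idea above concrete, here is a
minimal sketch; setup_network(), open_database(), and
kdc_event_loop() are invented stand-ins, not the real KDC entry
points:

/* Fork-after-startup: open the listening sockets once, fork the
 * extra workers, and let every process contend for incoming
 * packets.  Database handles are opened after forking, since
 * sharing one DB2 or LDAP connection across processes is unsafe. */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

extern int setup_network(void);    /* bind the UDP/TCP sockets */
extern int open_database(void);    /* (re)open DB connections */
extern void kdc_event_loop(void);  /* select() over the sockets */

int
run_kdc_workers(int nworkers)
{
    int i;
    pid_t pid;

    if (setup_network() != 0)
        return 1;
    for (i = 1; i < nworkers; i++) {
        pid = fork();
        if (pid == 0)
            break;            /* child: fall through to the loop */
        if (pid < 0)
            perror("fork");   /* carry on with fewer workers */
    }
    if (open_database() != 0)
        return 1;
    kdc_event_loop();
    return 0;
}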
Some of these changes would be relatively easy to implement by
themselves, in one shot or in a few steps, but they can interact in
bad ways, so we should have a plan for where we're going.
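Returning to the non-blocking-DAL idea for a moment, the interface
might look something like this; all of these names are invented for
illustration and are not part of the real DAL:

/* Hypothetical non-blocking DAL lookup: the call returns at once,
 * and the back end invokes the callback from the event loop when
 * the entry (or an error) is available. */
struct krb5_db_entry_st;                  /* opaque DB entry */

typedef void (*krb5_db_lookup_cb)(void *cb_arg, int status,
                                  struct krb5_db_entry_st *entry);

int krb5_db_get_principal_async(const char *princ_name,
                                krb5_db_lookup_cb cb, void *cb_arg);

Whether one callback registration per query scales, and how
cancellation works, would be part of the design discussion.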
So far as I've seen, no one has pulled together a requirements list
for restructuring the KDC. I could throw out some mostly obvious
ideas, but as of when I left the Consortium, I think we'd talked to
various sponsors and potential sponsors about some ideas but hadn't
really assessed which were the most crucial, and hadn't talked to
everyone who might have an interest. So ideas need to be collected
and evaluated. It also would help to better understand the needs of
the various different target environments where the KDC would be used
-- normal standalone realm KDCs; Samba and IPA environments;
eDirectory environments. Mobile/embedded KDCs? (Maybe a KDC running
in one of those little Linux server wall-warts: just configure it
and plug it in, in a locked closet. On a cell phone or PDA?)
Cable-system head ends? Tamper-proof cards in computers deployed in
insecure venues?
This list is a bit haphazard, but maybe it can help somebody get
started.
* more efficient use of multicore systems under heavy load (not the
top priority, as we rarely hear that CPU time is an issue)
* don't block one request because processing an earlier request is
blocking on some external factor; answer KDC requests out of order
(see the pending-request sketch after this list)
  * DB (LDAP) access is just one example
  * PKCROSS would involve having the KDC wait on additional
    external traffic
  * principal alias processing rules could theoretically do DNSSEC
    queries
  * recognize when a request duplicates one we already received but
    have not yet answered, and avoid duplicating work and traffic
    when possible (unimportant now, but more important if
    LDAP/PKCROSS/whatever delays increase the likelihood of such
    events; I think Novell's patch deals with this by extending the
    lookaside cache)
  * theoretically, any plugin hook could block, including preauth
    and authz-data mechanisms
  * network reconfiguration could result in a socket being shut
    down while we're still preparing to send out a response through
    it, if we're not careful
  * Should we cache DB results for performance? If so, at the KDC
    level, or buried in the DB layer and transparent to the KDC
    main code?
* minimal or zero CPU use in idle state (e.g., no frequent polling
at the application level to make sure the LDAP connection is still
up, though TCP keepalives may be okay, and one-shot timers such as
for shutting down extra worker processes are probably okay)
* fast response to clients when possible
* survive various DoS attack cases reasonably
  * don't crash; be available shortly after the attack stops, if
    not immediately after or even during the attack
  * reduce backscatter from forged-source traffic? TCP helps
  * don't chew up resources (memory, process slots) without bound?
* programmable configuration of some functionality
  * current plugins (make list)
  * extended 1.7 DAL operations and flags (make list), moving
    towards a general "infrastructure" interface
  * still support various extensions without requiring one massive
    integrated "database" back end?
  * pluggable logging? e.g., log to a database for later
    intrusion-detection analysis; if serious auditing requirements
    exist, such that the KDC is not permitted to issue unlogged
    credentials, errors in logging may also need to be able to
    force errors up the call chain
  * scriptable extensions (e.g., for describing rules for principal
    alias processing or PKINIT mapping)? easier for sysadmins who
    are not experienced C programmers; can one script define
    multiple plugin functions that share data?
  * notify-on-startup/shutdown hooks -- advertise via mDNS, DDNS,
    anycast routing, whatever
  * may want a way to feed data into a plugin through external
    means (e.g., an updated file with alias mappings, or a new DNS
    zone file to parse and derive mappings from); how to get its
    attention versus requiring it to call stat() a lot?
  * what else?
* code clarity, cleanliness, documentation (no, really),
maintainability
* security - probably most important; worth jumping through some
hoops for making buffer overruns impossible, sandboxing, whatever
* ease of security analysis - could be helped by breaking code into
well-separated modules, and not sharing address space; also by using
privilege separation, internal interfaces well suited for static
analysis tools, and other security practices
* don't require DB accessibility at startup, per Will F? (but if
the db2 database is missing, maybe we want to error out at startup?
maybe LDAP server unavailability should be a "temp fail" condition
and we continue? a server that is available but has no realm data
would probably be a hard fail)
* support multiple realms in one KDC process, possibly with
different configurations (e.g., LDAP vs DB2, different port numbers
advertised); OR, explicitly don't support it any more and reject the
command-line options
* flexibility in network setup
  * current code listens on all local addresses (separately or via
    PKTINFO options), reconfiguring itself at run time if needed
    when addresses are added or removed (a PKTINFO sketch appears
    below)
  * inetd "wait"-type service for UDP or TCP
  * forking for multiprocessing would mess with the
    auto-reconfiguration
  * have had a request to specify a subset of local IP addresses
  * KDC, kadmin, and kpasswd services should all use similar
    mechanism/code and parallel config-file/command-line specs (if
    not the same ones), as much as practical (e.g., kadmin is
    IPv4-only for now in MIT code, but that may be fixed later, so
    options should allow for that)
  * need we worry about kprop at the same time? probably not
* programming environment: Is it unreasonable to consider careful
use of C++ or Python for some pieces of the code, or are we stuck with
C and the dangers and annoyances that come with it? (For krb5 library
code, C++ still seems like a bad idea, but for non-library code like
the server programs, some of the issues don't apply.)
* integrate into another server process?
  * would probably want to supply plugins programmatically instead
    of requiring the loading of external shared objects
  * would probably want to control options programmatically rather
    than having to write out a config file
  * an embedded KDC might work this way, if not dealing with
    embedded Linux or *BSD, or even then, in an effort to reduce
    process count and memory footprint
  * might need to hand main-event-loop control over to other code
  * has security implications that should be well documented, but
    they don't automatically make it a bad idea
  * the "KDC library" idea probably shouldn't involve a
    public/stable API until we can shake out bugs
* record/report statistics for monitoring? multiprocess approaches
make this trickier; coordination needed
  * any hooks desired for intrusion detection?
* fast and easy restart, in case of problems/bugs or changes in
configuration options that are only checked at startup; this implies
clean and quick shutdown of the network side so we can rebind the
ports right away
* peak load requirements? e.g., at the start of the work day or
after a power outage
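For the answer-requests-out-of-order item above, the mechanics might
look roughly like this; every name here is invented, and the
reply-building details are elided:

/* Park a request that would block on the database, and resume it
 * from a completion callback instead of stalling the whole KDC. */
#include <sys/socket.h>

struct pending_req {
    struct pending_req *next;
    struct sockaddr_storage from;   /* where the reply goes */
    socklen_t fromlen;
    void *request;                  /* decoded AS/TGS request */
};

static struct pending_req *pending_head;

/* The event loop calls this instead of a blocking DB lookup. */
void
park_request(struct pending_req *req)
{
    req->next = pending_head;
    pending_head = req;
    /* ...issue the async DB query here, passing req as cb_arg... */
}

/* Completion callback: unlink the request and finish it. */
void
db_lookup_done(void *cb_arg, int status, void *entry)
{
    struct pending_req **p, *req = cb_arg;

    for (p = &pending_head; *p != NULL; p = &(*p)->next) {
        if (*p == req) {
            *p = req->next;
            break;
        }
    }
    /* ...build and send the reply (or an error), using status and
     * entry, then free req... */
}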
We've talked a little, and I've talked with Don Davis a little, but I
don't know in great detail what your requirements (and nice-to-have
list) look like. Nor am I all that sure about what other environments
with unusual requirements (resource, performance, security, legal) the
KDC might get used in. That data should be gathered.
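As an aside on the PKTINFO item in the network-setup section above:
on Linux the technique looks roughly like the following sketch (IPv6
needs IPV6_RECVPKTINFO instead, and other platforms differ).  Bind
one UDP socket to INADDR_ANY and recover each datagram's destination
address, so the reply can be sent from the address the client
actually used:

#define _GNU_SOURCE          /* struct in_pktinfo on older glibc */
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

int
enable_pktinfo(int fd)
{
    int on = 1;

    return setsockopt(fd, IPPROTO_IP, IP_PKTINFO, &on, sizeof(on));
}

ssize_t
recv_with_dest(int fd, void *buf, size_t len,
               struct sockaddr_in *from, struct in_addr *dest)
{
    char cbuf[CMSG_SPACE(sizeof(struct in_pktinfo))];
    struct iovec iov;
    struct msghdr msg;
    struct cmsghdr *cm;
    ssize_t n;

    iov.iov_base = buf;
    iov.iov_len = len;
    memset(&msg, 0, sizeof(msg));
    msg.msg_name = from;
    msg.msg_namelen = sizeof(*from);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = cbuf;
    msg.msg_controllen = sizeof(cbuf);

    n = recvmsg(fd, &msg, 0);
    if (n < 0)
        return n;
    /* Walk the control messages for the packet-info structure. */
    for (cm = CMSG_FIRSTHDR(&msg); cm != NULL;
         cm = CMSG_NXTHDR(&msg, cm)) {
        if (cm->cmsg_level == IPPROTO_IP &&
            cm->cmsg_type == IP_PKTINFO)
            *dest = ((struct in_pktinfo *)CMSG_DATA(cm))->ipi_addr;
    }
    return n;
}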
Once we can evaluate requirements, maybe some of the possible
approaches can be selected, or ruled out. E.g., my simple fork-after-
startup idea doesn't keep the load down as well under light usage,
doesn't minimize memory use or achieve privilege separation, and
doesn't allow for detecting that multiple identical requests are being
processed at once and could be optimized. So in the long term it's
probably not the way to go. But it could be a short path to taking
advantage of multiple CPU cores, if that's useful now. And if we move
to a clean "KDC library" interface, the fork-after-startup bit would
probably have become just a couple dozen lines of code that would be
easy to rip out when more efficient multi-process code was available.
Many of these items will probably become minor implementation notes
once a direction is chosen. But we need to keep some of them in mind
so we don't accidentally choose a direction that makes some of them
very hard.
Tom has also been talking about redesigning the database and its
interface, for various reasons. The existing arrangement operates in
old DBM mode -- here's a key and a blob of data, and the KDC code
knows how to parse and assemble the data. That doesn't let anyone run
independent tools over the database very effectively, nor can we ask
for only the subset of data we actually need. The LDAP back end makes
it a bit easier, because it knows how to store some of the KDC data in
special fields, but a lot of that is buried in the LDAP back end code
that just mirrors some of the code elsewhere. It would be nice to
take that further, and use, for example, sqlite3 or mysql (or one of
these under odbc) as a database interface, with multiple tables and
all that fun stuff, and only retrieve the data we really need, but
that means having a more clearly defined and stable set of fields. It
would also provide an opportunity to change how some things are
stored, so that, for example, the hardware preauth schemes to use
with a particular principal aren't indicated by the existence of a
secondary principal in the database that has no keys and no purpose
other than to exist as a flag. And Luke's extension of the DAL to
add things that are not necessarily Kerberos-database items makes it
clear that our current plugin hooks are not sufficient for some
sites' needs.
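To illustrate the kind of narrow query a relational back end would
allow, here is a sketch against sqlite3 with an invented schema (a
"keys" table with "principal" and "kvno" columns); it is not a
proposal for the real layout:

#include <sqlite3.h>

/* Fetch only the highest key version number for one principal,
 * rather than decoding the whole principal record. */
int
lookup_max_kvno(sqlite3 *db, const char *princ, int *kvno_out)
{
    static const char sql[] =
        "SELECT max(kvno) FROM keys WHERE principal = ?";
    sqlite3_stmt *stmt;
    int ret = -1;

    if (sqlite3_prepare_v2(db, sql, -1, &stmt, NULL) != SQLITE_OK)
        return -1;
    sqlite3_bind_text(stmt, 1, princ, -1, SQLITE_STATIC);
    /* max() over an empty set yields a NULL column, which we treat
     * as "no such principal". */
    if (sqlite3_step(stmt) == SQLITE_ROW &&
        sqlite3_column_type(stmt, 0) != SQLITE_NULL) {
        *kvno_out = sqlite3_column_int(stmt, 0);
        ret = 0;
    }
    sqlite3_finalize(stmt);
    return ret;
}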
It's largely a separate project from changing the KDC startup and
network handling and parallelism and such, but again, if assumptions
are made on one side and violated on the other, we could hurt
ourselves, especially in such areas as forking or multithreading, or
in this case, not coordinating changes in the DAL interface.
>> I'd add that it would be nice if a KDC connected to an LDAP server
>> could detect that the server went down, too, and automatically try
>> reconnecting, without needing to be prodded by Kerberos traffic to
>> notice. I'm not familiar enough with the LDAP APIs to know if
>> that's
>> practical.
>
> If you have a main loop you can get the same sort of notification
> you get when any socket is closed from the other side. (A
> reconnection thread could easily do that.)
I may have overlooked something, but as best I recall in the OpenLDAP
API there wasn't a good way to get a handle on all the sockets that
might be in use (maybe you could get one? but at least in theory more
than one could be in use), and so you really needed to be in OpenLDAP
library routines to notice, which is kind of at odds with having the
main loop be in the KDC code. We could cheat and only look at one
file descriptor (if I recall correctly and we can get one fd number
out), and if some other one is also in use and gets disconnected, oh
well, we don't notice that until we try to use it... but we'd still
need some sort of event notification callback added to the DAL
interface.
At the time, I was more interested in asynchronous LDAP queries, where
one thread could process both LDAP traffic and KDC database lookup
requests. I didn't see a way to guarantee that I knew about all the
file descriptors in use by the LDAP library (so I could select on all
of those plus one used to signal pending DB requests); you really
seemed to need to turn control over to the LDAP library.
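For what it's worth, the closest fit I can see in the documented
OpenLDAP calls: ldap_get_option() with LDAP_OPT_DESC retrieves the
connection's file descriptor (just the one), ldap_search_ext()
issues a query without waiting, and ldap_result() with a zero
timeout polls for the answer.  A sketch, with error handling pared
down; as noted above, a second library-internal connection would
still go unnoticed:

#include <ldap.h>
#include <sys/time.h>

/* Issue an asynchronous search and report the connection's fd so
 * the KDC's own select() loop can watch it. */
int
issue_search(LDAP *ld, const char *base, const char *filter,
             int *msgidp, int *fdp)
{
    if (ldap_get_option(ld, LDAP_OPT_DESC, fdp) != LDAP_OPT_SUCCESS)
        return -1;
    if (ldap_search_ext(ld, base, LDAP_SCOPE_SUBTREE, filter,
                        NULL, 0, NULL, NULL, NULL, LDAP_NO_LIMIT,
                        msgidp) != LDAP_SUCCESS)
        return -1;
    return 0;
}

/* Call when select() reports the fd readable.  Returns 1 when the
 * complete result is in *res, 0 if not ready yet, -1 on error. */
int
poll_search(LDAP *ld, int msgid, LDAPMessage **res)
{
    struct timeval zero = { 0, 0 };
    int rc;

    rc = ldap_result(ld, msgid, LDAP_MSG_ALL, &zero, res);
    if (rc < 0)
        return -1;
    return rc > 0;
}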
--
Ken Raeburn / raeburn at mit.edu / no longer at MIT Kerberos Consortium