living in a multi-mech world

Tom Yu tlyu at MIT.EDU
Mon Apr 30 00:12:50 EDT 2007


At some point soon we would like to standardize the interface for
plug-in modules for GSS-API mechanisms.  This presents a number of
interesting problems.  We are interested in making the implementation
approach as portable as possible, which means that we need to clearly
identify the particular (hopefully minimal) requirements we impose
upon the compilation system and runtime environment.  Note that most
of my experience is with dlopen() and related functions on Solaris and
POSIX systems, but BY NO MEANS do we assume the full functionality of
the Solaris runtime linker.

We have given some thought to pursuing the concept of making the
GSS-API C bindings be the actual SPI for plug-in modules, i.e., the
ABI which each module exports would exactly correspond to the GSS-API.
This has the advantage that an application could link against either
the mechglue layer or directly against a specific GSS-API mechanism
implementation.

I'll provide a set of working assumptions to start with, followed by a
list of the questions that would be most useful to answer.  There are
also descriptions of some of the problems we may face in this space.

To provide a uniform basis for discussion, let's assume:

* A "dynamic" or "shared" library on a platform can also function as a
  runtime-loadable object.

* A program can load an object at runtime without the loaded object
  altering the meaning of other symbols used in the program; this
  corresponds to the RTLD_LOCAL flag to the dlopen() interface.

* A program can look up symbols in a runtime-loaded object by passing
  strings to an interface like dlsym().

* A GSS-API mechanism plug-in module exports symbols having the names
  defined by the GSS-API C bindings specification and behaves
  accordingly.

* A shared object loaded at runtime by a program may end up binding
  its symbol references, including references to symbols defined in
  that same shared object, against the main executable or against
  libraries which are dependencies of the main executable.  Part of
  the reason this behavior may be desirable is to permit symbol
  interposition.  Some systems have ways to suppress this global
  search, e.g., RTLD_GROUP or "direct binding", but I think that for
  portability reasons we can't assume that functionality.

* A mechglue implementation presents the GSS-API to its callers, and
  itself calls other mechanism implementations, preferably by using
  the GSS-API.

If a mechglue implementation requires that its plug-in modules export
the GSS-API, we may have the following consequences:

* A mechglue implementation will call its plug-in modules by doing the
  equivalent of dlopen() on each module and using dlsym() to get
  function pointers for each GSS-API entry point.

* A mechanism plug-in module must not call its own entry points by
  using the GSS-API symbol names; it should use its own internal names
  for its own entry points.  I think typically a mechanism does not
  need to call its own GSS-API entry points though.
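
As a rough sketch of the first consequence above, mechglue might load
a module privately and resolve an entry point by name, assuming a
dlopen()-style interface with RTLD_LOCAL.  The names mech_module,
mech_load, and mech_unload are hypothetical, and only one entry point
is looked up to keep the example short.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical record for one loaded mechanism module. */
struct mech_module {
    void *handle;           /* returned by dlopen() */
    void *release_buffer;   /* address of the module's gss_release_buffer */
};

/*
 * Load a module with RTLD_LOCAL so that its symbols do not alter the
 * meaning of symbols elsewhere in the program, then look up one
 * GSS-API entry point by its standard name.
 * Returns 0 on success, -1 on failure.
 */
int mech_load(struct mech_module *m, const char *path)
{
    m->handle = NULL;
    m->release_buffer = NULL;

    m->handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (m->handle == NULL)
        return -1;

    /* A real mechglue would cast this to the proper function type. */
    m->release_buffer = dlsym(m->handle, "gss_release_buffer");
    if (m->release_buffer == NULL) {
        dlclose(m->handle);
        m->handle = NULL;
        return -1;
    }
    return 0;
}

/* Unload a module; safe to call on a module that failed to load. */
void mech_unload(struct mech_module *m)
{
    if (m->handle != NULL)
        dlclose(m->handle);
    m->handle = NULL;
    m->release_buffer = NULL;
}
```

Note that this does nothing about the module's own outgoing symbol
references, which may still bind against the main executable as
described in the assumptions above.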

==============================

Questions:

Do we require RTLD_GROUP or "direct binding" functionality?

Do we preclude symbol interposition upon mechanism plug-in modules,
e.g., do we preclude the sort of symbol interposition that a user
might want to use for malloc debugging?

Do we require pseudo-mechanisms such as SPNEGO to make GSS-API calls
through function pointers obtained via a dlsym() equivalent?  (This
might become unwieldy.)

Do we have mechanism plug-in modules export symbols which are GSS-API
entry point names systematically transformed, e.g., adding a short
mechanism name prefix?

Do we have mechanism plug-in modules export a function which returns a
struct full of function pointers?  (This is similar to what our
Sun-derived mechglue implementation currently does.)

What additional interfaces (if any -- preferably none) beyond the
GSS-API do we require a mechanism to export?

What additional interfaces (if any -- preferably none) beyond the
GSS-API do we require the mechglue layer to provide to plug-in
modules?
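
To make the "struct full of function pointers" option concrete, here
is a minimal sketch.  It is not the interface of our Sun-derived
mechglue; gss_mech_ops, examplemech_get_ops, and the "examplemech"
prefix are all hypothetical, and the GSS-API types are replaced by
stand-ins so that the fragment compiles on its own.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-ins for the real GSS-API types (assumption for this sketch). */
typedef uint32_t OM_uint32;
typedef struct {
    size_t length;
    void *value;
} gss_buffer_desc;

/* Hypothetical table of entry points; a real one would list them all. */
struct gss_mech_ops {
    const char *mech_name;
    OM_uint32 (*release_buffer)(OM_uint32 *minor, gss_buffer_desc *buf);
};

/*
 * The mechanism's implementation under an internal, prefixed name.
 * Because it is static, it cannot collide with a gss_release_buffer
 * symbol defined by mechglue or by another mechanism.
 */
static OM_uint32 examplemech_release_buffer(OM_uint32 *minor,
                                            gss_buffer_desc *buf)
{
    *minor = 0;
    free(buf->value);
    buf->value = NULL;
    buf->length = 0;
    return 0;   /* GSS_S_COMPLETE */
}

static const struct gss_mech_ops examplemech_ops = {
    "examplemech",
    examplemech_release_buffer
};

/* The only symbol mechglue would need to find with dlsym(). */
const struct gss_mech_ops *examplemech_get_ops(void)
{
    return &examplemech_ops;
}
```

With this shape the module exports one uniquely named symbol, at the
cost that an application can no longer link directly against the
module as if it were a GSS-API library.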

==============================

Problems with explicitly linking a mechanism with mechglue:

If one of the plug-in modules is a pseudo-mechanism such as SPNEGO,
that pseudo-mechanism probably needs to make GSS-API calls to the
mechglue layer.  It could depend explicitly on the mechglue library
and make normal calls to the GSS-API entry points, but doing so has
some risks.

Consider an application that links against a specific (non-mechglue)
mechanism library.  The specific mechanism library exports the
GSS-API.  The application loads another library which depends on the
mechglue library.  At some point, the library which depends on the
mechglue library calls into the SPNEGO implementation by way of the
mechglue library.  The SPNEGO library calls a GSS-API entry point,
expecting to get the mechglue library, but instead gets the specific
mechanism which the application explicitly depends on.

One way to circumvent this problem is to have the SPNEGO library
dlopen() the mechglue library and call the mechglue through function
pointers, but that option seems distasteful.

When using this "GSS-API as SPI" approach, symbol-resolution issues
arise, particularly when stackable mechanisms (or any mechanism that
needs to call another mechanism by using the GSS-API) are involved.

If a mechanism needs to call another mechanism by using the GSS-API,
and the calling mechanism itself exports the GSS-API, having the
calling mechanism declare the called mechanism library as a dependency
results in conflicting symbol definitions.  Whether the calling
library ends up calling code in itself or in its dependent mechanism
library for a given GSS-API entry point then depends on a number of
different factors.

I believe most linkers will take the first definition of a given
symbol when building shared libraries, which means that if a
pseudo-mechanism links against a real mechanism library, it generally
cannot call a GSS-API entry point in the real mechanism, as it will
get its own definition of the entry point instead.  If the
pseudo-mechanism itself has been loaded at runtime into a program
linked against yet another GSS-API library, the pseudo-mechanism's
calls to GSS-API entry points might even end up calling the GSS-API
library originally linked into the program.

==============================

General problems with multi-mechanism implementations:

As an additional example of some of the sorts of problems we have to
solve when dealing with a multi-mechanism implementation, consider the
problem of gss_release_buffer():

gss_buffer_desc doesn't have any way to identify which mechanism may
have allocated the memory to which it points.  In a multi-mechanism
implementation, this means either forcing all mechanisms to use the
same memory allocator (intractable in various situations, including
Windows DLLs, from what I understand) or somehow registering which
memory blocks belong to which mechanisms so that mechglue can call the
correct gss_release_buffer().

If mechglue registers pointers, things get even more complicated when
a pseudo-mechanism such as SPNEGO calls an actual mechanism through
mechglue and gets a gss_buffer_desc filled in by that actual
mechanism, then passes that value back up through the mechglue
instance that called the pseudo-mechanism.  It's probable that both
mechglue instances actually occupy the same process address space, and
are effectively indistinguishable.  This means that the mechglue may
have to register a block of memory multiple times and keep a reference
count for each block.
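
A minimal sketch of such a registry follows, assuming a small fixed
table and an integer identifier per mechanism.  The names reg_add and
reg_release and the mech_id scheme are hypothetical, and the sketch
is not thread-safe.

```c
#include <stddef.h>

#define REG_MAX 64   /* arbitrary capacity for the sketch */

struct reg_entry {
    void *ptr;      /* memory a gss_buffer_desc value points to */
    int mech_id;    /* hypothetical identifier of the owning mechanism */
    int refs;       /* number of registrations still outstanding */
};

static struct reg_entry reg[REG_MAX];

static struct reg_entry *reg_find(void *ptr)
{
    size_t i;

    for (i = 0; i < REG_MAX; i++)
        if (reg[i].refs > 0 && reg[i].ptr == ptr)
            return &reg[i];
    return NULL;
}

/*
 * Record that ptr was allocated by mech_id.  Registering the same
 * pointer again (e.g. as it passes back up through a second mechglue
 * layer) adds a reference and keeps the original owner.
 * Returns 0 on success, -1 if the table is full.
 */
int reg_add(void *ptr, int mech_id)
{
    struct reg_entry *e = reg_find(ptr);
    size_t i;

    if (e != NULL) {
        e->refs++;
        return 0;
    }
    for (i = 0; i < REG_MAX; i++) {
        if (reg[i].refs == 0) {
            reg[i].ptr = ptr;
            reg[i].mech_id = mech_id;
            reg[i].refs = 1;
            return 0;
        }
    }
    return -1;
}

/*
 * Drop one reference to ptr.  Returns 1 and sets *mech_id when the
 * last reference is gone, meaning the caller should now invoke that
 * mechanism's gss_release_buffer(); returns 0 if references remain,
 * or -1 if ptr was never registered.
 */
int reg_release(void *ptr, int *mech_id)
{
    struct reg_entry *e = reg_find(ptr);

    if (e == NULL)
        return -1;
    if (--e->refs > 0)
        return 0;
    *mech_id = e->mech_id;
    return 1;
}
```

Only the final release reaches the mechanism that allocated the
memory; earlier releases by intermediate layers just decrement the
count.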

There are also related problems with OID sets and minor status codes,
among other things.


