Adding external library dependencies to .pc files for static linking

Ken Hornstein kenh at cmf.nrl.navy.mil
Thu Feb 29 13:04:44 EST 2024


>I have been building it static for years, so yes I would be very interested
>to hear about your weird and sneaky problem details!

Here are the details:

- Kerberos itself was built with OpenSSL for crypto operations, so it had to
  link with OpenSSL.
- In our build at the time OpenSSL was also built statically.
- We use pkinit; pkinit is only available via a plugin, and the plugin
  links against both the Kerberos libraries and the OpenSSL libraries.

The problem came when PKINIT was invoked (e.g., by kinit).  When the
PKINIT plugin was loaded you ended up with two copies of the OpenSSL
libraries in the running process (you also, as far as I can tell, ended
up with two copies of the Kerberos libraries in the running process, but
this didn't cause any observable problems).  And it turns out in this situation the
"other" version of OpenSSL ended up being invoked roughly half of the
time; it seemed to happen randomly.  I wouldn't have believed it myself
if I hadn't spent several hours running gdb on this.  But this also
happened INSIDE of OpenSSL; that is, code running in one copy of the
OpenSSL library would sometimes access variables/functions in the
"other" OpenSSL copy.  The ultimate cause of the crash was a check to
see whether some pointer had been initialized: the code path read the
flag variable in library copy 'A', where it was marked as initialized,
but then accessed the pointer in library copy 'B', where it was still
NULL, and crashed (I forget exactly which variable it was).
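
To make the failure mode concrete, here is a minimal single-file model
of it in C.  This is only a sketch, not the actual OpenSSL code and not
a real dlopen() reproduction: struct lib_state, copy_a, copy_b and the
function names are all hypothetical, and the "wrong copy" binding is
simulated by passing the two copies explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for one copy of a library's private globals. */
struct lib_state {
    int initialized;   /* flag: has ctx been set up? */
    const char *ctx;   /* pointer that the flag is supposed to guard */
};

/* Copy 'A': linked into the main executable and initialized. */
static struct lib_state copy_a = { 1, "context owned by copy A" };

/* Copy 'B': linked into the plugin and never initialized. */
static struct lib_state copy_b = { 0, NULL };

/*
 * Model of a library routine whose two references to "its" globals can
 * bind to different copies: the flag is read from flag_copy, the
 * pointer from ptr_copy.
 */
const char *lib_get_ctx(const struct lib_state *flag_copy,
                        const struct lib_state *ptr_copy)
{
    assert(flag_copy != NULL && ptr_copy != NULL);
    if (!flag_copy->initialized)
        return "not initialized";
    return ptr_copy->ctx;   /* NULL when the copies are mixed */
}

/* Both references bind to copy 'A': the guard protects the pointer. */
const char *consistent_lookup(void)
{
    return lib_get_ctx(&copy_a, &copy_a);
}

/* Flag from copy 'A', pointer from copy 'B': the guard passes but the
 * pointer is NULL, so a caller that dereferences it would crash. */
const char *mixed_lookup(void)
{
    return lib_get_ctx(&copy_a, &copy_b);
}

#ifdef DEMO_MAIN
int main(void)
{
    const char *p = mixed_lookup();
    printf("consistent: %s\n", consistent_lookup());
    printf("mixed: %s\n", p ? p : "(NULL pointer)");
    return 0;
}
#endif
```

Building with -DDEMO_MAIN gives a small program that prints both
results.  The point of the model is that the "is it initialized?" check
is only meaningful if the flag and the pointer come from the same copy
of the library.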

The surprising thing was that this worked for years, and it only broke
after a very minor bugfix in unrelated Kerberos code.  But after I dug
into this and did some research I realized that what is supposed to
happen in this situation is not really specified, as far as I can tell;
what IS supposed to happen, exactly, when you dlopen() a plugin that is
statically linked against the same library that is statically linked
into the main executable?  I realized that this
was a time bomb waiting to happen and that the only reasonable solution
was to go dynamic for everything.  Yes, OpenSSL uncovered this issue,
but there was no guarantee the same thing wouldn't happen with the
Kerberos libraries in the future, and there is some core Kerberos
functionality that is only available in plugins now.

This all took place on CentOS 7.  I am aware there are some linker
flags that may have resolved this, but finding them would have required
some experimentation, we were under a time crunch, and they wouldn't
have been portable.  This experience also made me reassess the whole
idea of statically building Kerberos; I realized the reasons for doing
static builds weren't so solid anyway and dynamic was really the way it
was supposed to work.

So yes, it may work fine; it did for us for a while.  Until it didn't.

--Ken

