[krbdev.mit.edu #2980] Windows XP SP2 and trouble with free library krb5_32.dll

bapruski@iastate.edu via RT rt-comment at krbdev.mit.edu
Fri Mar 25 16:29:07 EST 2005


Hello,

We are experiencing a problem on Windows XP SP2 machines that use a 
custom-written Kerberos service (with an NPLogonNotify entry point) to 
obtain Kerberos tickets from an MIT Kerberos KDC at login time. What we see 
is that (some) machines with Windows XP SP2 exhibit a very long pause right 
after the login prompt on the first login after a reboot. The machine shows 
a blue screen and nothing happens. The only way to get out of this state is 
via Ctrl-Alt-Del. After that the desktop shows up and everything seems to 
work just fine. After a logoff, the next login is performed without the 
pause. Only the first login after the reboot is problematic.
We contacted Microsoft and opened a trouble ticket. We enabled a lot of 
debugging statements in the code of our Kerberos service (Network 
Provider) and learned that the problem lies in the call to FreeLibrary, 
where we free the library krb5_32.dll. We sent our report (see below, 
including the attachments) to Microsoft and got back their response 
(below), in which they say that the problem is in krb5_32.dll, which calls 
WSACleanup from DllMain.

Your help in resolving this problem would be greatly appreciated. Thanks.

Beata Pruski


Here is the report of our tests written by my colleague (I am attaching the 
logs). Microsoft's response follows.


[...]
>We have spent quite a bit of additional time trying to diagnose this 
>problem since our last email exchange.  I have been working with the 
>developer of the custom Kerberos service (with an NPLogonNotify entry 
>point) that we have been using.  This service is used to obtain Kerberos 
>tickets from an MIT Kerberos KDC at login time.
>
>I am going to back off my statement of "I do NOT feel it relates directly 
>to the kerberos libraries we have installed ..." because I believe we have 
>proven the pause does originate within the custom Kerberos service (with 
>an NPLogonNotify entry point) that we have been using.  HOWEVER, I feel we 
>have a strong case that it is not our custom code but rather some strange 
>interaction of the Windows XP SP2 firewall with standard Windows system 
>calls (specifically the "FreeLibrary" call used to decrement the use count 
>(and eventually free) routines used in DLLs).
>
>In a nutshell, we believe the Windows XP SP2 firewall causes a call to 
>"FreeLibrary" to "lock up" in our case.  We hope we have tested and 
>documented this belief with this email.  Here is our "case" ...
>
>By working with the developer and having her add several "debug 
>logging" statements to her code, we gradually worked our way to a small 
>section of code near the end of the NPLogonNotify routine that caused the 
>"lock up".  Drilling down statement by statement we finally isolated the problem 
>to a call to "FreeLibrary" (a standard Windows call to free DLLs) that 
>never returned (hence the "long pause").  When we disabled the Windows XP 
>firewall (with no modifications to the code), the call to "FreeLibrary" 
>returned normally.  When we commented out the call to the routine that did 
>the "FreeLibrary" our code continued normally whether the firewall was 
>enabled or not.
>
>I have included several attachments to this email.  They are:
>
>1) Custom NPLogonNotify.txt -- Relevant source code that produced the test 
>output
>2) autologon.1.log -- Test output from test #1
>3) autologon.2.log -- Test output from test #2
>4) autologon.3.log -- Test output from test #3
>
>The tests were performed with two versions of our custom NPLogonNotify 
>routine, one which attempts to free a routine library before exiting 
>(this is the one that locks up) and one that had the "FreeLibrary" call 
>commented out.  Specifically, the tests are as follows:
>
>Test 1:  *** This case LOCKS UP ***
>Windows XP Firewall ("Windows Firewall/Internet Connection Sharing (ICS)") 
>service is ENABLED.
>Using our NORMAL custom Kerberos Services with NPLogonNotify entry point.
>This version does "LoadLibrary" at start and "FreeLibrary" at end.
>(See "Custom NPLogonNotify.txt" for relevant code snippets)
>
>Test 2:  *** This case DOES NOT lock up ***
>Windows XP Firewall ("Windows Firewall/Internet Connection Sharing (ICS)") 
>service is DISABLED.
>Using our NORMAL custom Kerberos Services with NPLogonNotify entry point.
>This version does "LoadLibrary" at start and "FreeLibrary" at end.
>(See "Custom NPLogonNotify.txt" for relevant code snippets)
>
>Test 3:  *** This case DOES NOT lock up ***
>Windows XP Firewall ("Windows Firewall/Internet Connection Sharing (ICS)") 
>service is ENABLED.
>Using our normal custom Kerberos Services with NPLogonNotify entry point.
>This version does "LoadLibrary" at start BUT NO "FreeLibrary" AT END (call 
>commented out).
>(See "Custom NPLogonNotify.txt" for relevant code snippets)
>
>You should be able to relate the "LoadLibrary" and "FreeLibrary" debug log 
>file statements to the source code that generated them in the "Custom 
>NPLogonNotify.txt" file.  Our current thinking is that if the 
>system cleans up the "use count" for the DLL we may be able to get by 
>without doing the "FreeLibrary" call ourselves.  As you know, however, it 
>is proper programming practice to do a "FreeLibrary" during cleanup before 
>exiting.
>
>To us, this certainly seems to indicate some interaction problem between 
>the Windows XP SP2 firewall and the "FreeLibrary" function.
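
The "debug logging" technique described in the report above (log a line 
before and after each call, then see which "after" line never appears) can 
be sketched roughly as follows. This is only an illustration in C and is 
not the code from "Custom NPLogonNotify.txt"; the names logged_step, 
step_fn, sample_step and free_library_step are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical callback type: one step of the logon routine. */
typedef int (*step_fn)(void *arg);

/*
 * Run one step with a log line before and after it. Each line is
 * flushed from the stdio buffer right away, so the "before" line is
 * already written if the step never returns.
 */
int logged_step(FILE *log, const char *label, step_fn step, void *arg)
{
    int rc;

    assert(log != NULL && step != NULL);

    fprintf(log, "before %s\n", label);
    fflush(log);

    rc = step(arg);

    fprintf(log, "after %s (rc=%d)\n", label, rc);
    fflush(log);
    return rc;
}

/* Trivial stand-in step: bumps a counter and reports success. */
int sample_step(void *arg)
{
    int *counter = arg;
    ++*counter;
    return 0;
}

#ifdef _WIN32
/* On Windows the suspect step would wrap the real FreeLibrary call. */
int free_library_step(void *arg)
{
    return FreeLibrary((HMODULE)arg) ? 0 : -1;
}
#endif
```

With a wrapper like this, a log that ends in "before FreeLibrary" with no 
matching "after" line points at the call that did not return, which is the 
pattern the three tests above rely on.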


The response from Microsoft:
[...]

>The FreeLibrary() API is hanging because krb5_32.dll's DllMain is 
>calling the WSACleanup() API in WS2_32.dll, which is not safe as per 
>Microsoft's DllMain specification.
>WSACleanup ultimately calls ICF, which synchronizes with the worker 
>thread.
>
>WSACleanup is not safe to call from DllMain.
>
>There are two threads in npnotify.exe
>
>The thread which is calling FreeLibrary holds the loader lock and is 
>waiting for the worker thread created by ICF to exit.
>The ICF worker thread is waiting for the loader lock.
>
>Essentially, a deadlock occurred in npnotify.exe.
>
>The root cause comes from krb5_32.dll!DllMain calling WSACleanup which is 
>not safe to call from DllMain. WSACleanup is internally doing synchronization.
>
>This is not allowed as per DllMain specification.
>
>Because DLL notifications are serialized, entry-point functions should not 
>attempt to communicate with other threads or processes.
>Deadlocks may occur as a result.
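
One common way to avoid this kind of deadlock is to move socket setup and 
teardown out of DllMain and into explicit functions that the application 
calls after LoadLibrary and before FreeLibrary, when the loader lock is not 
held. The following is a minimal sketch under that assumption. It is not 
the actual krb5_32.dll code and not a fix confirmed by Microsoft or MIT; 
netlib_init, netlib_fini and the platform_net_* helpers are hypothetical 
names, and the use counter is not thread-safe as written.

```c
#include <assert.h>

#ifdef _WIN32
#include <winsock2.h>   /* link with ws2_32.lib */
#include <windows.h>
#endif

/* Number of callers that have initialized the library (hypothetical). */
static int net_users = 0;

/* Platform socket setup; returns 0 on success. */
static int platform_net_start(void)
{
#ifdef _WIN32
    WSADATA wsa;
    return WSAStartup(MAKEWORD(2, 2), &wsa);
#else
    return 0;   /* nothing to do off Windows */
#endif
}

/* Platform socket teardown. */
static void platform_net_stop(void)
{
#ifdef _WIN32
    WSACleanup();
#endif
}

/* Call after LoadLibrary. Returns 0 on success, -1 on failure. */
int netlib_init(void)
{
    assert(net_users >= 0);
    if (net_users == 0 && platform_net_start() != 0)
        return -1;
    net_users++;
    return 0;
}

/*
 * Call before FreeLibrary. Returns the number of remaining users,
 * or -1 if the library was not initialized.
 */
int netlib_fini(void)
{
    if (net_users == 0)
        return -1;
    if (--net_users == 0)
        platform_net_stop();
    return net_users;
}

#ifdef _WIN32
BOOL WINAPI DllMain(HINSTANCE inst, DWORD reason, LPVOID reserved)
{
    (void)inst;
    (void)reserved;
    if (reason == DLL_PROCESS_DETACH) {
        /* Deliberately no WSACleanup here: the loader lock is held. */
    }
    return TRUE;
}
#endif
```

The design choice is that the DLL entry point stays trivial, and the 
caller pairs netlib_init with netlib_fini around its own LoadLibrary and 
FreeLibrary calls, so WSACleanup never runs while DLL notifications are 
being serialized.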

--------
Beata A. Pruski,    Systems Software & Microcomputer Network Services
Iowa State University, Academic Information Technologies, Ames, Iowa (USA)
268 Durham Center               Ph: 515-294-5919
Ames, IA 50011-2251             E-mail: bapruski at iastate.edu 

