Concurrency issue in replay cache / GSS-API Error: "Permission denied in replay cache code"

Barbat, Calin c.barbat at osram.de
Mon Mar 7 05:01:44 EST 2005


Dear friends,

I implemented a SAP R/3 SNC scenario based on MIT Kerberos 5 1.3.5. It ran without any problems for two months, but today I am seeking your help and advice.

When many work processes are active (an application server with about 40 to 60 work processes), I get the following message in the dev_w<n> trace file of some work processes:

N GSS-API(maj): Miscellaneous failure
N GSS-API(min): Permission denied in replay cache code

Have you experienced this problem/situation before? If yes, what is the solution/workaround you came up with?

Many thanks and kind regards,

Calin Barbat.
--------------------------------------------------------
PS: If you are interested in how I got here, some background information follows:

Sam Hartman, a Kerberos developer at MIT, replied to one of my messages on the kerberos mailing list (see http://mailman.mit.edu/pipermail/kerberos/2004-September/006214.html):
----------------------- cut here -----------------------
"I know some people here at MIT are doing this successfully.  They do
not subscribe to this list.  I'm replying mainly to let you know that
it is possible and to wish you luck finding a consultant who can help
you set it up."
----------------------- cut here -----------------------

I asked him to put me in contact with them, but I got no answer. I therefore contacted Ronald E. Parker (R/3 admin at MIT), who recommended using about 25 work processes instead of 40 as a solution to my problem. Some colleagues of mine will try this.

Excerpt from the trace file of the work process follows:
----------------------- cut here -----------------------
N File "/usr/local/lib/snckrb5.so" dynamically loaded as SNC-Adapter.
N The Adapter identifies as:
N External SNC-Adapter (Rev 1.0) to Kerberos 5/GSS-API v2
N SncInit(): found snc/identity/as=p:service at DOMAIN.COM
N*** ERROR => SncPAcquireCred()==SNCERR_GSSAPI [sncxxall.c 1216]
N GSS-API(maj): Miscellaneous failure
N GSS-API(min): Permission denied in replay cache code
N Could't acquire ACCEPTING credentials for
N
N name="p:service at DOMAIN.COM"
N SncInit(): Fatal -- Accepting Credentials not available!
N <<- SncInit()==SNCERR_GSSAPI
N sec_avail = "false"
M ***LOG R19=> ThSncInit, SncInit ( SNC-000004) [thxxsnc.c 219]
M *** ERROR => ThSncInit: SncInit (SNCERR_GSSAPI) [thxxsnc.c 221]
M in_ThErrHandle: 1
M *** ERROR => SncInit (step 1, th_errno 44, action 3) [thxxhead.c 8015]
----------------------- cut here -----------------------

I searched the Net for "Permission denied in replay cache code" and found an interesting exchange with Sam Hartman at http://mailman.mit.edu/pipermail/krbdev/2004-April/002630.html:
----------------------- cut here -----------------------
    Daniel> Hi, is the replay cache mechanism supposed to work with
    Daniel> multiple processes accessing the same rc_* file at the
    Daniel> same time? 

    Sam> Yes.

    Daniel> I've run into problems working on the kerberos
    Daniel> module for apache (modauthkerb.sf.net), which verifies the
    Daniel> user's password against KDC. 

    Sam> However it doesn't actually work.  This is a known problem.

    Sam> For Kerberos 1.3.x I'd recommend creating some sort of application
    Sam> level lock if possible around calls to gss_accept_sec_context and
    Sam> krb5_verify_init_creds.
----------------------- cut here -----------------------
As I understand it, Sam is saying that the application (R/3 in my case) would have to take appropriate locks around the GSS-API calls to avoid problems with concurrent access from the various work processes. The rcache code in MIT Kerberos 1.4 seems to have been untouched for over two years now, and rc_io.c doesn't seem to deal with concurrency at all.
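
To make the idea of an application-level lock concrete, here is a minimal sketch of serializing a call across separate processes with an advisory file lock. It is written in Python only for brevity; a real SNC adapter would do the equivalent in C. The lock file path and the names interprocess_lock and serialized are hypothetical, and the wrapped call merely stands in for something like gss_accept_sec_context. It assumes all work processes run under the same OS user and that nothing else uses the lock file.

```python
import fcntl
import os
from contextlib import contextmanager

# Hypothetical lock file shared by all work processes on one host.
LOCK_PATH = "/tmp/snc_gss.lock"


@contextmanager
def interprocess_lock(path=LOCK_PATH):
    """Hold an exclusive advisory lock on `path` for the duration of the block."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Blocks until no other process holds the lock.
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield
    finally:
        # Closing the descriptor also releases the lock.
        os.close(fd)


def serialized(call, *args, lock_path=LOCK_PATH, **kwargs):
    """Run `call` while holding the lock, so only one process is inside it at a time."""
    with interprocess_lock(lock_path):
        return call(*args, **kwargs)
```

A caller would wrap each call that touches the replay cache, for example serialized(accept_context, token), where accept_context is whatever function performs the GSS-API call. The obvious cost is that all work processes queue up behind one lock.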

Next, I opened a call to SAP to ask about their opinion on this topic. Martin Rex replied:
----------------------- cut here -----------------------
Dear Calin, 

From what I remember, one of the reasons for "Permission denied in 
replay cache code" happens if there's a file access permission problem 
with the file-based replay cache of MIT Kerberos 5, either when trying 
to read the replay cache, when trying to write/update it, or when 
trying to "clean" it (by writing a new file with only the non-expired 
cache entries and replacing the existing file, relying on the atomic 
operation of the Posix "rename" syscall to not cause interference 
for parallel access). 

There seems to be a slight misunderstanding about the "locking" 
performed by SNC in the SAP work processes. The only locking that 
SNC does is specifically for operation within multi-threaded 
processes, because the use of GSS-API is undefined for multi-threading. 
However the work processes of an SAP Application server have always 
been single-threaded and therefore SNC doesn't do any locking there. 

The synchronization/serialization of access to those shared resources 
that are being used inside a particular gssapi mechanism, such as 
MIT Kerberos 5, is an implementation detail of that mechanism, and 
for use with SAP R/3, it must be implemented by a gssapi mechanism 
in a fashion that is safe to use in a multi-process server. 

If you don't see a file permissions problem with the replay cache 
file, then you probably need to ask for help in a Kerberos forum. 

As I indicated in our Email exchange in august/september last year, 
it has been quite a while that I've played around with MIT Kerberos 5, 
and file permission (actually file owner) problems of the replay cache 
file were the only error that I encountered during my (admittedly 
non-representative) tests in 1996. 

Regards, 
-Martin Rex 
Development Support, R/3 and SAP WebAS Security 
----------------------- cut here -----------------------
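
The "clean" step Martin describes (write a new file with only the non-expired entries, then rename it over the old one) follows a general pattern that can be sketched as below. This is not the MIT rcache code; the function name rewrite_atomically and the treatment of entries as raw byte strings are assumptions made for illustration.

```python
import os
import tempfile


def rewrite_atomically(path, live_entries):
    """Replace `path` with a file containing only `live_entries` (byte strings)."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must be in the same directory (same filesystem)
    # so that the final rename is atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".rc_tmp_")
    try:
        with os.fdopen(fd, "wb") as tmp:
            for entry in live_entries:
                tmp.write(entry)
            tmp.flush()
            os.fsync(tmp.fileno())
        # os.replace uses rename(2) on POSIX: readers see either the old
        # file or the new one, never a partially written file.
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave the temporary file behind on failure.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Note that after such a replacement the file belongs to whichever user created the temporary file, so an owner mismatch can appear if the processes sharing the cache run under different accounts. The rename itself also does nothing to stop two processes from rewriting at the same moment, in which case one set of updates is lost.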
Now it seems to me that R/3 leaves the locking to Kerberos and vice versa, with neither of them implementing it.

My personal feeling is that this change needs to be made in Kerberos, because then all applications based on the Kerberos GSS-API would benefit. Besides, a library should, in my opinion, allow concurrent calls.

Many thanks again,

Calin Barbat.



