Regarding Issues with Memory Credential Cache
Ezra Peisach
epeisach at MIT.EDU
Tue Aug 19 21:49:51 EDT 2008
Datar, Ashutosh Anil wrote:
> Hi,
>
> I was testing the Apache Web Server (which uses mod_auth_kerb) with the Kerberos client 1.6.2 and found some issues with the memory cache handling.
>
> According to my understanding, the memory cache is arranged as a linked list, each node of which represents a cache and provides a pointer to another linked list (a list of credentials), each node of which represents a credential from that particular cache.
> (In the source code, these nodes are named krb5_mcc_list_node and krb5_mcc_cursor respectively.)
>
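For reference, the layout being described looks roughly like this - a
simplified sketch of the structures in cc_memory.c, with field names
approximated and the internal lock type replaced by a plain pthread mutex
rather than copied verbatim from the source:

    /* One credential node; the cursor type is a pointer into this list. */
    typedef struct _krb5_mcc_link {
        struct _krb5_mcc_link *next;
        krb5_creds *creds;
    } krb5_mcc_link, *krb5_mcc_cursor;

    /* Per-cache data: name, client principal, and the credential list. */
    typedef struct _krb5_mcc_data {
        char *name;
        pthread_mutex_t lock;     /* sketch; the real code uses the k5 mutex wrappers */
        krb5_principal prin;
        krb5_mcc_cursor link;     /* head of the credential list */
    } krb5_mcc_data;

    /* Global list of memory caches, one node per cache. */
    typedef struct krb5_mcc_list_node {
        struct krb5_mcc_list_node *next;
        krb5_mcc_data *cache;
    } krb5_mcc_list_node;
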
> Now, as an already filed ticket (#5895: http://krbdev.mit.edu/rt/Ticket/Display.html?user=guest&pass=guest&id=5895) indicates, we need to place locks in krb5_mcc_initialize () and krb5_mcc_destroy () so that krb5_mcc_free doesn't get called unprotected, causing an application to receive SIGSEGV.
>
>
I will look at this. I have been playing with the ccache code recently...
> But, as mentioned in the ticket itself, this alone will not ensure safe access, as a thread can still free a krb5_mcc_list_node while another is still accessing it. Thus some kind of reference-count mechanism is required to ensure that a krb5_mcc_list_node is freed only when its refcount is zero (no one else is accessing it). This is already implemented in the file cache handling.
>
>
Should be easy to implement.
> While testing, I could still observe the application receiving SIGSEGV in a few other places; after analyzing this, the following are my observations:
>
>
> 1. The reference-count mechanism, once implemented, only ensures safe access to krb5_mcc_list_nodes; when two threads use the same cache (e.g. two threads of the same application), they are simultaneously accessing krb5_mcc_cursors, whose traversal is not completely protected by locks.
> 2. Though krb5_mcc_remove_cred () is not supported, a thread can still call krb5_mcc_initialize () while another thread is still using those credentials in krb5_mcc_next_cred (), which does not operate under any kind of lock.
> 3. krb5_mcc_get_principal () also tries to access krb5_mcc_list_node data without acquiring any lock on it.
>
>
I think point (2) is an interesting issue... First of all, I think
initializing a cache out from underneath a different thread is not really
sociable - but consider the file cache. There, the fcc_initialize function
will unlink the old cache file and open a new one. A cache already open in,
say, another thread is then screwed - unless the OPENCLOSE flag is not set,
in which case the O/S will still give access to the open (now unlinked)
file - but that is not supported. Even worse, if the first thread
initializes the cache and writes a bunch of data to it before the other
thread iterates to the next entry, the position in the file may no longer
be valid.
At best you are facing an undefined state.
So, to be consistent, I would say that if you initialize a cache while
another thread is iterating through it, the other thread is going to be
screwed - but it should fail in a reliable way, without crashing the
application.
An examination of the Heimdal code shows that initialize will probably
screw you over as well. I think you might still have access to the old
entries, as they are not cleared. They do implement a remove function.
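To make that scenario concrete, here is a rough sketch of the race using
only the public API (error handling trimmed; the cache name and principal
are placeholders, and in a real reproduction the cache would already hold
a number of credentials). Two threads resolve the same MEMORY: cache; one
iterates while the other re-initializes it, so the iterator's cursor can
end up pointing at freed credential nodes:

    #include <krb5.h>
    #include <pthread.h>
    #include <stddef.h>

    static const char *cache_name = "MEMORY:shared_example";  /* placeholder */

    /* Thread A: walk the credential list of the shared memory cache. */
    static void *iterate(void *arg)
    {
        krb5_context ctx;
        krb5_ccache cc;
        krb5_cc_cursor cur;
        krb5_creds cred;

        (void)arg;
        if (krb5_init_context(&ctx) != 0)
            return NULL;
        if (krb5_cc_resolve(ctx, cache_name, &cc) == 0) {
            if (krb5_cc_start_seq_get(ctx, cc, &cur) == 0) {
                /* Each next_cred call follows the internal cursor; nothing
                 * stops another thread from freeing the list underneath us. */
                while (krb5_cc_next_cred(ctx, cc, &cur, &cred) == 0)
                    krb5_free_cred_contents(ctx, &cred);
                krb5_cc_end_seq_get(ctx, cc, &cur);
            }
            krb5_cc_close(ctx, cc);
        }
        krb5_free_context(ctx);
        return NULL;
    }

    /* Thread B: re-initialize the same cache, which frees its credential
     * list out from under any iterator. */
    static void *reinitialize(void *arg)
    {
        krb5_context ctx;
        krb5_ccache cc;
        krb5_principal princ;

        (void)arg;
        if (krb5_init_context(&ctx) != 0)
            return NULL;
        if (krb5_cc_resolve(ctx, cache_name, &cc) == 0) {
            if (krb5_parse_name(ctx, "user@EXAMPLE.COM", &princ) == 0) {
                krb5_cc_initialize(ctx, cc, princ);
                krb5_free_principal(ctx, princ);
            }
            krb5_cc_close(ctx, cc);
        }
        krb5_free_context(ctx);
        return NULL;
    }

    int main(void)
    {
        pthread_t a, b;

        pthread_create(&a, NULL, iterate, NULL);
        pthread_create(&b, NULL, reinitialize, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        return 0;
    }
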
> As for point number 2 above, the following are the possible approaches that I see as solutions:
>
> 1. When a thread enters krb5_mcc_start_seq_get () to start traversing the krb5_mcc_cursor list, it acquires a lock, which is relinquished in krb5_mcc_end_seq_get (), so as to ensure that krb5_mcc_initialize () doesn't free the list while it is being traversed. But this approach requires locking/unlocking across different functions, and I am not sure whether it is the right way.
> 2. Make sure that krb5_mcc_initialize () is called only once for a particular krb5_mcc_list_node. But there can be situations where an application explicitly wants to refresh (flush) the cache, and this restricts it from doing so.
>
> Can someone let me know which is the better option, or whether something else can be implemented that removes both of the problems mentioned?
>
>
My solution for you would be to implement the remove function. An
alternative would be to copy the cache contents into a new cache, skipping
the entry you want to delete, then tell all threads to start using the new
cache, and then destroy the old one.
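As a rough sketch of that copy-and-skip approach, using only the public
API (the new cache name and the match-by-server-principal rule are my own
illustrative choices, not anything the library mandates):

    #include <krb5.h>

    /* Build a new memory cache holding every credential from old_cc except
     * those whose server principal matches skip_server.  Callers would then
     * switch to *new_cc and destroy old_cc.  On failure the caller should
     * destroy *new_cc as well. */
    static krb5_error_code
    copy_cache_skipping(krb5_context ctx, krb5_ccache old_cc,
                        krb5_principal skip_server, krb5_ccache *new_cc)
    {
        krb5_error_code ret;
        krb5_principal client = NULL;
        krb5_cc_cursor cur;
        krb5_creds cred;

        ret = krb5_cc_resolve(ctx, "MEMORY:pruned", new_cc); /* illustrative name */
        if (ret)
            return ret;
        ret = krb5_cc_get_principal(ctx, old_cc, &client);
        if (ret == 0)
            ret = krb5_cc_initialize(ctx, *new_cc, client);
        if (ret)
            goto cleanup;

        ret = krb5_cc_start_seq_get(ctx, old_cc, &cur);
        if (ret)
            goto cleanup;
        while (krb5_cc_next_cred(ctx, old_cc, &cur, &cred) == 0) {
            /* Copy everything except the entry being "removed". */
            if (!krb5_principal_compare(ctx, cred.server, skip_server))
                ret = krb5_cc_store_cred(ctx, *new_cc, &cred);
            krb5_free_cred_contents(ctx, &cred);
            if (ret)
                break;
        }
        krb5_cc_end_seq_get(ctx, old_cc, &cur);

    cleanup:
        krb5_free_principal(ctx, client);
        return ret;
    }

Note that every thread still has to switch over to the new handle and stop
using the old one before it is destroyed, which is part of why a remove
function implemented inside the library, under the per-cache lock, would be
the cleaner fix.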
> Also, regarding the 3rd point in the list of observations, a lock needs to be acquired in krb5_mcc_get_principal () before accessing a node, so as to avoid accessing an already-freed location.
>
>
I will take a look at this...
> Thanks and Regards,
> Ashutosh
>
>
>
There is another screw case to consider which the file cache does not deal
with either... If you destroy a cache in one thread while accessing it in
another thread, bad things could happen. Heimdal handles the destroy case
in an interesting way (for a memory cache): it flags a cache that is in use
as dead and removes it from the global list. When the last thread holding
it open closes the cache, the refcount drops to zero and the cache is then
deleted.
Therefore two threads can use a shared cache, one thread can destroy it,
and the other thread is OK...
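In outline, that scheme looks something like the following. This is only a
sketch of the refcount-plus-dead-flag idea with invented names; it is not
the actual Heimdal (or MIT) code:

    #include <pthread.h>
    #include <stdlib.h>

    struct cred_node { struct cred_node *next; /* ... credential data ... */ };

    struct mem_cache {
        struct mem_cache *next;     /* global list linkage */
        pthread_mutex_t lock;
        int refcount;               /* one reference per open handle */
        int dead;                   /* set by destroy; read operations would
                                     * check this and fail cleanly */
        struct cred_node *creds;
    };

    static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;
    static struct mem_cache *global_list;

    static void free_creds(struct cred_node *n)
    {
        while (n != NULL) { struct cred_node *t = n->next; free(n); n = t; }
    }

    static void unlink_from_global(struct mem_cache *c)
    {
        pthread_mutex_lock(&global_lock);
        for (struct mem_cache **p = &global_list; *p != NULL; p = &(*p)->next) {
            if (*p == c) { *p = c->next; break; }
        }
        pthread_mutex_unlock(&global_lock);
    }

    /* destroy: mark dead and drop this handle's reference; the struct itself
     * is only freed once the last open handle goes away. */
    static void cache_destroy(struct mem_cache *c)
    {
        int last;

        unlink_from_global(c);      /* no new opens can resolve it */
        pthread_mutex_lock(&c->lock);
        free_creds(c->creds);       /* credentials go away immediately */
        c->creds = NULL;
        c->dead = 1;                /* surviving iterators see "dead", not garbage */
        last = (--c->refcount == 0);
        pthread_mutex_unlock(&c->lock);
        if (last)
            free(c);
    }

    /* close: just drop one reference. */
    static void cache_close(struct mem_cache *c)
    {
        int last;

        pthread_mutex_lock(&c->lock);
        last = (--c->refcount == 0);
        pthread_mutex_unlock(&c->lock);
        if (last)
            free(c);
    }
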
Anyway, I will look into the locking problems you mentioned and see about
implementing the remove function, which I believe will solve your
problems....
Ezra