[krbdev.mit.edu #8556] missing primary cache after kdestroy

Charles Hedrick via RT rt-comment at krbdev.mit.edu
Fri Mar 3 17:09:27 EST 2017


Submitter-Id: hedrick
Originator: Charles Hedrick
Organization: Rutgers University
Confidential: no
Synopsis: primary cache sometimes missing
Severity: critical
Priority: medium
Release: 1.14
Environment: CentOS 7

[I should note that I haven’t actually seen NFS fail as described below. Rather, I’ve been carefully testing how credential caches work before deploying Kerberized NFS.]

In Kerberized NFS on CentOS, GSS uses the primary credentials cache. If that cache becomes unusable, jobs can lose access to NFS.

The get_primary function in cc_keyring.c is intended to set up a primary cache when cc_resolve is called to find the default cache. Unfortunately, all it does is check whether the collection has designated a primary. It doesn’t check that the designated cache actually exists and is usable. If you kdestroy the primary cache (and, I think, also if the cache expires), the collection can be left with a primary pointing to a cache that no longer exists.

This means that even an auto-renewal program such as k5start may not be enough to ensure that a job always has NFS access. NFS doesn’t pay attention to KRB5CCNAME. It simply calls GSS with arguments that cause it to look for the default cache. If that cache doesn’t exist, GSS will fail, even though KRB5CCNAME for the job may point to a usable cache.

It may actually be safest not to use the keyring at all, but to use files in /tmp. gssd will try to use GSS to find a usable cache, but if that fails, it will search /tmp. Because /tmp doesn’t have a concept of primary, it checks all caches. That keeps multiple sessions from interfering with each other, and lets NFS work if any session has a valid cache.

I’d like to see get_primary made smarter, so that it checks whether the primary cache is real and, if not, sets the primary to something usable. You could try to fix this by making the destroy operation remove the primary pointer when it destroys the primary cache. But that won’t handle expirations, which happen in the kernel without activating any code in the Kerberos library.
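
To make the requested behavior concrete, here is a minimal sketch of the check in plain C. It is a toy model only, not the cc_keyring.c code and not the krb5 API: struct toy_cache, struct toy_collection, and get_primary_checked are hypothetical names invented for illustration, and the "exists" and "expires" fields stand in for whatever the keyring would actually report about a cache.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

#define MAX_CACHES 8

/* Hypothetical stand-in for one cache in a collection. */
struct toy_cache {
    const char *name;  /* illustrative cache name */
    int exists;        /* 0 once destroyed or removed on expiry */
    time_t expires;    /* end time of the credentials it holds */
};

/* Hypothetical stand-in for a collection with a primary pointer. */
struct toy_collection {
    struct toy_cache caches[MAX_CACHES];
    size_t ncaches;
    const char *primary;  /* name recorded as primary; may be stale */
};

/* A cache is usable if it still exists and has not expired. */
static int cache_usable(const struct toy_cache *c, time_t now)
{
    return c->exists && c->expires > now;
}

/* Look a cache up by name; returns NULL if the name is unknown. */
static struct toy_cache *find_cache(struct toy_collection *col,
                                    const char *name)
{
    if (name == NULL)
        return NULL;
    for (size_t i = 0; i < col->ncaches; i++) {
        if (strcmp(col->caches[i].name, name) == 0)
            return &col->caches[i];
    }
    return NULL;
}

/*
 * Return the primary cache if it is usable.  Otherwise repoint the
 * primary at the first usable cache in the collection and return
 * that.  Returns NULL (leaving the primary unchanged) if no cache
 * in the collection is usable.
 */
struct toy_cache *get_primary_checked(struct toy_collection *col,
                                      time_t now)
{
    assert(col != NULL);

    struct toy_cache *c = find_cache(col, col->primary);
    if (c != NULL && cache_usable(c, now))
        return c;

    for (size_t i = 0; i < col->ncaches; i++) {
        if (cache_usable(&col->caches[i], now)) {
            col->primary = col->caches[i].name;
            return &col->caches[i];
        }
    }
    return NULL;
}
```

Because the check runs at lookup time rather than at destroy time, the model covers both cases described above: a cache removed by kdestroy and a cache that disappeared through expiry without any library code running. Which replacement to pick when several caches are usable is a policy question this sketch does not try to answer; it simply takes the first one.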




