[krbdev.mit.edu #8556] missing primary cache after kdestroy
Charles Hedrick via RT
rt-comment at krbdev.mit.edu
Fri Mar 3 17:09:27 EST 2017
Submitter-Id: hedrick
Originator: Charles Hedrick
Organization: Rutgers University
Confidential: no
Synopsis: primary cache sometimes missing
Severity: critical
Priority: medium
Release: 1.14
Environment: CentOS 7
[I should note that I haven't actually seen NFS fail as described below. Rather, I've been carefully testing the way credentials caches work before deploying Kerberized NFS.]
In Kerberized NFS on CentOS, GSS uses the primary credentials cache. If that cache becomes unusable, jobs can lose access to NFS.
The get_primary function in cc_keyring.c is intended to set up a primary cache when cc_resolve is called to find the default cache. Unfortunately, all it does is check whether the collection has designated a primary. It doesn't check that the cache referred to actually exists and is usable. If you do a kdestroy of the primary cache (and, I think, also if the cache expires), the collection can be left with a primary pointing to a cache that no longer exists.
This means that even an auto-renewal program such as k5start may not be good enough to make sure that a job always has NFS access. NFS doesn't pay attention to KRB5CCNAME. It simply calls GSS with arguments that cause it to look for the default cache. If that cache doesn't exist, GSS will fail, even though KRB5CCNAME for the job may point to a usable cache.
It may actually be safest not to use the keyring at all, but to use files in /tmp. gssd will try to use GSS to find a usable cache, but if that fails, it will search /tmp. Because /tmp doesn't have a concept of primary, it checks all caches. That allows multiple sessions not to interfere with each other, and NFS to work if any session has a valid cache.
I'd like to see get_primary smartened so that it checks to see whether the primary cache is real, and if not, sets the primary to something usable. You could try to fix it by making the destroy operation remove the primary pointer if it's destroying the primary. But that won't handle expirations, which happen in the kernel without activating any code in the Kerberos library.
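As a rough illustration of the requested behavior, here is a minimal, self-contained C model. It does not use the real data structures or keyring calls from cc_keyring.c. Every name in it (toy_cache, toy_collection, toy_get_primary_checked) is hypothetical, and "usable" is simplified to "still exists and has not expired". The point is only that the check happens at lookup time, so it also covers caches that disappeared without any library code running.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

#define MAX_CACHES 8

/* Toy stand-in for one cache in a collection (hypothetical, not krb5's types). */
struct toy_cache {
    const char *name;  /* subsidiary cache name */
    int exists;        /* 0 once destroyed or reaped by the kernel */
    time_t expires;    /* expiry time of its credentials */
};

/* Toy stand-in for a cache collection with a primary pointer. */
struct toy_collection {
    struct toy_cache caches[MAX_CACHES];
    size_t ncaches;
    const char *primary;  /* name designated as primary, or NULL */
};

/* Look up a cache by name; returns NULL if the name is NULL or unknown. */
static struct toy_cache *find_cache(struct toy_collection *c, const char *name)
{
    if (name == NULL)
        return NULL;
    for (size_t i = 0; i < c->ncaches; i++) {
        if (strcmp(c->caches[i].name, name) == 0)
            return &c->caches[i];
    }
    return NULL;
}

/* Simplified usability test: the cache exists and has not yet expired. */
static int cache_usable(const struct toy_cache *cc, time_t now)
{
    return cc != NULL && cc->exists && cc->expires > now;
}

/*
 * Return the name of a usable primary cache. If the designated primary is
 * missing or expired, repoint the collection at the first usable cache.
 * If nothing is usable, clear the primary pointer and return NULL.
 */
const char *toy_get_primary_checked(struct toy_collection *c, time_t now)
{
    assert(c != NULL);

    /* Validate the designated primary instead of trusting the pointer. */
    if (cache_usable(find_cache(c, c->primary), now))
        return c->primary;

    /* The pointer is dangling: fall back to any other usable cache. */
    for (size_t i = 0; i < c->ncaches; i++) {
        if (cache_usable(&c->caches[i], now)) {
            c->primary = c->caches[i].name;
            return c->primary;
        }
    }

    c->primary = NULL;
    return NULL;
}
```

Clearing the pointer in the destroy path alone would not need the validation step, but as noted above it would miss expirations, which is why this sketch validates on every lookup.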