Fwd: Gss context refresh failure due to clock skew
William.Adamson at netapp.com
Thu Oct 1 17:30:21 EDT 2015
Bringing in the NFSv4 and Kerberos lists.
> Begin forwarded message:
> From: Andy Adamson <andros at netapp.com>
> Subject: Gss context refresh failure
> Date: October 1, 2015 at 4:39:06 PM EDT
> To: Bruce Fields <bfields at fieldses.org>
> Cc: Simo Sorce <simo at redhat.com>
> Hi Bruce, Simo
> We are seeing a failure on GSS context renewal between the Linux client and NetApp, so I tested Linux client to Linux server and saw a similar situation.
> The situation occurs as follows.
> 1) For simplicity, set the client and server to both have the same Kerberos clock skew - I leave the default 5 minutes.
> 2) For convenience, I set the TGS lifetimes to be as short as possible, 10 minutes for Win2008R2 AD which I test with.
> 3) Client clock is set to the KDC (WinAD) clock via NTP.
> 4) Set the server clock to be 2-3 minutes ahead of the client clock, but still within the Kerberos clock skew.
> 5) Mount NFS with -o sec=krb5 - this succeeds because clocks are within the Kerberos clock skew
> 6) On the client, run klist -ce on the machine creds to note the server TGS expiry time
> 7) On the server, wait until the client’s server TGS expiry time has just passed. Note that at this time, according to the client clock, the TGS is still valid for another 2-3 minutes, but on the server, the TGS will have expired.
> 8) On the client, make a new directory: mkdir /mntpoint/blah
> The client sends an NFS ACCESS (for the mkdir) call which gets back an AUTH_ERROR, GSS credential problem.
> This pokes the client to do an upcall to refresh the GSS context.
> 9) The client sends a NULL call RPCSEC_GSS_INIT, using the TGS in the credential cache (no call to KDC for new TGS) because the client clock says the TGS is still valid.
> 10) ONTAP: The server replies to the NULL call with a GSS minor status of GSS_S_CREDENTIAL_EXPIRED.
> 10) LINUX: The server does not reply to the NULL RPCSEC_GSS_INIT call; in fact, the Linux server sends a FIN to the client gssd connection.
> 11) The mkdir request gets permission denied
> 12) Wait until the client clock is past the server TGS expiry time
> 13) Retry the mkdir - it succeeds after a successful GSS INIT NULL call exchange for both servers.
> In the ONTAP case, the accept_sec_context call into the MIT libraries fails (even though krb5_check_clockskew() is apparently called). I believe the gss_proxy on Linux also just calls into the MIT libraries.
> Shouldn’t these refresh calls succeed? Isn’t the Kerberos clock skew supposed to handle this situation?
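
The timing window in steps 7-9 can be sketched as below. This is only an illustrative model, not the MIT library's or gssd's actual logic: the function names, the 3-minute server offset, and the two server-side policies (a strict expiry check versus one that tolerates the configured clock skew) are assumptions made up for the example.

```python
from datetime import datetime, timedelta

# Values taken from the steps above; the exact offset is an assumption (2-3 min).
CLOCK_SKEW = timedelta(minutes=5)      # default Kerberos clock skew (step 1)
SERVER_OFFSET = timedelta(minutes=3)   # server clock runs ahead of client (step 4)
TICKET_LIFETIME = timedelta(minutes=10)  # short service ticket lifetime (step 2)


def client_thinks_valid(client_now, end_time):
    """Client-side view: reuse the cached ticket while it has not expired."""
    return client_now < end_time


def server_accepts_strict(server_now, end_time):
    """Hypothetical strict policy: reject as soon as the server clock passes expiry."""
    return server_now < end_time


def server_accepts_with_skew(server_now, end_time, skew=CLOCK_SKEW):
    """Hypothetical tolerant policy: allow expiry to be overshot by up to `skew`."""
    return server_now - end_time <= skew


# Ticket issued at 16:00 client/KDC time, so it expires at 16:10.
issued = datetime(2015, 10, 1, 16, 0, 0)
end_time = issued + TICKET_LIFETIME

# Steps 7-8: client clock is 2 minutes before expiry, server is 3 minutes ahead.
client_now = end_time - timedelta(minutes=2)
server_now = client_now + SERVER_OFFSET

print("client reuses cached ticket:", client_thinks_valid(client_now, end_time))
print("strict server accepts:      ", server_accepts_strict(server_now, end_time))
print("skew-tolerant server accepts:", server_accepts_with_skew(server_now, end_time))
```

Under the strict policy the client keeps presenting a ticket that the server already treats as expired, which matches the failure seen in steps 10-11; under the skew-tolerant policy the same refresh would be accepted.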