Erratic behavior of full resync process

Leonard J. Peirce leonard.peirce+kerberos at wmich.edu
Thu Jun 4 15:24:18 EDT 2015


[[NOTE:  I originally posted this reply to comp.protocols.kerberos but it
   doesn't appear to have made it to the mailling list.  That'll teach me. :-)]]

I continue to have the same problems (unfortunately I was unable
to participate in the latest teleconference).  I have found that I
get the same results on CentOS 6.6 running in a VM or on actual hardware.
And since I'm getting the same results with the standard CentOS RPMs
or the latest release (1.13.2) built from source I've settled on
testing with the latest release on physical hardware.

While trying to isolate the problem I've encountered a scenario that
is somewhat strange.  The master and slave were in sync when I stopped
all daemons on both servers.  I thought that upon restarting them
(without loading the database with kdb5_util or making any updates
in kadmin) kpropd and kadmind would connect and quickly figure out
everything was in sync and everything would be nominal.  But kpropd
failed to initialize and connect for some 100 minutes.

Given that previously kpropd did connect and the slave had been
initialized why would just a restart of the daemons cause kpropd to
fail to initialize so many times and then, suddenly, succeed?

Below is syslog and trace information (from KRB5_TRACE) for both
kadmind and kpropd as well as a few comments.  A summary:

   13:50:38 -- kadmind starts on master
   13:51:39 -- kpropd starts on slave; for the next 1 hour 40 minutes kpropd repeatedly
               tries and fails to initialize until 15:30:19 when syslog entries stop
   15:30:57 -- kadmind complains in syslog about clock skew
   15:30:57 -- trace for kadmind is finally updated, showing failure, claiming it cannot
               find key for kadmin/admin at WMICH.EDU kvno 2 in keytab
   15:35:19 -- trace for kadmind finally shows success
   15:35:19 -- syslog for kadmind shows everything nominal

Stop here if you're not interested in the gory details of syslog and the trace
information.  My question is still the same:  is anyone else seeing anything
like this?  My next step is to try this on CentOS 7.

--
Leonard J. Peirce
Western Michigan University
Office of Information Technology
Kalamazoo, MI  49008


At 13:50:38 kadmind starts on master.  kadmind syslog at startup:

    Jun  3 13:50:38 master.dev.admin.private kadmind[15602]: setting up network...
    Jun  3 13:50:38 master.dev.admin.private kadmind[15602]: listening on fd 10: udp 0.0.0.0.464 (pktinfo)
    Jun  3 13:50:38 master.dev.admin.private kadmind[15602]: setsockopt(11,IPV6_V6ONLY,1) worked
    Jun  3 13:50:38 master.dev.admin.private kadmind[15602]: listening on fd 11: udp ::.464 (pktinfo)
    Jun  3 13:50:38 master.dev.admin.private kadmind[15602]: setsockopt(12,IPV6_V6ONLY,1) worked
    Jun  3 13:50:38 master.dev.admin.private kadmind[15602]: listening on fd 13: tcp 0.0.0.0.464
    Jun  3 13:50:38 master.dev.admin.private kadmind[15602]: listening on fd 12: tcp ::.464
    Jun  3 13:50:38 master.dev.admin.private kadmind[15602]: listening on fd 14: rpc 0.0.0.0.2088
    Jun  3 13:50:38 master.dev.admin.private kadmind[15602]: setsockopt(15,IPV6_V6ONLY,1) worked
    Jun  3 13:50:38 master.dev.admin.private kadmind[15602]: listening on fd 15: rpc ::.2088
    Jun  3 13:50:38 master.dev.admin.private kadmind[15602]: listening on fd 16: rpc 0.0.0.0.749
    Jun  3 13:50:38 master.dev.admin.private kadmind[15602]: setsockopt(17,IPV6_V6ONLY,1) worked
    Jun  3 13:50:38 master.dev.admin.private kadmind[15602]: listening on fd 17: rpc ::.749
    Jun  3 13:50:38 master.dev.admin.private kadmind[15602]: set up 8 sockets
    Jun  3 13:50:38 master.dev.admin.private kadmind[15603]: Seeding random number generator

Trace output of kadmind on startup:

    [15602] 1433353838.676525: Retrieving K/M at WMICH.EDU from FILE:/var/kerberos/krb5kdc/.k5.WMICH.EDU (vno 0, enctype 0) with 
result: 0/Success
    [15602] 1433353838.774319: Retrieving kadmin/admin at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success
    [15602] 1433353838.774525: Retrieving kadmin/admin at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success
    [15602] 1433353838.774704: Retrieving kadmin/admin at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success
    [15602] 1433353838.774895: Retrieving kadmin/admin at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success
    [15602] 1433353838.775082: Retrieving kadmin/admin at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success
    [15602] 1433353838.775258: Retrieving kadmin/admin at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success
    [15602] 1433353838.775435: Retrieving kadmin/admin at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success
    [15602] 1433353838.775611: Retrieving kadmin/admin at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success
    [15602] 1433353838.775802: Retrieving kadmin/changepw at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success
    [15602] 1433353838.775992: Retrieving kadmin/changepw at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success
    [15602] 1433353838.776167: Retrieving kadmin/changepw at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success
    [15602] 1433353838.776343: Retrieving kadmin/changepw at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success
    [15602] 1433353838.776525: Retrieving kadmin/changepw at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success
    [15602] 1433353838.776699: Retrieving kadmin/changepw at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success
    [15602] 1433353838.776885: Retrieving kadmin/changepw at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success
    [15602] 1433353838.777063: Retrieving kadmin/changepw at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success

At 13:51:39 kpropd starts on slave.

kpropd syslog at startup.  The number of failures from kpropd varies.
Sometimes it is only a few, sometimes the failures will continue for hours:

    Jun  3 13:52:34 slave.dev.admin.private kpropd[2296]: /opt/krb5/sbin/kpropd: GSS-API (or Kerberos) error while initializing 
/opt/krb5/sbin/kpropd interface, retrying
    Jun  3 13:53:33 slave.dev.admin.private kpropd[2296]: /opt/krb5/sbin/kpropd: GSS-API (or Kerberos) error while initializing 
/opt/krb5/sbin/kpropd interface, retrying
    Jun  3 13:54:36 slave.dev.admin.private kpropd[2296]: /opt/krb5/sbin/kpropd: GSS-API (or Kerberos) error while initializing 
/opt/krb5/sbin/kpropd interface, retrying
    Jun  3 13:55:47 slave.dev.admin.private kpropd[2296]: /opt/krb5/sbin/kpropd: GSS-API (or Kerberos) error while initializing 
/opt/krb5/sbin/kpropd interface, retrying
    Jun  3 13:57:15 slave.dev.admin.private kpropd[2296]: /opt/krb5/sbin/kpropd: GSS-API (or Kerberos) error while initializing 
/opt/krb5/sbin/kpropd interface, retrying

trace output of kpropd; this pattern of log messages repeats for each failure of
kpropd to initialize:

    [2296] 1433353899.702011: Initializing MEMORY:kadm5_0 with default princ kiprop/slave.dev.admin.private at WMICH.EDU
    [2296] 1433353899.702053: Getting initial credentials for kiprop/slave.dev.admin.private at WMICH.EDU
    [2296] 1433353899.702212: Setting initial creds service to kiprop/master.dev.admin.private
    [2296] 1433353899.702278: Looked up etypes in keytab: des-cbc-crc, des3-cbc-sha1, aes256-cts, aes128-cts, rc4-hmac
    [2296] 1433353899.702342: Sending request (235 bytes) to WMICH.EDU
    [2296] 1433353899.702377: Resolving hostname slave.dev.admin.private
    [2296] 1433353899.702862: Sending initial UDP request to dgram 172.30.210.37:88
    [2296] 1433353899.703396: Received answer (880 bytes) from dgram 172.30.210.37:88
    [2296] 1433353899.703420: Response was not from master KDC
    [2296] 1433353899.703443: Processing preauth types: 19
    [2296] 1433353899.703450: Selected etype info: etype aes256-cts, salt "WMICH.EDUkipropslave.dev.admin.private", params ""
    [2296] 1433353899.703457: Produced preauth for next request: (empty)
    [2296] 1433353899.703461: Getting AS key, salt "WMICH.EDUkipropslave.dev.admin.private", params ""
    [2296] 1433353899.703500: Retrieving kiprop/slave.dev.admin.private at WMICH.EDU from FILE:/etc/krb5.keytab (vno 0, enctype 
aes256-cts) with result: 0/Success
    [2296] 1433353899.703515: AS key obtained from gak_fct: aes256-cts/773C
    [2296] 1433353899.703557: Decrypted AS reply; session key is: aes256-cts/9602
    [2296] 1433353899.703574: FAST negotiation: available
    [2296] 1433353899.703588: Initializing MEMORY:kadm5_0 with default princ kiprop/slave.dev.admin.private at WMICH.EDU
    [2296] 1433353899.703598: Storing kiprop/slave.dev.admin.private at WMICH.EDU -> kiprop/master.dev.admin.private at WMICH.EDU in 
MEMORY:kadm5_0
    [2296] 1433353899.703610: Storing config in MEMORY:kadm5_0 for kiprop/master.dev.admin.private at WMICH.EDU: fast_avail: yes
    [2296] 1433353899.703624: Storing kiprop/slave.dev.admin.private at WMICH.EDU -> 
krb5_ccache_conf_data/fast_avail/kiprop\/master.dev.admin.private\@WMICH.EDU at X-CACHECONF: in MEMORY:kadm5_0
    [2296] 1433353899.704390: Getting credentials kiprop/slave.dev.admin.private at WMICH.EDU -> 
kiprop/master.dev.admin.private at WMICH.EDU using ccache MEMORY:kadm5_0
    [2296] 1433353899.704414: Retrieving kiprop/slave.dev.admin.private at WMICH.EDU -> kiprop/master.dev.admin.private at WMICH.EDU from 
MEMORY:kadm5_0 with result: 0/Success
    [2296] 1433353899.704454: Creating authenticator for kiprop/slave.dev.admin.private at WMICH.EDU -> 
kiprop/master.dev.admin.private at WMICH.EDU, seqnum 108307488, subkey aes256-cts/3CC8, session key aes256-cts/9602
    [2296] 1433353924.730934: Getting credentials kiprop/slave.dev.admin.private at WMICH.EDU -> 
kiprop/master.dev.admin.private at WMICH.EDU using ccache MEMORY:kadm5_0
    [2296] 1433353924.730974: Retrieving kiprop/slave.dev.admin.private at WMICH.EDU -> kiprop/master.dev.admin.private at WMICH.EDU from 
MEMORY:kadm5_0 with result: 0/Success
    [2296] 1433353924.731019: Creating authenticator for kiprop/slave.dev.admin.private at WMICH.EDU -> 
kiprop/master.dev.admin.private at WMICH.EDU, seqnum 535019415, subkey aes256-cts/C077, session key aes256-cts/9602

At 15:30:57 (100 minutes after initial startup), the kadmind complains in syslog
about clock skew even though both master and slave are running NTP:

    Jun  3 15:30:57 master.dev.admin.private kadmind[15603]: starting
    Jun  3 15:30:57 master.dev.admin.private kadmind[15603]: Authentication attempt failed: 172.30.210.37, GSS-API error strings are:
    Jun  3 15:30:57 master.dev.admin.private kadmind[15603]:     Unspecified GSS failure.  Minor code may provide more information
    Jun  3 15:30:57 master.dev.admin.private kadmind[15603]:     Clock skew too great
    Jun  3 15:30:57 master.dev.admin.private kadmind[15603]:    GSS-API error strings complete.
    Jun  3 15:30:57 master.dev.admin.private kadmind[15603]: Authentication attempt failed: (unknown), GSS-API error strings are:
    Jun  3 15:30:57 master.dev.admin.private kadmind[15603]:     Unspecified GSS failure.  Minor code may provide more information
    Jun  3 15:30:57 master.dev.admin.private kadmind[15603]:     Cannot find key for kadmin/changepw at WMICH.EDU kvno 2 in keytab 
(request ticket server kiprop/master.dev.admin.private at WMICH.EDU)
    Jun  3 15:30:57 master.dev.admin.private kadmind[15603]:    GSS-API error strings complete.
    Jun  3 15:30:57 master.dev.admin.private kadmind[15603]: closing down fd 35
    Jun  3 15:30:57 master.dev.admin.private kadmind[15603]: Authentication attempt failed: 172.30.210.37, GSS-API error strings are:
    Jun  3 15:30:57 master.dev.admin.private kadmind[15603]:     Unspecified GSS failure.  Minor code may provide more information
    Jun  3 15:30:57 master.dev.admin.private kadmind[15603]:     Clock skew too great
    Jun  3 15:30:57 master.dev.admin.private kadmind[15603]:    GSS-API error strings complete.
    Jun  3 15:30:57 master.dev.admin.private kadmind[15603]: Authentication attempt failed: (unknown), GSS-API error strings are:
    Jun  3 15:30:57 master.dev.admin.private kadmind[15603]:     Unspecified GSS failure.  Minor code may provide more information
    Jun  3 15:30:57 master.dev.admin.private kadmind[15603]:     Cannot find key for kadmin/changepw at WMICH.EDU kvno 2 in keytab 
(request ticket server kiprop/master.dev.admin.private at WMICH.EDU)
    Jun  3 15:30:57 master.dev.admin.private kadmind[15603]:    GSS-API error strings complete.
    Jun  3 15:30:57 master.dev.admin.private kadmind[15603]: closing down fd 35
    Jun  3 15:30:57 master.dev.admin.private kadmind[15603]: Authentication attempt failed: 172.30.210.37, GSS-API error strings are:
    Jun  3 15:30:57 master.dev.admin.private kadmind[15603]:     Unspecified GSS failure.  Minor code may provide more information
    Jun  3 15:30:57 master.dev.admin.private kadmind[15603]:     Clock skew too great
    Jun  3 15:30:57 master.dev.admin.private kadmind[15603]:    GSS-API error strings complete.
    Jun  3 15:30:57 master.dev.admin.private kadmind[15603]: Authentication attempt failed: (unknown), GSS-API error strings are:
    Jun  3 15:30:57 master.dev.admin.private kadmind[15603]:     Unspecified GSS failure.  Minor code may provide more information
    Jun  3 15:30:57 master.dev.admin.private kadmind[15603]:     Cannot find key for kadmin/changepw at WMICH.EDU kvno 2 in keytab 
(request ticket server kiprop/master.dev.admin.private at WMICH.EDU)
    Jun  3 15:30:57 master.dev.admin.private kadmind[15603]:    GSS-API error strings complete.
    Jun  3 15:30:57 master.dev.admin.private kadmind[15603]: closing down fd 35

kpropd continues to fail and the above syslog entries repeating until
15:30:19 (1 hour and 40 minutes later) when syslog for kpropd goes silent.

At 15:30:57 the trace for kadmind is finally updated, showing failure:

    [15603] 1433359857.235560: Retrieving kiprop/master.dev.admin.private at WMICH.EDU from KDB: (vno 2, enctype des-cbc-crc) with 
result: 0/Success
    [15603] 1433359857.235623: Decrypted AP-REQ with specified server principal kiprop/master.dev.admin.private at WMICH.EDU: 
des-cbc-crc/D2AB
    [15603] 1433359857.235635: AP-REQ ticket: kiprop/slave.dev.admin.private at WMICH.EDU -> kiprop/master.dev.admin.private at WMICH.EDU, 
session key aes256-cts/9602
    [15603] 1433359857.253258: Retrieving kadmin/admin at WMICH.EDU from KDB: (vno 2, enctype des-cbc-crc) with result: -1765328154/Key 
version number for principal in key table is incorrect
    [15603] 1433359857.253277: Failed to decrypt AP-REQ ticket: -1765328349/Cannot find key for kadmin/admin at WMICH.EDU kvno 2 in 
keytab (request ticket server kiprop/master.dev.admin.private at WMICH.EDU)
    [15603] 1433359857.253376: Retrieving kadmin/changepw at WMICH.EDU from KDB: (vno 2, enctype des-cbc-crc) with result: 
-1765328154/Key version number for principal in key table is incorrect
    [15603] 1433359857.253387: Failed to decrypt AP-REQ ticket: -1765328349/Cannot find key for kadmin/changepw at WMICH.EDU kvno 2 in 
keytab (request ticket server kiprop/master.dev.admin.private at WMICH.EDU)
    [15603] 1433359857.253945: Retrieving kiprop/master.dev.admin.private at WMICH.EDU from KDB: (vno 2, enctype des-cbc-crc) with 
result: 0/Success
    [15603] 1433359857.253972: Decrypted AP-REQ with specified server principal kiprop/master.dev.admin.private at WMICH.EDU: 
des-cbc-crc/D2AB
    [15603] 1433359857.253980: AP-REQ ticket: kiprop/slave.dev.admin.private at WMICH.EDU -> kiprop/master.dev.admin.private at WMICH.EDU, 
session key aes256-cts/DAEB
    [15603] 1433359857.272560: Retrieving kadmin/admin at WMICH.EDU from KDB: (vno 2, enctype des-cbc-crc) with result: -1765328154/Key 
version number for principal in key table is incorrect
    [15603] 1433359857.272576: Failed to decrypt AP-REQ ticket: -1765328349/Cannot find key for kadmin/admin at WMICH.EDU kvno 2 in 
keytab (request ticket server kiprop/master.dev.admin.private at WMICH.EDU)
    [15603] 1433359857.272689: Retrieving kadmin/changepw at WMICH.EDU from KDB: (vno 2, enctype des-cbc-crc) with result: 
-1765328154/Key version number for principal in key table is incorrect
    [15603] 1433359857.272706: Failed to decrypt AP-REQ ticket: -1765328349/Cannot find key for kadmin/changepw at WMICH.EDU kvno 2 in 
keytab (request ticket server kiprop/master.dev.admin.private at WMICH.EDU)
    [15603] 1433359857.273231: Retrieving kiprop/master.dev.admin.private at WMICH.EDU from KDB: (vno 2, enctype des-cbc-crc) with 
result: 0/Success
    [15603] 1433359857.273259: Decrypted AP-REQ with specified server principal kiprop/master.dev.admin.private at WMICH.EDU: 
des-cbc-crc/D2AB
    [15603] 1433359857.273267: AP-REQ ticket: kiprop/slave.dev.admin.private at WMICH.EDU -> kiprop/master.dev.admin.private at WMICH.EDU, 
session key aes256-cts/B9E9
    [15603] 1433359857.292128: Retrieving kadmin/admin at WMICH.EDU from KDB: (vno 2, enctype des-cbc-crc) with result: -1765328154/Key 
version number for principal in key table is incorrect
    [15603] 1433359857.292143: Failed to decrypt AP-REQ ticket: -1765328349/Cannot find key for kadmin/admin at WMICH.EDU kvno 2 in 
keytab (request ticket server kiprop/master.dev.admin.private at WMICH.EDU)
    [15603] 1433359857.292248: Retrieving kadmin/changepw at WMICH.EDU from KDB: (vno 2, enctype des-cbc-crc) with result: 
-1765328154/Key version number for principal in key table is incorrect
    [15603] 1433359857.292258: Failed to decrypt AP-REQ ticket: -1765328349/Cannot find key for kadmin/changepw at WMICH.EDU kvno 2 in 
keytab (request ticket server kiprop/master.dev.admin.private at WMICH.EDU)

At 15:35:19 the trace for kadmind finally shows success:

    [15603] 1433360119.61281: Retrieving kiprop/master.dev.admin.private at WMICH.EDU from KDB: (vno 2, enctype des-cbc-crc) with 
result: 0/Success
    [15603] 1433360119.61330: Decrypted AP-REQ with specified server principal kiprop/master.dev.admin.private at WMICH.EDU: 
des-cbc-crc/D2AB
    [15603] 1433360119.61340: AP-REQ ticket: kiprop/slave.dev.admin.private at WMICH.EDU -> kiprop/master.dev.admin.private at WMICH.EDU, 
session key aes256-cts/6F94
    [15603] 1433360119.62014: Negotiated enctype based on authenticator: aes256-cts
    [15603] 1433360119.62027: Authenticator contains subkey: aes256-cts/A5D6
    [15603] 1433360119.62080: Creating AP-REP, time 1433360119.59920, subkey aes256-cts/CE1F, seqnum 1021618218

And at 15:35:19 syslog for kadmind shows everything is nominal:

    Jun  3 15:35:19 master.dev.admin.private kadmind[15603]: Request: iprop_get_updates_1, UPDATE_NIL; Incoming SerialNo=2; Outgoing 
SerialNo=N/A, success, client=kiprop/slave.dev.admin.private at WMICH.EDU, service=kiprop/master.dev.admin.private at WMICH.EDU, 
addr=172.30.210.37


More information about the Kerberos mailing list