Erratic behavior of full resync process
Leonard J. Peirce
leonard.peirce+kerberos at wmich.edu
Thu Jun 4 15:24:18 EDT 2015
[[NOTE: I originally posted this reply to comp.protocols.kerberos but it
doesn't appear to have made it to the mailing list. That'll teach me. :-)]]
I continue to have the same problems (unfortunately, I was unable
to participate in the latest teleconference). I get the same results
on CentOS 6.6 whether it runs in a VM or on actual hardware, and
whether I use the standard CentOS RPMs or the latest release (1.13.2)
built from source. I've therefore settled on testing with the latest
release on physical hardware.
While trying to isolate the problem I've encountered a scenario that
is somewhat strange. The master and slave were in sync when I stopped
all daemons on both servers. I thought that upon restarting them
(without loading the database with kdb5_util or making any updates
in kadmin) kpropd and kadmind would connect and quickly figure out
everything was in sync and everything would be nominal. But kpropd
failed to initialize and connect for some 100 minutes.
Given that previously kpropd did connect and the slave had been
initialized why would just a restart of the daemons cause kpropd to
fail to initialize so many times and then, suddenly, succeed?
Below is syslog and trace information (from KRB5_TRACE) for both
kadmind and kpropd as well as a few comments. A summary:
13:50:38 -- kadmind starts on master
13:51:39 -- kpropd starts on slave; for the next 1 hour 40 minutes kpropd
            repeatedly tries and fails to initialize until 15:30:19, when
            its syslog entries stop
15:30:57 -- kadmind complains in syslog about clock skew
15:30:57 -- trace for kadmind is finally updated, showing failure and claiming
            it cannot find key for kadmin/admin at WMICH.EDU kvno 2 in keytab
15:35:19 -- trace for kadmind finally shows success
15:35:19 -- syslog for kadmind shows everything nominal
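The wall-clock times in this summary can be matched against the epoch stamps in the KRB5_TRACE output further down. The following is a minimal sketch of that conversion, not part of the original test setup. It assumes the servers log in EDT (UTC-4), as the date of this post suggests, and the helper name trace_to_wallclock is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: both servers log in EDT (UTC-4).
EDT = timezone(timedelta(hours=-4), "EDT")

def trace_to_wallclock(stamp: str) -> str:
    """Convert a KRB5_TRACE 'seconds.microseconds' stamp to HH:MM:SS in EDT."""
    seconds = int(stamp.split(".")[0])  # whole seconds are enough to match syslog
    return datetime.fromtimestamp(seconds, tz=EDT).strftime("%H:%M:%S")

# Stamps copied from the trace excerpts in this post.
for stamp in ("1433353838.676525", "1433353899.702011",
              "1433359857.235560", "1433360119.61281"):
    print(stamp, "->", trace_to_wallclock(stamp))
```

The four stamps map to 13:50:38, 13:51:39, 15:30:57 and 15:35:19, which are the times used in the summary.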
Stop here if you're not interested in the gory details of syslog and the trace
information. My question is still the same: is anyone else seeing anything
like this? My next step is to try this on CentOS 7.
--
Leonard J. Peirce
Western Michigan University
Office of Information Technology
Kalamazoo, MI 49008
At 13:50:38 kadmind starts on master. kadmind syslog at startup:
Jun 3 13:50:38 master.dev.admin.private kadmind[15602]: setting up network...
Jun 3 13:50:38 master.dev.admin.private kadmind[15602]: listening on fd 10: udp 0.0.0.0.464 (pktinfo)
Jun 3 13:50:38 master.dev.admin.private kadmind[15602]: setsockopt(11,IPV6_V6ONLY,1) worked
Jun 3 13:50:38 master.dev.admin.private kadmind[15602]: listening on fd 11: udp ::.464 (pktinfo)
Jun 3 13:50:38 master.dev.admin.private kadmind[15602]: setsockopt(12,IPV6_V6ONLY,1) worked
Jun 3 13:50:38 master.dev.admin.private kadmind[15602]: listening on fd 13: tcp 0.0.0.0.464
Jun 3 13:50:38 master.dev.admin.private kadmind[15602]: listening on fd 12: tcp ::.464
Jun 3 13:50:38 master.dev.admin.private kadmind[15602]: listening on fd 14: rpc 0.0.0.0.2088
Jun 3 13:50:38 master.dev.admin.private kadmind[15602]: setsockopt(15,IPV6_V6ONLY,1) worked
Jun 3 13:50:38 master.dev.admin.private kadmind[15602]: listening on fd 15: rpc ::.2088
Jun 3 13:50:38 master.dev.admin.private kadmind[15602]: listening on fd 16: rpc 0.0.0.0.749
Jun 3 13:50:38 master.dev.admin.private kadmind[15602]: setsockopt(17,IPV6_V6ONLY,1) worked
Jun 3 13:50:38 master.dev.admin.private kadmind[15602]: listening on fd 17: rpc ::.749
Jun 3 13:50:38 master.dev.admin.private kadmind[15602]: set up 8 sockets
Jun 3 13:50:38 master.dev.admin.private kadmind[15603]: Seeding random number generator
Trace output of kadmind on startup:
[15602] 1433353838.676525: Retrieving K/M at WMICH.EDU from FILE:/var/kerberos/krb5kdc/.k5.WMICH.EDU (vno 0, enctype 0) with
result: 0/Success
[15602] 1433353838.774319: Retrieving kadmin/admin at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success
[15602] 1433353838.774525: Retrieving kadmin/admin at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success
[15602] 1433353838.774704: Retrieving kadmin/admin at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success
[15602] 1433353838.774895: Retrieving kadmin/admin at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success
[15602] 1433353838.775082: Retrieving kadmin/admin at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success
[15602] 1433353838.775258: Retrieving kadmin/admin at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success
[15602] 1433353838.775435: Retrieving kadmin/admin at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success
[15602] 1433353838.775611: Retrieving kadmin/admin at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success
[15602] 1433353838.775802: Retrieving kadmin/changepw at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success
[15602] 1433353838.775992: Retrieving kadmin/changepw at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success
[15602] 1433353838.776167: Retrieving kadmin/changepw at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success
[15602] 1433353838.776343: Retrieving kadmin/changepw at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success
[15602] 1433353838.776525: Retrieving kadmin/changepw at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success
[15602] 1433353838.776699: Retrieving kadmin/changepw at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success
[15602] 1433353838.776885: Retrieving kadmin/changepw at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success
[15602] 1433353838.777063: Retrieving kadmin/changepw at WMICH.EDU from KDB: (vno 0, enctype 0) with result: 0/Success
At 13:51:39 kpropd starts on slave.
kpropd syslog at startup. The number of failures from kpropd varies.
Sometimes it is only a few, sometimes the failures will continue for hours:
Jun 3 13:52:34 slave.dev.admin.private kpropd[2296]: /opt/krb5/sbin/kpropd: GSS-API (or Kerberos) error while initializing
/opt/krb5/sbin/kpropd interface, retrying
Jun 3 13:53:33 slave.dev.admin.private kpropd[2296]: /opt/krb5/sbin/kpropd: GSS-API (or Kerberos) error while initializing
/opt/krb5/sbin/kpropd interface, retrying
Jun 3 13:54:36 slave.dev.admin.private kpropd[2296]: /opt/krb5/sbin/kpropd: GSS-API (or Kerberos) error while initializing
/opt/krb5/sbin/kpropd interface, retrying
Jun 3 13:55:47 slave.dev.admin.private kpropd[2296]: /opt/krb5/sbin/kpropd: GSS-API (or Kerberos) error while initializing
/opt/krb5/sbin/kpropd interface, retrying
Jun 3 13:57:15 slave.dev.admin.private kpropd[2296]: /opt/krb5/sbin/kpropd: GSS-API (or Kerberos) error while initializing
/opt/krb5/sbin/kpropd interface, retrying
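To see how the retries are spaced, the gaps between the five entries above can be computed directly. This is only a sketch over the timestamps shown here; the function name retry_intervals is hypothetical, and five samples are too few to draw firm conclusions about kpropd's retry policy.

```python
from datetime import datetime

# Time-of-day fields copied from the five kpropd syslog entries above.
retry_times = ["13:52:34", "13:53:33", "13:54:36", "13:55:47", "13:57:15"]

def retry_intervals(times):
    """Return the gaps, in whole seconds, between consecutive HH:MM:SS stamps."""
    parsed = [datetime.strptime(t, "%H:%M:%S") for t in times]
    return [int((b - a).total_seconds()) for a, b in zip(parsed, parsed[1:])]

print(retry_intervals(retry_times))  # [59, 63, 71, 88]
```

In this sample the interval between retries grows with each attempt.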
Trace output of kpropd; this pattern of log messages repeats for each failure of
kpropd to initialize:
[2296] 1433353899.702011: Initializing MEMORY:kadm5_0 with default princ kiprop/slave.dev.admin.private at WMICH.EDU
[2296] 1433353899.702053: Getting initial credentials for kiprop/slave.dev.admin.private at WMICH.EDU
[2296] 1433353899.702212: Setting initial creds service to kiprop/master.dev.admin.private
[2296] 1433353899.702278: Looked up etypes in keytab: des-cbc-crc, des3-cbc-sha1, aes256-cts, aes128-cts, rc4-hmac
[2296] 1433353899.702342: Sending request (235 bytes) to WMICH.EDU
[2296] 1433353899.702377: Resolving hostname slave.dev.admin.private
[2296] 1433353899.702862: Sending initial UDP request to dgram 172.30.210.37:88
[2296] 1433353899.703396: Received answer (880 bytes) from dgram 172.30.210.37:88
[2296] 1433353899.703420: Response was not from master KDC
[2296] 1433353899.703443: Processing preauth types: 19
[2296] 1433353899.703450: Selected etype info: etype aes256-cts, salt "WMICH.EDUkipropslave.dev.admin.private", params ""
[2296] 1433353899.703457: Produced preauth for next request: (empty)
[2296] 1433353899.703461: Getting AS key, salt "WMICH.EDUkipropslave.dev.admin.private", params ""
[2296] 1433353899.703500: Retrieving kiprop/slave.dev.admin.private at WMICH.EDU from FILE:/etc/krb5.keytab (vno 0, enctype
aes256-cts) with result: 0/Success
[2296] 1433353899.703515: AS key obtained from gak_fct: aes256-cts/773C
[2296] 1433353899.703557: Decrypted AS reply; session key is: aes256-cts/9602
[2296] 1433353899.703574: FAST negotiation: available
[2296] 1433353899.703588: Initializing MEMORY:kadm5_0 with default princ kiprop/slave.dev.admin.private at WMICH.EDU
[2296] 1433353899.703598: Storing kiprop/slave.dev.admin.private at WMICH.EDU -> kiprop/master.dev.admin.private at WMICH.EDU in
MEMORY:kadm5_0
[2296] 1433353899.703610: Storing config in MEMORY:kadm5_0 for kiprop/master.dev.admin.private at WMICH.EDU: fast_avail: yes
[2296] 1433353899.703624: Storing kiprop/slave.dev.admin.private at WMICH.EDU ->
krb5_ccache_conf_data/fast_avail/kiprop\/master.dev.admin.private\@WMICH.EDU at X-CACHECONF: in MEMORY:kadm5_0
[2296] 1433353899.704390: Getting credentials kiprop/slave.dev.admin.private at WMICH.EDU ->
kiprop/master.dev.admin.private at WMICH.EDU using ccache MEMORY:kadm5_0
[2296] 1433353899.704414: Retrieving kiprop/slave.dev.admin.private at WMICH.EDU -> kiprop/master.dev.admin.private at WMICH.EDU from
MEMORY:kadm5_0 with result: 0/Success
[2296] 1433353899.704454: Creating authenticator for kiprop/slave.dev.admin.private at WMICH.EDU ->
kiprop/master.dev.admin.private at WMICH.EDU, seqnum 108307488, subkey aes256-cts/3CC8, session key aes256-cts/9602
[2296] 1433353924.730934: Getting credentials kiprop/slave.dev.admin.private at WMICH.EDU ->
kiprop/master.dev.admin.private at WMICH.EDU using ccache MEMORY:kadm5_0
[2296] 1433353924.730974: Retrieving kiprop/slave.dev.admin.private at WMICH.EDU -> kiprop/master.dev.admin.private at WMICH.EDU from
MEMORY:kadm5_0 with result: 0/Success
[2296] 1433353924.731019: Creating authenticator for kiprop/slave.dev.admin.private at WMICH.EDU ->
kiprop/master.dev.admin.private at WMICH.EDU, seqnum 535019415, subkey aes256-cts/C077, session key aes256-cts/9602
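Note the pause in the trace above between the first "Creating authenticator" entry and the next "Getting credentials" entry. A small scan like the following can flag such pauses in a longer trace file. It is a sketch only: find_gaps and the one-second threshold are illustrative choices, and reading the fractional part as an unpadded microsecond count is an assumption based on stamps such as 1433360119.61281 later in this post.

```python
import re

# Matches the "[pid] seconds.microseconds:" prefix of a KRB5_TRACE line.
STAMP = re.compile(r"^\[(\d+)\] (\d+)\.(\d+): ")

def find_gaps(lines, threshold=1.0):
    """Yield (gap_seconds, line) for trace entries that follow a long pause."""
    previous_us = None
    for line in lines:
        match = STAMP.match(line)
        if not match:
            continue  # wrapped continuation of the previous entry
        # Assumption: the fraction is an integer count of microseconds
        # that may not be zero-padded, so it is not parsed as a decimal.
        now_us = int(match.group(2)) * 1_000_000 + int(match.group(3))
        if previous_us is not None:
            gap = (now_us - previous_us) / 1_000_000
            if gap >= threshold:
                yield round(gap, 3), line
        previous_us = now_us

# Three lines copied from the kpropd trace above.
sample = [
    "[2296] 1433353899.704454: Creating authenticator for kiprop/slave.dev.admin.private at WMICH.EDU ->",
    "kiprop/master.dev.admin.private at WMICH.EDU, seqnum 108307488, subkey aes256-cts/3CC8, session key aes256-cts/9602",
    "[2296] 1433353924.730934: Getting credentials kiprop/slave.dev.admin.private at WMICH.EDU ->",
]
for gap, line in find_gaps(sample):
    print(gap, line[:60])
```

On these lines it reports a single pause of about 25 seconds before the second "Getting credentials" entry.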
At 15:30:57 (about 100 minutes after initial startup), kadmind complains in
syslog about clock skew even though both master and slave are running NTP:
Jun 3 15:30:57 master.dev.admin.private kadmind[15603]: starting
Jun 3 15:30:57 master.dev.admin.private kadmind[15603]: Authentication attempt failed: 172.30.210.37, GSS-API error strings are:
Jun 3 15:30:57 master.dev.admin.private kadmind[15603]: Unspecified GSS failure. Minor code may provide more information
Jun 3 15:30:57 master.dev.admin.private kadmind[15603]: Clock skew too great
Jun 3 15:30:57 master.dev.admin.private kadmind[15603]: GSS-API error strings complete.
Jun 3 15:30:57 master.dev.admin.private kadmind[15603]: Authentication attempt failed: (unknown), GSS-API error strings are:
Jun 3 15:30:57 master.dev.admin.private kadmind[15603]: Unspecified GSS failure. Minor code may provide more information
Jun 3 15:30:57 master.dev.admin.private kadmind[15603]: Cannot find key for kadmin/changepw at WMICH.EDU kvno 2 in keytab
(request ticket server kiprop/master.dev.admin.private at WMICH.EDU)
Jun 3 15:30:57 master.dev.admin.private kadmind[15603]: GSS-API error strings complete.
Jun 3 15:30:57 master.dev.admin.private kadmind[15603]: closing down fd 35
Jun 3 15:30:57 master.dev.admin.private kadmind[15603]: Authentication attempt failed: 172.30.210.37, GSS-API error strings are:
Jun 3 15:30:57 master.dev.admin.private kadmind[15603]: Unspecified GSS failure. Minor code may provide more information
Jun 3 15:30:57 master.dev.admin.private kadmind[15603]: Clock skew too great
Jun 3 15:30:57 master.dev.admin.private kadmind[15603]: GSS-API error strings complete.
Jun 3 15:30:57 master.dev.admin.private kadmind[15603]: Authentication attempt failed: (unknown), GSS-API error strings are:
Jun 3 15:30:57 master.dev.admin.private kadmind[15603]: Unspecified GSS failure. Minor code may provide more information
Jun 3 15:30:57 master.dev.admin.private kadmind[15603]: Cannot find key for kadmin/changepw at WMICH.EDU kvno 2 in keytab
(request ticket server kiprop/master.dev.admin.private at WMICH.EDU)
Jun 3 15:30:57 master.dev.admin.private kadmind[15603]: GSS-API error strings complete.
Jun 3 15:30:57 master.dev.admin.private kadmind[15603]: closing down fd 35
Jun 3 15:30:57 master.dev.admin.private kadmind[15603]: Authentication attempt failed: 172.30.210.37, GSS-API error strings are:
Jun 3 15:30:57 master.dev.admin.private kadmind[15603]: Unspecified GSS failure. Minor code may provide more information
Jun 3 15:30:57 master.dev.admin.private kadmind[15603]: Clock skew too great
Jun 3 15:30:57 master.dev.admin.private kadmind[15603]: GSS-API error strings complete.
Jun 3 15:30:57 master.dev.admin.private kadmind[15603]: Authentication attempt failed: (unknown), GSS-API error strings are:
Jun 3 15:30:57 master.dev.admin.private kadmind[15603]: Unspecified GSS failure. Minor code may provide more information
Jun 3 15:30:57 master.dev.admin.private kadmind[15603]: Cannot find key for kadmin/changepw at WMICH.EDU kvno 2 in keytab
(request ticket server kiprop/master.dev.admin.private at WMICH.EDU)
Jun 3 15:30:57 master.dev.admin.private kadmind[15603]: GSS-API error strings complete.
Jun 3 15:30:57 master.dev.admin.private kadmind[15603]: closing down fd 35
kpropd continues to fail, and its syslog entries shown earlier keep repeating
until 15:30:19 (1 hour and 40 minutes later), when syslog for kpropd goes silent.
At 15:30:57 the trace for kadmind is finally updated, showing failure:
[15603] 1433359857.235560: Retrieving kiprop/master.dev.admin.private at WMICH.EDU from KDB: (vno 2, enctype des-cbc-crc) with
result: 0/Success
[15603] 1433359857.235623: Decrypted AP-REQ with specified server principal kiprop/master.dev.admin.private at WMICH.EDU:
des-cbc-crc/D2AB
[15603] 1433359857.235635: AP-REQ ticket: kiprop/slave.dev.admin.private at WMICH.EDU -> kiprop/master.dev.admin.private at WMICH.EDU,
session key aes256-cts/9602
[15603] 1433359857.253258: Retrieving kadmin/admin at WMICH.EDU from KDB: (vno 2, enctype des-cbc-crc) with result: -1765328154/Key
version number for principal in key table is incorrect
[15603] 1433359857.253277: Failed to decrypt AP-REQ ticket: -1765328349/Cannot find key for kadmin/admin at WMICH.EDU kvno 2 in
keytab (request ticket server kiprop/master.dev.admin.private at WMICH.EDU)
[15603] 1433359857.253376: Retrieving kadmin/changepw at WMICH.EDU from KDB: (vno 2, enctype des-cbc-crc) with result:
-1765328154/Key version number for principal in key table is incorrect
[15603] 1433359857.253387: Failed to decrypt AP-REQ ticket: -1765328349/Cannot find key for kadmin/changepw at WMICH.EDU kvno 2 in
keytab (request ticket server kiprop/master.dev.admin.private at WMICH.EDU)
[15603] 1433359857.253945: Retrieving kiprop/master.dev.admin.private at WMICH.EDU from KDB: (vno 2, enctype des-cbc-crc) with
result: 0/Success
[15603] 1433359857.253972: Decrypted AP-REQ with specified server principal kiprop/master.dev.admin.private at WMICH.EDU:
des-cbc-crc/D2AB
[15603] 1433359857.253980: AP-REQ ticket: kiprop/slave.dev.admin.private at WMICH.EDU -> kiprop/master.dev.admin.private at WMICH.EDU,
session key aes256-cts/DAEB
[15603] 1433359857.272560: Retrieving kadmin/admin at WMICH.EDU from KDB: (vno 2, enctype des-cbc-crc) with result: -1765328154/Key
version number for principal in key table is incorrect
[15603] 1433359857.272576: Failed to decrypt AP-REQ ticket: -1765328349/Cannot find key for kadmin/admin at WMICH.EDU kvno 2 in
keytab (request ticket server kiprop/master.dev.admin.private at WMICH.EDU)
[15603] 1433359857.272689: Retrieving kadmin/changepw at WMICH.EDU from KDB: (vno 2, enctype des-cbc-crc) with result:
-1765328154/Key version number for principal in key table is incorrect
[15603] 1433359857.272706: Failed to decrypt AP-REQ ticket: -1765328349/Cannot find key for kadmin/changepw at WMICH.EDU kvno 2 in
keytab (request ticket server kiprop/master.dev.admin.private at WMICH.EDU)
[15603] 1433359857.273231: Retrieving kiprop/master.dev.admin.private at WMICH.EDU from KDB: (vno 2, enctype des-cbc-crc) with
result: 0/Success
[15603] 1433359857.273259: Decrypted AP-REQ with specified server principal kiprop/master.dev.admin.private at WMICH.EDU:
des-cbc-crc/D2AB
[15603] 1433359857.273267: AP-REQ ticket: kiprop/slave.dev.admin.private at WMICH.EDU -> kiprop/master.dev.admin.private at WMICH.EDU,
session key aes256-cts/B9E9
[15603] 1433359857.292128: Retrieving kadmin/admin at WMICH.EDU from KDB: (vno 2, enctype des-cbc-crc) with result: -1765328154/Key
version number for principal in key table is incorrect
[15603] 1433359857.292143: Failed to decrypt AP-REQ ticket: -1765328349/Cannot find key for kadmin/admin at WMICH.EDU kvno 2 in
keytab (request ticket server kiprop/master.dev.admin.private at WMICH.EDU)
[15603] 1433359857.292248: Retrieving kadmin/changepw at WMICH.EDU from KDB: (vno 2, enctype des-cbc-crc) with result:
-1765328154/Key version number for principal in key table is incorrect
[15603] 1433359857.292258: Failed to decrypt AP-REQ ticket: -1765328349/Cannot find key for kadmin/changepw at WMICH.EDU kvno 2 in
keytab (request ticket server kiprop/master.dev.admin.private at WMICH.EDU)
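One possible reading of the "Clock skew too great" messages, offered as a hypothesis rather than a diagnosis: the first AP-REQ that kadmind decrypts at 15:30:57 carries session key aes256-cts/9602, the same session key kpropd obtained and used for its authenticators at 13:51:39. If kadmind was only then processing a request built at that time, the authenticator's age alone would exceed the allowed skew, regardless of NTP. The arithmetic below uses stamps copied from the traces and assumes the clockskew setting in krb5.conf is left at its default of 300 seconds.

```python
# Stamps copied from the traces in this post (whole seconds only).
authenticator_created = 1433353899  # kpropd: "Creating authenticator ... session key aes256-cts/9602"
ap_req_processed = 1433359857       # kadmind: "AP-REQ ticket ... session key aes256-cts/9602"

# Assumption: clockskew has not been overridden from the 300-second default.
CLOCKSKEW_LIMIT = 300

age = ap_req_processed - authenticator_created
print(f"authenticator age: {age} s ({age // 60} min {age % 60} s)")
print("exceeds clockskew limit:", age > CLOCKSKEW_LIMIT)
```

This gives an age of 5958 seconds (99 minutes 18 seconds), which matches the roughly 100-minute delay described above. It does not explain why kadmind took that long to handle the request.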
At 15:35:19 the trace for kadmind finally shows success:
[15603] 1433360119.61281: Retrieving kiprop/master.dev.admin.private at WMICH.EDU from KDB: (vno 2, enctype des-cbc-crc) with
result: 0/Success
[15603] 1433360119.61330: Decrypted AP-REQ with specified server principal kiprop/master.dev.admin.private at WMICH.EDU:
des-cbc-crc/D2AB
[15603] 1433360119.61340: AP-REQ ticket: kiprop/slave.dev.admin.private at WMICH.EDU -> kiprop/master.dev.admin.private at WMICH.EDU,
session key aes256-cts/6F94
[15603] 1433360119.62014: Negotiated enctype based on authenticator: aes256-cts
[15603] 1433360119.62027: Authenticator contains subkey: aes256-cts/A5D6
[15603] 1433360119.62080: Creating AP-REP, time 1433360119.59920, subkey aes256-cts/CE1F, seqnum 1021618218
And at 15:35:19 syslog for kadmind shows everything is nominal:
Jun 3 15:35:19 master.dev.admin.private kadmind[15603]: Request: iprop_get_updates_1, UPDATE_NIL; Incoming SerialNo=2; Outgoing
SerialNo=N/A, success, client=kiprop/slave.dev.admin.private at WMICH.EDU, service=kiprop/master.dev.admin.private at WMICH.EDU,
addr=172.30.210.37