Strange behavior with kadmind and incremental propagation in 1.8.3
John Devitofranceschi
foonon at gmail.com
Mon Feb 11 19:14:43 EST 2013
I was hoping to get an opinion on this. Maybe it's something that's been seen before.
I see this in the kadmind log:
Feb 7 17:00:10 server.realm.com kadmind[17670]: [ID 702911 local0.notice] Request: kadm5_create_principal, barney at REALM.COM, Cannot lock database, client=fred at REALM.COM, service=kadmin/server.realm.com at REALM.COM, addr=1.2.3.4
The create_principal failed. Our provisioning client even checks for the principal afterwards, and it is not there:
Feb 7 17:00:39 server.realm.com kadmind[17670]: [ID 702911 local0.notice] Request: kadm5_get_principal, barney at REALM.COM, Principal does not exist, client=fred at REALM.COM, service=kadmin/server.realm.com at REALM.COM, addr=1.2.3.4
BUT the propagation log says this:
Update Entry
    Update serial # : 150028
    Update operation : Add
    Update principal : barney at REALM.COM
    Update size : 524
    Update committed : True
    Update time stamp : Thu Feb 7 17:00:00 2013
    Attributes changed : 12
        Attribute flags
        Maximum ticket life
        Maximum renewable life
        Principal expiration
        Password expiration
        Principal
        Key data
        Password last changed
        Modifying principal
        Modification time
        TL data
        Length
Note the timestamps: the update entry is stamped 10 seconds earlier than the kadmind log error! There was no earlier create_principal request at that time.
Later on, we attempt to create the principal again with success reported this time...
Feb 7 17:00:56 server.realm.com kadmind[17670]: [ID 702911 local0.notice] Request: kadm5_create_principal, barney at REALM.COM, success, client=fred at REALM.COM, service=kadmin/server.realm.com at REALM.COM, addr=1.2.3.4
And we get another update entry (same timestamp this time):
Update Entry
    Update serial # : 150030
    Update operation : Add
    Update principal : barney at REALM.COM
    Update size : 524
    Update committed : True
    Update time stamp : Thu Feb 7 17:00:56 2013
    Attributes changed : 12
        Attribute flags
        Maximum ticket life
        Maximum renewable life
        Principal expiration
        Password expiration
        Principal
        Key data
        Password last changed
        Modifying principal
        Modification time
        TL data
        Length
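So the update log now holds two committed Add entries for the same principal. For anyone who wants to check their own update log for the same pattern, here is a rough sketch of how one might scan for it. It assumes the log has been dumped to text laid out like the entries above (a "name : value" field per line under each "Update Entry"); the function names are made up for illustration and are not part of any Kerberos tool.

```python
import re
from collections import defaultdict

def parse_updates(text):
    """Split a text dump into one dict of 'name : value' fields per update entry."""
    entries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "Update Entry":
            current = {}
            entries.append(current)
            continue
        if current is None:
            continue
        # Split on the first colon that is followed by whitespace, so the
        # colons inside a time such as 17:00:56 are left alone.
        parts = re.split(r"\s*:\s+", line, maxsplit=1)
        if len(parts) == 2:
            current[parts[0]] = parts[1]
    return entries

def duplicate_adds(entries):
    """Return {principal: [serials]} for principals with more than one Add entry."""
    adds = defaultdict(list)
    for entry in entries:
        if entry.get("Update operation") != "Add":
            continue
        if "Update principal" in entry and "Update serial #" in entry:
            adds[entry["Update principal"]].append(int(entry["Update serial #"]))
    return {princ: serials for princ, serials in adds.items() if len(serials) > 1}

if __name__ == "__main__":
    import sys
    # Usage (hypothetical): save the update log dump to a file and pipe it in.
    for princ, serials in duplicate_adds(parse_updates(sys.stdin.read())).items():
        print(princ, serials)
```

Run against the two entries shown here, this should report barney at REALM.COM with serials 150028 and 150030. It only looks at Add operations, so a legitimate delete followed by a re-add of the same principal would also be flagged and would need to be ruled out by hand.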
The fun really starts when the incremental propagation kicks in.
We have 7 secondary KDCs. In the case that I am investigating, 4 of them got Update 150028 right away, on its own, with no problems.
The other 3 had Update 150028 bundled with one or more additional updates and the kpropd on these three servers started core dumping over and over again until we forced a full resync.
Please let me know if this is a known bug. If it is not, maybe the code changes from 1.8 to 1.11 have made this behavior stop. It would be nice to be able to reliably reproduce this so we can then test post-1.8.3 versions of the code. So far, this has hit us twice in the last couple of years.
jd