[krbdev.mit.edu #7695] krb5-1.11.3/1.10.6 - full resync may fail and still result in ulog being updated

Wed Aug 28 18:48:18 EDT 2013

My concern was within restore_dump()... it calls krb5_db_put_principal()
which in turn calls ulog_lock for each record (this was a major performance
hit especially with a database with about a half-million principals). Under
Kerberos 1.10, I found it only took 2-3 minutes to load the database (and
another 2-3 to send it over the wire), but I was seeing inordinately long
load times which I could trace back to the ulog_lock/unlock for every
record. I will have to review the master branch carefully, but if that code
also exists in the master branch, see how long it takes to do a dump/restore
for a database with about 500,000 principals (and compare it with the time
it took under Kerberos 1.10.x).

And I agree with you that I might have introduced the performance issue with
my first patch (when I moved the iproprole setting, which is why I set it to
IPROP_NULL as it was in the original state before I patched it).

I am confused how you say the third patch is broken, as by this point, the
ulog is mapped and I am setting the ulog at the end of the restore. During
the restore, you do not want to be touching the ulog at all (e.g.
IPROP_NULL) because the database being restored is "on the side". Once it is
promoted, one might argue the first thing should be to set it to
IPROP_SLAVE, then set the information, but that is not what was in the code
originally either.

The goal should be not to allow any ulog processing or locking during the
restore (which is to a temp db anyway), and then set the ulog state after
the restore.

I will play around with the master branch (what I supplied was based on the
1.11 branch and empirical testing, but the two branches have deviated enough
at this point).

-----Original Message-----
From: Greg Hudson via RT [mailto:rt-comment at krbdev.mit.edu] 
Sent: Wednesday, August 28, 2013 6:07 PM
To: basch at alum.mit.edu
Subject: [krbdev.mit.edu #7695] krb5-1.11.3/1.10.6 - full resync may fail
and still result in ulog being updated 

Please send follow-ups by replying (without quoting) to a message you 
receive from rt at krbdev.mit.edu, not by forwarding them to 
krbdev at mit.edu.  We've been scooping these follow-ups out of the krbdev 
moderation queue and appending them to the ticket.

I believe the performance problem you noted was actually introduced by 
your first patch.  If you look at the code you moved in that patch, 
you'll see that one of the things it does is set log_ctx->iproprole = 
IPROP_NULL before restore_dump().  The other part of the third patch 
should be unnecessary, because ulog_lock() does nothing if iproprole is 
IPROP_NULL.

Your third patch also seems broken, in that setting log_ctx->iproprole = 
IPROP_NULL before restore_dump() will prevent the ulog from being 
updated afterwards.  On master 
(https://github.com/krb5/krb5/commit/825fa2be6f119677a09acccb109ab976cfc
601f8) I worked around this by setting iproprole to IPROP_SLAVE instead 
of IPROP_NULL, but that only works because in master, kdb5.c only calls 
ulog_lock() if iproprole is IPROP_MASTER.