issue with temp policy lock file and kdb5_util error
Will Fiveash
will.fiveash at oracle.com
Mon Nov 1 14:04:26 EDT 2010
On Mon, Nov 01, 2010 at 01:00:34PM -0400, Greg Hudson wrote:
> On Fri, 2010-09-24 at 15:09 -0400, Will Fiveash wrote:
> > Recently someone was having issues with kprop working and the problem
> > appears to be related to kpropd aborting when processing a resync with the
> > master KDC and leaving temp principal files behind like:
> >
> > loki# ls princ*
> > principal.ulog principal~.kadm5 principal~.ok
> > principal~ principal~.kadm5.lock
>
> I have reproduced this issue by adding "if (getenv("CRASH") != NULL)
> abort();" to dump.c after the database is opened. The failure is that
> subsequent kdb5_util load invocations fail in krb5_db2_create() because
> check_openable() returns true. The error message seen is "kdb5_util:
> File exists".
>
> At the moment, an admin can recover from this situation by running "rm
> principal~*". There is no higher-level command to perform that.
>
> There are a few approaches we could take to improving kdb5_util load's
> behavior after an aborted load. The best would be to make "kdb5_util
> load" remove the temp DB if one exists as a remnant of an aborted load.
I agree the best solution is one that doesn't require an admin to take
additional action to clean up an aborted load.
> However, we'd want to preserve the property that if two "kdb5_util load"
> processes overlap, the second one fails and doesn't interfere with the
> first one.
>
> Right now the sequence of operations is:
>
> 1. Create the temp DB (fail if it exists).
> 2. Lock the temp DB.
> 3. Load the dump file into the temp DB.
> 4. Unlock the temp DB.
> 5. Promote the temp DB.
>
> Steps 2 and 4 of this sequence are not especially useful. There is a
> race condition where two kdb5_util load operations can both succeed in
> creating the temp DB, but the locking does not generally resolve that
> race condition into correct behavior.
>
> To make this sequence work properly, we need an atomic DAL operation to
> create a locked temp DB, overwriting the existing one if an unlocked one
> exists. (This can be an alteration of the contract for the existing API
> for creating a temp DB.) Likewise, krb5_db_promote() needs to require
> that the temp DB is already locked, and release the lock after the temp
> DB is promoted.
>
> I estimate this at 2-5 days of work. For now I will open a ticket (or
> annotate an existing one if I find it), and I'll see about allocating
> the necessary time.
Sounds good to me, thanks for looking into this.
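
For what it's worth, here is a rough sketch of what "atomically create a
locked temp DB, overwriting an unlocked one" could look like at the file
level. It is only an illustration under assumptions: it uses plain
open()/flock()/rename() on a single file, and the names
create_locked_temp() and promote_locked_temp() are made up for this
message. They are not existing DAL or db2 functions, and the real change
would have to go through the DAL and cover all of the principal~* files.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/file.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not an existing KDB/DAL call): open the temp DB
 * file, creating it if needed, and take an exclusive non-blocking lock.
 * Returns a locked fd, or -1 with errno set.  errno is EWOULDBLOCK when
 * another load currently holds the lock.
 */
int
create_locked_temp(const char *path)
{
    struct stat fd_st, path_st;
    int fd, saved_errno;

    for (;;) {
        fd = open(path, O_RDWR | O_CREAT, 0600);
        if (fd < 0)
            return -1;
        if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
            /* Someone else is loading; fail without touching their file. */
            saved_errno = errno;
            close(fd);
            errno = saved_errno;
            return -1;
        }
        /*
         * Check that the file we locked is still the one named by path.
         * If a concurrent load promoted (renamed) it between our open()
         * and flock(), drop this fd and start over with a fresh file.
         */
        if (fstat(fd, &fd_st) == 0 && stat(path, &path_st) == 0 &&
            fd_st.st_dev == path_st.st_dev && fd_st.st_ino == path_st.st_ino)
            break;
        close(fd);
    }

    /* We hold the lock, so existing contents are a remnant of an
     * aborted load and can be discarded. */
    if (ftruncate(fd, 0) != 0) {
        saved_errno = errno;
        close(fd);
        errno = saved_errno;
        return -1;
    }
    return fd;
}

/*
 * Hypothetical helper: promote the temp DB while still holding the lock,
 * then release the lock by closing the descriptor.
 */
int
promote_locked_temp(int fd, const char *temp_path, const char *real_path)
{
    int ret = rename(temp_path, real_path);

    close(fd);                  /* closing the fd drops the flock */
    return ret;
}
```

With something along these lines, a second overlapping load fails at the
flock() call and leaves the first one alone, while a load that follows a
crashed one simply takes the lock and reuses the leftover file, so no
manual "rm principal~*" is needed.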
--
Will Fiveash
Oracle
http://opensolaris.org/os/project/kerberos/
Sent using mutt, a sweet, text based e-mail app <http://www.mutt.org/>