Overall shape of Python-based test framework

Tue Mar 2 15:34:03 EST 2010

On Tue, 2010-03-02 at 03:26 -0500, Ken Raeburn wrote:
> To augment or replace dejagnu?

Replacing dejagnu is not an immediate priority, although it would reduce
headaches somewhat (specifically related to "expect" being buggy out of
the box on most machines).  The main goal is to be able to require
automated tests in new work, without requiring that developers use tcl.

> Some of the C tests are spread out that way.  Others, like the main
> dejagnu tests, may exercise a lot of different things in concert.
> Where would "test cross-realm authentication" live in the tree?

I'm not trying to kill tests/ entirely; it's still the right place for
tests which don't target a specific piece of code.

> So, Python code loading libkrb5 via an FFI is out?  I suppose, if you
> want to insulate the framework from library bugs, that makes some
> sense...

I've never had a good experience debugging through language bindings.
You can't just fire up gdb on a Python script and set a breakpoint in a
C function that it calls through FFI.  Tests are meant to sometimes
fail, so I'd prefer to avoid technology which will make it hard to debug
those failures.

> With *lots* of high-level helper functions, I hope.  Like, "set up
> realm FOO with an LDAP database", and "exchange cross-realm keys
> between realms FOO and BAR".

The first cut will have roughly the same set of helpers as the dejagnu
test suite, plus the ability to arbitrarily adjust the krb5.conf and
kdc.conf configurations.
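
As an illustration of what "arbitrarily adjust the krb5.conf and kdc.conf
configurations" could look like, here is a minimal sketch that assumes the
configuration is held as nested dictionaries and that a test supplies
overrides to merge in.  The names DEFAULT_KRB5_CONF, merge_conf and
render_conf, and the realm and port values, are hypothetical and do not come
from the framework described in this thread.

```python
import copy

# Hypothetical defaults; the framework's real template is not shown here.
DEFAULT_KRB5_CONF = {
    "libdefaults": {"default_realm": "KRBTEST.COM"},
    "realms": {"KRBTEST.COM": {"kdc": "localhost:61000"}},
}


def merge_conf(base, overrides):
    """Return a deep copy of base with overrides applied recursively."""
    result = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge_conf(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


def _render_relations(relations, depth):
    """Render one level of 'name = value' relations as a list of lines."""
    pad = "    " * depth
    lines = []
    for key, value in relations.items():
        if isinstance(value, dict):
            lines.append("%s%s = {" % (pad, key))
            lines.extend(_render_relations(value, depth + 1))
            lines.append("%s}" % pad)
        else:
            lines.append("%s%s = %s" % (pad, key, value))
    return lines


def render_conf(conf):
    """Render a {section: relations} dictionary in profile-file syntax."""
    lines = []
    for section, relations in conf.items():
        lines.append("[%s]" % section)
        lines.extend(_render_relations(relations, 1))
    return "\n".join(lines) + "\n"
```

A test could then call merge_conf(DEFAULT_KRB5_CONF, {"libdefaults":
{"allow_weak_crypto": "true"}}) and write the rendered text to a per-test
krb5.conf, leaving the defaults untouched for the next test.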

The framework will be flexible enough to support cross-realm test
scenarios, but adding the logic to construct them will be a follow-on.

Setting up a realm with an LDAP database is a bit tough to automate,
although not impossible.  You have to find the system's OpenLDAP
installation, construct a slapd configuration containing the Kerberos
schema, possibly copy the slapd binary to evade AppArmor restrictions,
and so on.  It's certainly not in scope for the first cut.

> I wonder about the performance, if you're constantly setting up and
> tearing down Kerberos databases and services. We're doing some of that
> now with dejagnu, so I guess it isn't completely intolerable.

It takes about a quarter of a second to construct a test realm with what
I currently have coded up.  That includes creating user, admin, and host
principals, extracting a host keytab, firing up krb5kdc and kadmind, and
obtaining user tickets--eight commands in total.
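
For readers unfamiliar with the steps involved, the sketch below lists one
plausible sequence of eight commands matching the description above.  It is
a guess, not the framework's actual code: the database-creation step is
inferred to make the count come out to eight, and the function name, realm
name, password, principal names and keytab path are all placeholders.  The
function only builds argument lists; it does not run anything.

```python
def realm_setup_commands(realm="KRBTEST.COM", password="pw",
                         keytab="keytab", hostprinc="host/localhost"):
    """Build (but do not run) a hypothetical realm setup command sequence."""

    def kadmin_local(query):
        # kadmin.local operates directly on the database; -q runs one query.
        return ["kadmin.local", "-r", realm, "-q", query]

    return [
        # 1. Create the database with a stash file (assumed first step).
        ["kdb5_util", "-r", realm, "-P", password, "create", "-s"],
        # 2-4. Create user, admin, and host principals.
        kadmin_local("addprinc -pw %s user" % password),
        kadmin_local("addprinc -pw %s user/admin" % password),
        kadmin_local("addprinc -randkey %s" % hostprinc),
        # 5. Extract the host keytab.
        kadmin_local("ktadd -k %s %s" % (keytab, hostprinc)),
        # 6-7. Start the daemons in the foreground.
        ["krb5kdc", "-n"],
        ["kadmind", "-nofork"],
        # 8. Obtain user tickets (the password would be fed on stdin).
        ["kinit", "user"],
    ]
```

A real harness would also have to point each command at the test realm's
configuration through the environment and wait for the daemons to start
listening before running kinit.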

For multi-pass tests that will add up, but I don't think it will become
intolerable.  What kills the current test suite performance is mainly
sleep()s in multi-pass tests, such as sample.exp.

> >  4. Developer adds a check-unix rule to execute the Python test
> >     script.

> Please don't confine the testing to UNIX unless the tests don't make
> sense for Windows.

This framework is for tests which need a running KDC, which we can't
currently run on Windows.  If you don't need a KDC, you can just compile
and run a C test program with no environment.

That said, I think we'll actually wind up with a PYTESTS environment
variable containing a list of test programs to run.  The work I've been
doing so far is mostly platform-agnostic (using os.path.join(), for
instance) but I expect there will be some work getting it to function
under Windows if we get to the point where that makes sense.

> I'm not sure "run it under a debugger" is necessarily the right level
> to jump to though.  It may be more helpful for the developer to be
> able to alter the command-line options, or run another program before
> the main test program, tweak environment variables, etc.  You can do
> some of it with "gdb --args", because it'll remember the program and
> arguments you supply but not force you to use exactly those arguments
> or launch the program right away, but maybe the developer wants a
> shell prompt, and maybe another debugger isn't so flexible.

What I've got so far is --debug, --stop-before, --stop-after,
--shell-before, and --shell-after.  --stop-after is particularly useful
for attaching to a daemon after it launches.
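
The option names above (and -v, mentioned later in this message) come from
the thread; how they take arguments does not.  The sketch below assumes each
one takes the number of the command it applies to, counting commands in the
order the test script runs them, and uses argparse purely for illustration.
build_parser and hooks_for are hypothetical names.

```python
import argparse


def build_parser():
    """Build a parser for the debugging options, assuming numeric arguments."""
    parser = argparse.ArgumentParser(description="Run one test script.")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="show each command as it is run")
    options = [
        ("--debug", "run command NUM under a debugger"),
        ("--stop-before", "pause for input before command NUM"),
        ("--stop-after", "pause for input after command NUM"),
        ("--shell-before", "start a shell before command NUM"),
        ("--shell-after", "start a shell after command NUM"),
    ]
    for flag, text in options:
        # Assumption: every option takes an integer command number.
        parser.add_argument(flag, type=int, metavar="NUM", help=text)
    return parser


def hooks_for(opts, num):
    """Return (before, after) lists of actions requested for command num."""
    before, after = [], []
    if opts.stop_before == num:
        before.append("stop")
    if opts.shell_before == num:
        before.append("shell")
    if opts.debug == num:
        before.append("debug")
    if opts.stop_after == num:
        after.append("stop")
    if opts.shell_after == num:
        after.append("shell")
    return before, after
```

Under these assumptions, passing "--stop-after 6" would pause right after
the sixth command, which is the point at which a developer could attach a
debugger to a daemon that command started.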

> You could do both... a test-driver script that executes a test
> indicated on the command line, or if there is none, all test scripts
> (based on filename pattern?) in the current directory, or current
> subtree, or something.

I particularly like the value of the developer being able to pluck the
command out of what "make check" runs and modify it with -v or --debug
arguments without having to know much about the test suite.  I don't
think we'll preserve that value if "make check" executes some glorified
for loop which the developer has to understand in order to narrow the
scope to a particular test.

> I'm a bit concerned that interoperability testing, both
> backwards-compatibility and mixed-implementation testing, isn't part
> of the plan; I see it as a serious weakness in our current testing.

Let's not let the perfect become the enemy of the good.  We need a way
to make it easier for "make check" to exercise more of our code.  That's
less likely to happen if we insist on supporting interoperability
testing in the same framework.

We have probably four or five man-years of housekeeping tasks on our
plate, all of them "serious weaknesses" when considered in isolation.  I
definitely want us to have automated interoperability testing, but
that's not what this project is about.

> It should be easy to plug in coverage testing, valgrind, purify,
> debugging mallocs, etc.

It should be easy enough to replicate the valgrind processing you put
into the current dejagnu test suite.
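
One way such a hook could work is to prefix each command's argument list
before it is executed.  This is only a sketch: the function name is
hypothetical, the valgrind options shown are an arbitrary choice, and it is
not a description of what the dejagnu suite does.

```python
def wrap_with_valgrind(argv, logfile=None):
    """Return argv prefixed so that the program runs under valgrind."""
    prefix = ["valgrind", "--leak-check=full", "--error-exitcode=1"]
    if logfile is not None:
        # valgrind expands %p in the file name to the process ID.
        prefix.append("--log-file=" + logfile)
    return prefix + list(argv)
```

Because the wrapper only rewrites the argument list, the same approach would
extend to other tools that are invoked as a command prefix.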