NULL ptr dereferences found with Calysto static checker

Domagoj Babic babic.domagoj at gmail.com
Thu Jun 7 22:06:00 EDT 2007


Hi,

Ken, thx a lot for your feedback.

On 6/7/07, Ken Raeburn <raeburn at mit.edu> wrote:
> Thanks.  I hadn't looked at Calysto yet, but I've been wanting to get
> some static analysis tools applied to the code.

Good, so there's a mutual interest in getting it done :-)

> > + krb5-1.6/src/util/support/fake-addrinfo.c:1336
> > krb5int_getaddrinfo is a function with external linkage, which calls
> > getaddrinfo (fake-addrinfo.c:1097), passing aip as the fourth
> > parameter
> > (received as **result). Without checking whether result is NULL or
> > not,
> > getaddrinfo passes it to system_getaddrinfo (which is actually
> > getaddrinfo in netdb.h). system_getaddrinfo can set result to NULL if
> > the system is out of memory. Code at
> > krb5-1.6/src/util/support/fake-addrinfo.c:1143 dereferences result,
> > without checking it.
>
> system_getaddrinfo could set *result to NULL, not result itself,
> which was passed in by a caller.  The line after the
> system_getaddrinfo(..., result) call checks for an error return or a
> null pointer stored at *result.  If the system getaddrinfo runs out
> of memory, it should return a non-zero result, so we won't bother
> looking at what it might have put in *result.  And it's an error for
> the caller to provide a null pointer as the result argument; that's
> not something I think we need to check, any more than I'd expect
> strcpy to check for null pointers.

Thx for clarification - Calysto can't analyze the code that's not
compiled into the same module. So, to be on the safe side it has to
interpret all pointers passed to external functions as undefined. And
since getaddrinfo is external - it has no clue what it returns or how
result is initialized. Of course, this can be added as an additional
constraint, but I have done that only for the functions that most
frequently cause false positives.

> > + krb5-1.6/src/util/support/gmt_mktime.c:54 krb5int_gmt_mktime is a
> > function with external linkage, dereferences parameter t without
> > checking it.
> >
> > + krb5-1.6/src/util/support/errors.c:155 same as above, for parameter
> > ep.
>
> A lot of these functions, while exported from the support libraries,
> are considered internal functions, so there's not a lot of
> documentation, but often there are arguments that are not allowed to
> be null; if the caller passes a null, a crash is considered an
> acceptable response to this bug in the calling function.  I think all
> the functions you list in util/support are of that flavor, but I'll
> skim them a little more carefully later.  If there are places where
> these functions can get called with null pointers, that'd be more
> interesting.

Again, there's no way for a program to figure out the intentions. Having
external linkage, functions need to be considered as 'callable from
anywhere'. It seems that none of those functions even get called (at
least in my binary).

> There are functions in the public API where it'd be nice to check for
> null and return some nice error code, where we don't do that already,
> I expect.  I'm curious how Calysto works, that it reported these in
> the support library, but not some of the libkrb5 functions, like
> krb5_mk_req for example, which doesn't check that the context
> argument is non-null before using it.  (Or did you remove some of
> these already?)

Here's what Calysto does: If a function is called (no matter external or
internal), it's analyzed in all the contexts in which it is called. If
the function is external and not called anywhere, it's analyzed as one
of the roots of the callgraph.

This behaviour is trivial to change. I could easily modify Calysto to
consider any function with external linkage to be a potential root of
the call graph. However, this would produce a ton of warnings. Such a
change would have sense if:

1) You make sure that all the functions that should have internal
linkage really have internal linkage (this will reduce the number of
warnings significantly).
2) You are willing to help me out analyzing the results.


Another option is that you provide a list of functions which need to be
checked as having 'external linkage' and must be fault-tolerant.

> Ah... but krb5_mk_req isn't called by anything else in libkrb5.  And
> if you're looking across library boundaries, functions like
> krb5int_getaddrinfo are called from code in libkrb5.

Sorry, can't look across the module boundaries at the current point.
There's one way to hack around this - use llvm-link to link the produced
binaries, but for krb, I get only one binary produced anyways (libkrb).

> I'm not sure that last constraint is entirely logical.  If F is
> called from another function in the same library with a non-null
> argument, but is exported from the library and is intended to support
> a null argument as well, it'd be useful to know if it's dereferencing
> without checking.  Of course, whether the function is supposed to get
> a null pointer is less a matter of the code than of the
> specification, so heuristics are the best you're going to be able to do.

It is just a simple heuristic to decrease the number of false positives
- if the context is available, Calysto always takes it into account. As
said before, really not a big deal to change this behaviour.

> If Calysto is figuring out for itself which pointer arguments are
> required to be non-null and which are allowed to be null, I'd be
> curious to see those results.

Calysto is fully path- and context-sensitive. So it can easily figure
out that a function checks a pointer, and that it's fine to pass NULL.
However, Calysto was not build to give such statistics or OK certain
values of parameters - it has been built to find bugs. If there will be
enough interest in the future, I might add other capabilities.

Currently, Calysto checks NULL-ptr dereferencing and user-provided
assertions.

> > None of the functions mentioned above seem to be called from any
> > other function in the compiled binary (compiled with llvm-gcc
> > http://llvm.org/ ), although in the source I see that some are called
> > from the code that didn't end up in the binary for some reason.
> > Hence, Calysto assumes that those functions are library-like
> > functions.
>
> Which binary?  We build a number of them.  If the results are only
> from one of them, that might explain the missing functions.

llvm-gcc (with my linker) produces only one binary -
libkrb5.support.so.0.1.bc

How many should there be?

> If you have to pick one to analyze, the krb5kdc or kadmind binaries
> are probably best.  (But if you find any problems there that look
> like they could crash the programs and possibly be triggered
> remotely, please report them to krbcore-security at mit first, rather
> than this list, so we have a chance to put out an update before
> script kiddies start crashing KDCs everywhere...)

Will do.

> Another reason to suspect a bunch of issues have escaped your
> analysis... ;-)

The most likely culprit is the front-end - currently LLVM doesn't offer
a good support for compiling apps to bc files, which Calysto takes as
input, so I had to hack my own linker, which again does only what
llvm-gcc asks it to do. If you could list the core binaries and their
respective source directories, I could try to hack the compilation to
produce the required binaries.

Another possible approach (that I've done with sendmail) is to compile
all sources individually, and link them in a knapsack manner -
maximising the number of object files linked together without linkage
conflicts.

Regards,

-- 
        Domagoj Babic

        http://www.domagoj.info/



More information about the Kerberos mailing list