NULL ptr dereferences found with Calysto static checker

Marcus Watts mdw at spam.ifs.umich.edu
Fri Jun 15 13:14:36 EDT 2007


Ken Raeburn <raeburn at mit.edu> writes:
> 
> Date:    Fri, 15 Jun 2007 05:29:49 EDT
> To:      "Domagoj Babic" <babic.domagoj at gmail.com>
> cc:      kerberos at mit.edu
> From:    Ken Raeburn <raeburn at mit.edu>
> Subject: Re: NULL ptr dereferences found with Calysto static checker
> 
> On Jun 15, 2007, at 04:38, Domagoj Babic wrote:
> >> > int f(int a, int b) {
> >> >    int a1;
> >> >    if (a%2) { a1 = a + 1; }
> >> >    else { a1 = a; }
> >> >    int c = a1 * a1;
> >> >    assert(c % 4 == 0);
> >> > }
> >>
> >> If I felt like being picky, I'd point out that overflow in signed
> >> integer arithmetic in C produces undefined results according to the
> >> standard, so the assertion may not be met. :-)
> >
> > If the addition overflows, a1+1=0, hence 0*0%4=0. The assertion is
> > always true. Calysto and Spear preserve the properties of 2's  
> > complement
> > (bit-vector | modular arithmetic).
> 
> Actually, in two's complement with wrapping signed arithmetic, the  
> overflow case would be a=INT_MAX and a1=a+1=INT_MIN, which would  
> still be even.
> 
> But the C spec doesn't require two's complement, for historical  
> reasons, or wrapping signed arithmetic.  For unsigned types the  
> result of overflow is defined to wrap (modular arithmetic), but for  
> built-in signed types, computing a result outside the range of the  
> type can result in a trap or other behavior.  (Sorry to harp on it,  
> but lately I've spent too much time in a forum where people think  
> they can tell you exactly what behavior expressions like
> "i = ++i + ++i" are required to produce.)
> 
>  From what I've read, modern versions of GCC have optimizations that  
> take advantage of this undefined behavior to make transformations  
> that speed up the code but don't give the wrap-on-signed-overflow  
> result.  So, actually, proving whether or not signed arithmetic can  
> overflow may also be useful...
> 
> Ken

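Working Ken's correction through with unsigned arithmetic (where
wrapping is actually defined, and produces the same bit patterns as
wrapping signed arithmetic) -- a minimal sketch, assuming 32-bit int:

	#include <limits.h>
	#include <stdio.h>

	int main(void)
	{
	    /* a = INT_MAX is odd, so a1 = a + 1 = 2^31 -- the bit
	     * pattern of INT_MIN -- which is even.  a1 * a1 = 2^62
	     * wraps mod 2^32 to 0 (assuming 32-bit int), so the
	     * assertion c % 4 == 0 still holds. */
	    unsigned a  = INT_MAX;
	    unsigned a1 = (a % 2) ? a + 1 : a;
	    unsigned c  = a1 * a1;
	    printf("a1 = %u, c %% 4 = %u\n", a1, c % 4);
	    return 0;
	}
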
C doesn't specify a lot of this behavior -- but: hardware was already
starting to converge on common behavior before C was big, early C
implementations followed a common envelope of behavior, and now that C
is big, very few hardware implementations are going to go outside
that envelope.

So -- most hardware either doesn't trap on signed integer overflow at
all, or has some sort of mechanism to optionally enable overflow
trapping, which would not be enabled in most C programs.  Some specific
examples (if I remember right): the PDP-11 and 386 don't trap on signed
overflow.  The VAX has signed overflow trapping with a per-subroutine
enable bit (part of the save mask in the first word of the subroutine,
which is always off for C).  The 68k has conditional trap instructions
for those languages which want to detect signed overflow.  The most
interesting current example of C influencing machine design is the
SPARC, which has no bit rotate instruction.  I don't know of any
practical C implementation that takes signed overflow exceptions as a
matter of practice (except, uhhhm, maybe DEC VAX C???) but I'd be very
interested in hearing of one.
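
To make Ken's point about GCC concrete, here's a minimal sketch (the
function names are mine) of the classic overflow test that
wrap-on-overflow hardware makes tempting, and that an optimizer may
legally delete because signed overflow is undefined:

	#include <limits.h>
	#include <stdio.h>

	/* Tempting on two's complement hardware, but when x == INT_MAX
	 * the expression "x + 1" overflows, which is undefined in C, so
	 * an optimizer (e.g. gcc -O2 without -fwrapv) may assume that
	 * x + 1 > x always holds and fold the test to 0. */
	static int inc_would_overflow_risky(int x)
	{
	    return x + 1 < x;
	}

	/* Well-defined: compare against the limit instead of actually
	 * performing the overflowing operation. */
	static int inc_would_overflow_safe(int x)
	{
	    return x == INT_MAX;
	}

	int main(void)
	{
	    printf("risky: %d  safe: %d\n",
	           inc_would_overflow_risky(INT_MAX),  /* 0 or 1 */
	           inc_would_overflow_safe(INT_MAX));  /* always 1 */
	    return 0;
	}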

Other similar examples: divide by 0 usually produces a trap, but may not
on all hardware.  Modular math (%) with negative operands may produce
odd results on some platforms.  Shifts where the shift count is out of
range may produce odd results on some platforms -- most commonly the
count is taken mod 32 or mod 64.  Bit and byte alignment requirements
vary.  The order in which calling parameters are evaluated varies.  How
pointers and longs are passed as calling parameters varies, as does the
behavior of pointers to parameters.  The direction the stack grows in
varies on a few architectures (AT&T 3B2, 3B5, 3B20).  On a few
architectures byte order can be varied in software (PPC, SPARC), or
stack growth direction may be entirely a question of software calling
conventions and not dictated by hardware (IBM 370, and I think also
PPC).  Floating point math, particularly on older machines, may have
odd behavior.
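
Two of those quirks in a minimal sketch (both results are undefined or
implementation-defined, so what actually happens varies by machine and
compiler):

	#include <limits.h>
	#include <stdio.h>

	int main(void)
	{
	    volatile int n = 32, x = 1;  /* volatile defeats constant folding */

	    /* Undefined: shift count >= the width of int.  x86 masks
	     * the count to 5 bits, so this often prints 1; other
	     * machines may print 0 or something else. */
	    printf("1 << 32 -> %d\n", x << n);

	    /* INT_MIN / -1 (and hence INT_MIN % -1) overflows; on x86
	     * it raises the same trap as a divide by zero. */
	    volatile int a = INT_MIN, b = -1;
	    printf("INT_MIN %% -1 -> %d\n", a % b);
	    return 0;
	}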

More divergent examples: 32- and 64-bit word sizes with byte
addressability are the rule.  16-bit or 8-bit is uncommon except for
embedded platforms, and a few very embedded platforms can only address
16-bit words.  36-bit computing is rare, if not quite extinct.  1's
complement arithmetic is gone; the most notable past examples were the
UNIVAC 1100 series and the CDC 6000 series.

A few modern machines (including i386) have special vector modes which
can do "saturated" math instead of wrapping on integer overflow.  This
is probably mainly of interest for audio or image processing, which is
fortunate for Domagoj Babic's overflow example, since saturation (where
INT_MAX + 1 stays at INT_MAX, an odd number) would break its reasoning.
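
For illustration, a minimal portable sketch of what saturating addition
means (the vector hardware does this per element in a single
instruction; sat_add is my name for it):

	#include <limits.h>

	/* Saturating signed add: clamp at the type's limits instead of
	 * wrapping.  Note that INT_MAX is odd, so a saturated a + 1
	 * would break the "a1 is even" reasoning in the example above. */
	static int sat_add(int a, int b)
	{
	    if (b > 0 && a > INT_MAX - b)
	        return INT_MAX;   /* would overflow upward */
	    if (b < 0 && a < INT_MIN - b)
	        return INT_MIN;   /* would overflow downward */
	    return a + b;
	}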

In practical terms, i386 and x86_64 are clearly winning for general
purpose computing.  ARM and PowerPC survive in certain niches, and a few
others aren't quite dead yet.

	i = ++i + ++i; // ???

I think that should always produce 2i+2 in practice, although formally
it's undefined behavior (i is modified twice with no intervening
sequence point), and I'd hate to think anybody would care to depend on
it.  All the stuff on the right of the = has to happen before the store
on the left, ++ *should* happen before its result is used, and the
order of evaluation of identical operands won't affect the ultimate
result.  Now, in
	i = i++ + --i;
the order would matter.
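
Both forms in compilable shape, for anyone who wants to experiment
(gcc's -Wsequence-point warning, part of -Wall, flags both, and
different compilers or optimization levels may print different
numbers):

	#include <stdio.h>

	int main(void)
	{
	    /* Each statement modifies i twice with no intervening
	     * sequence point, so the behavior is formally undefined. */
	    int i = 1;
	    i = ++i + ++i;
	    printf("++i + ++i: %d\n", i);

	    i = 1;
	    i = i++ + --i;
	    printf("i++ + --i: %d\n", i);
	    return 0;
	}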

				-Marcus Watts


