NULL ptr dereferences found with Calysto static checker

Ken Raeburn raeburn at MIT.EDU
Fri Jun 15 18:15:41 EDT 2007


Not that I really wanted to get into this here, but... :-)

On Jun 15, 2007, at 13:14, Marcus Watts wrote:
> So -- most hardware implementations either don't have signed integer
> overflow at all, or possibly have some sort of mechanism to optionally
> enable signed integer overflow, which would not be enabled in most
> C programs.  Some specific examples (if I remember right): the PDP-11
> and 386 don't have signed overflow.  The vax has signed overflow with
> a per-subroutine enable bit (part of the save mask in the first word
> of the subroutine, which is always off for C).  The 68k has
> conditional trap instructions for those languages which want to
> detect signed overflow.  The most interesting current example of C
> influencing machine design is the sparc, which has no bit rotate
> instruction.  I don't know of any practical C implementation that has
> signed overflow exceptions as a matter of practice (except, uhhhm,
> maybe DEC VAX C???) but I'd be very interested in hearing of one.

There are people who argue that undetected signed overflow causes  
some serious and hard-to-detect bugs.  That's probably part of the  
reason for "gcc -ftrapv" being added.

But there's trapping, and there's performing optimizations based on  
the assumption that overflow won't happen because a conforming  
program won't have overflows.
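
As a small sketch of that second case (my example, not from the
thread): a compiler that assumes signed overflow can't happen is free
to fold the test below to "always true", while building the same file
with "gcc -ftrapv" typically makes the addition abort at run time
when x is INT_MAX.

    #include <limits.h>
    #include <stdio.h>

    /* Many compilers rewrite "x + 1 > x" as 1 for signed x, since a
       conforming program never overflows; with -ftrapv the addition
       goes through a checked helper that aborts on overflow. */
    static int check(int x)
    {
        return x + 1 > x;
    }

    int main(void)
    {
        printf("%d\n", check(INT_MAX));
        return 0;
    }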

> Other similar examples: divide by 0 usually produces a trap, but may
> not on all hardware.  Modular math (%) may produce odd results on
> some platforms.  Shifts where the shift count is out of range may
> produce odd results on some platforms -- most commonly the count is
> taken mod 32 or mod 64.  Bit and byte alignment varies.  Calling
> parameter order evaluation varies.  Pointers and long behavior on
> calling parameters vary, as well as the behavior of pointers to
> parameters.  The direction the stack grows in varies on a few
> architectures (at&t 3b2,5,20).  On a few architectures byte alignment
> can be varied in software (ppc,sparc), or stack growth direction may
> be entirely a question of software calling conventions and not
> dictated by hardware (ibm370 and I think also ppc).  Floating point
> math on older machines particularly may have odd behavior.

Yep, I'd like to see more of those dependencies detected by compilers  
or other analysis tools.
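
Here's one concrete case from that list, as a sketch of my own: a
shift count equal to the operand width is undefined in C, and on
hardware that masks the count (x86 takes it mod 32) an unoptimized
build will often just hand back the "mod 32" answer silently.  Tools
like gcc and clang's -fsanitize=undefined can at least flag it at run
time.

    #include <stdio.h>

    int main(void)
    {
        unsigned x = 1;
        int n = 32;             /* count == width of x on most hosts */

        /* Undefined behavior in C; on x86 the hardware masks the
           count to 0, so this commonly prints 1 rather than 0. */
        printf("%u\n", x << n);
        return 0;
    }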

> More divergent examples: 32 & 64-bit word sizes with byte
> addressability is the rule.  16-bit or 8-bit is uncommon except for
> embedded platforms, and a few very embedded platforms can only
> address 16-bit words.  36-bit computing is rare, if not quite
> extinct.  1's complement arithmetic is gone; the most notable past
> example of that was the IBM 7094 architecture.

AFAIK sign-and-magnitude is mostly dead too, aside from maybe some
embedded systems or DSPs, but C allows for it as well as for 1's and
2's complement.

> 	i = ++i + ++i; // ???
>
> I think that should always produce 2i+2, although I'd hate to think
> anybody would care to depend on that.

Why not 2i+3, for example -- increment i, read it (i+1), increment  
again, read it (i+2), add values (2i+3), and store?  But, no:  
"Between the previous and next sequence point an object shall have  
its stored value modified at most once by the evaluation of an  
expression.  Furthermore, the prior value shall be read only to  
determine the value to be stored."  (C '99, section 6.5.)  They even  
give "i = ++i + 1" as an example of undefined behavior.

There are reasons for this.  For example, consider "a = ++*b + ++*c".
For good performance, it's likely to be better to read *b and *c, get
both reads into the pipeline as quickly as possible, and put off the
instructions that use the read values.  If you have to process every
side effect independently and in a specific sequence, and you can't
prove b!=c, you can't evaluate this expression nearly as efficiently.
But since the C standard says you can't modify the same object twice
without an intervening sequence point, the compiler can assume b!=c
and do the optimizations.  Similarly, there are performance reasons
why you wouldn't want to nail down the order of function argument
evaluation -- direction of stack growth, handling of arguments in
registers, stuff like that.
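
As a sketch (mine, not anything from the standard) of the kind of
schedule the rule permits for "a = ++*b + ++*c": both loads can be
issued before either store, which is only correct because b and c are
assumed not to name the same object.

    #include <stdio.h>

    /* One legal schedule for "a = ++*b + ++*c": both loads first,
       side effects and the add afterwards.  Correct only because the
       compiler may assume b and c don't alias here. */
    static int eval(int *b, int *c)
    {
        int vb = *b;                /* both reads issued up front,  */
        int vc = *c;                /* free to overlap in the pipe  */
        *b = vb + 1;                /* stores deferred              */
        *c = vc + 1;
        return (vb + 1) + (vc + 1);
    }

    int main(void)
    {
        int x = 1, y = 2;
        int a = eval(&x, &y);
        printf("%d\n", a);          /* 5; x is now 2, y is now 3 */
        return 0;
    }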

I expect that not wanting to gratuitously make existing C compilers
less compliant with the new standard also factored into it.

Ken


