[Toybox] [PATCH] expr

Daniel Verkamp daniel at drv.nu
Wed Jun 12 20:56:00 PDT 2013


On Wed, Jun 12, 2013 at 7:43 PM, Rob Landley <rob at landley.net> wrote:
> On 06/10/2013 07:56:17 PM, Daniel Verkamp wrote:
>>
>> On Sat, Jun 8, 2013 at 11:22 AM, Rob Landley <rob at landley.net> wrote:
>> > On 06/05/2013 07:29:38 PM, Daniel Verkamp wrote:
>> >>
>> >> On Tue, Jun 4, 2013 at 11:06 PM, Rob Landley <rob at landley.net> wrote:
>> >> >> - Decide what to do about integer overflow (the current code can
>> >> >> execute undefined signed overflow behavior with large inputs; GNU
>> >> >> coreutils expr detects this and prints an appropriate error).
>> >> >
>> >> > What does Posix require?
>> >>
>> >> The expr command description doesn't mention anything about integer
>> >> overflow at all; I don't know if there is some overall POSIX
>> >> requirement that applies.
>> >
>> >
>> > Well, how did _you_ find out about it?
>>
>> The C standard says signed integer overflow is undefined behavior, and
>> a user can trigger such behavior with crafted inputs.  Try `expr
>> 9223372036854775807 \* 2`, for example.
>
>
> So you've never actually seen any real-world anything be inconvenienced by
> this, you're just writing defensive code against a hypothetical the standard
> doesn't require you to handle.

Yep, I brought it up purely from a language-lawyer perspective, and to
avoid someone coming along later and saying "these fools never thought
about integer overflow!"

> By the way, how would you test for overflow? There's no feraiseexcept() in
> integer math that I'm aware of...

There's no nice built-in way to do it; you pretty much have to do the
overflow calculations by hand (e.g. for addition, you can do the math
in unsigned integers, which have defined overflow behavior, and
compare the sign bits of the two addends and the result).  It's not
pretty.

>> Additionally, the GNU Coreutils implementation of expr prints a
>> specific message on overflow.
>
>
> Bash is also a GNU program:
>
> landley at driftwood:~/busybox/busybox$ echo
> $((1000000*1000000*1000000*1000000))
> 2003764205206896640
> landley at driftwood:~/busybox/busybox$ help | head -n 1
> GNU bash, version 4.2.25(1)-release (x86_64-pc-linux-gnu)
>
> The FSF is inconsistent (insane). They write autoconf tests that check the
> sizeof uint32_t. Yes really, I was digging up some old autoconf complaints
> recently because somebody was insane enough to try to defend it:
>
>   http://landley.net/notes-2010.html#20-08-2010
>   http://landley.net/notes-2010.html#09-08-2010
>   http://landley.net/notes-2011.html#26-10-2011
>   http://landley.net/notes-2011.html#06-09-2011
>
> The reason toybox exists (and busybox before it) is because the GNU versions
> are overengineered paranoid crap, punitively licensed. "GNU does it" is
> never by itself a reason for anything.

I hear you.  I only brought up the GNU version since it's probably the
only version that any Linux-related scripts have been tested with.

As an example (unrelated to overflow), I grepped through the Linux
kernel source tree and found some uses of the non-POSIX "match" and
"index" operators (e.g. scripts/decodecode), so they're obviously not
too concerned about using behavior outside the standard.

>> > Absent the standard explicitly requiring something, the next thing I
>> > care
>> > about is real world users. Who will _not_ doing it inconvenience?
>>
>> At the very least, it's within the rights of the compiler to generate
>> code that aborts or crashes on overflow, and the potential for nasal
>> demons (or at least incorrect results) is probably enough to warrant
>> fixing it.
>
>
> Once the nasal demons get typecast to a 64 bit integer they're unlikely to
> go far. And any C environment that crashes on integer overflow can't run the
> Linux kernel, so I'm not sure we care? (They intentionally initialize the
> timer ticks to overflow around 120 seconds after boot, to force everything
> to handle timer wraparound properly.)

Well, that's probably *unsigned* integer overflow, which is defined in
the C standard to wrap around.  In expr's case, we're talking about
signed integer overflow, which is what's undefined in C.

Either way, I don't know of any real machines that can run Linux and
actually crash on integer overflow (though I am only really familiar
with x86).

> Ok, now I'm curious, grep -i overflow in busybox/coreutils/expr.c:
>
>                 /* Don't interpret the empty string as an integer.  */
>                 /* Currently does not worry about overflow or int/long
> differences. */
>                 i = STRTOL(v->u.s, &e, 10);
>
> Code has been in the tree for:
>
> commit 1b355ebba68bdd567dd3961a18291dfd9532c2e8
> Author: Eric Andersen <andersen at codepoet.org>
> Date:   Tue Sep 5 17:37:48 2000 +0000
>
>     Added expr, from Edward Betts <edward at debian.org>, with some fixups
>     and docs added by me.
>      -Erik
>
> Coming up on 13 years. And the closest anybody came to complaining about
> "overflow" was:
>
> commit a7f4e4bbd8d7a47a49404d28bc07ab3b5dc1c19b
> Author: Denis Vlasenko <vda.linux at googlemail.com>
> Date:   Wed Apr 2 20:24:09 2008 +0000
>
>     expr: fix comparisons 'a < b' where we were overflowing a-b
>     (not to mention that we used int, not arith_t). closes bug 2744.
>
> Not because they cared about the result of math that overflowed, but because
> comparison was producing wrong results on numbers within the storable range.
>
>
>> I presume most real-world uses of expr don't cause overflow, since it
>> doesn't produce any useful results, but I suppose there could be
>> scripts that depend on the "return error on overflow" behavior of GNU
>> expr in some way.
>
>
> There could be scripts that depend on non-posix behavior, yes. And until I
> see an example of one, I can't judge whether they should fix their broken
> code or we should humor them.
>
>
>> Perhaps the easiest option would be to enable GCC's -ftrapv option,
>> which causes signed overflow to abort intentionally;
>
>
> Hell no.
>
> I'm confused: you list as a potential downside that expr might crash. So the
> solutions are to add an error_exit() to force expr to crash, or to add a
> compiler flag to force expr to crash. And this improves matters how?

Only better in that it will definitely abort rather than doing some
other undefined thing; this is not better than crashing because of
undefined behavior, but at least it would be consistent.
I mostly added this suggestion as a "technically correct" option.

>> this would not be
>> very user friendly, but it's probably better than generating wrong
>> answers.
>
>
> Garbage in, garbage out. They get to keep the pieces.
>
> A couple posts back you were ok with not always doing 64 bit math, which is
> easy and known to help produce more right answers. Now you're advocating for
> something that isn't necessarily easy (no feraisexcept() for integer math),
> doesn't help produce more right answers.

That's a fair point.  The main reason I used 'long' (rather than
forcing 64-bit math) is that POSIX seemed to indicate that expr need
not support more than the precision of the C 'long' type, and since
64-bit math presumably generates bigger/slower code on 32-bit targets.

> In a larger context, the problem you want to address is that on overflow it
> produces incorrect output (because C math does), so you'd like it to produce
> a different kind of incorrect output, on the theory that somebody is going
> to check the return code of expr in a script.

It's definitely a corner case, and the corner is hidden behind some
cobwebs under a staircase in an abandoned house on a hill.

[...]
>
> I'm not saying I'm against handling it, I'm saying you have yet to make a
> compelling case for it.
>
> I note that I probably need to add an arbitrary precision math library to
> implement bc, which is in posix and required to build an unmodified linux
> kernel, so if we're GOING to care about overflow we could always just avoid
> overflowing.
>
> But until them, overflow doesn't really bother me if it doesn't bother
> posix, busybox, or bash.

I doubt there is a compelling case for it from a user's point of view,
and I'm in agreement that it is probably not worth doing anything
about.  I just wanted to be sure there was no false expectation that
this expr implementation did anything useful on overflow, and it
sounds like we've definitely got that covered now. :)

Please find attached the (entirely trivial) patch for 64-bit integers
in expr, which pretty much avoids this problem for all sane users.

> Rob

Thanks,
-- Daniel Verkamp
-------------- next part --------------
A non-text attachment was scrubbed...
Name: expr-64bit.patch
Type: application/octet-stream
Size: 1062 bytes
Desc: not available
URL: <http://lists.landley.net/pipermail/toybox-landley.net/attachments/20130612/434c946b/attachment-0004.obj>


More information about the Toybox mailing list