[Toybox] [PATCH] Fix the operator precedence in expr.

Andy Chu andychup at gmail.com
Sat Mar 19 23:32:14 PDT 2016


Rob, not sure if you're done with this yet, but the strategy for
freeing here doesn't quite work:

https://github.com/landley/toybox/commit/0ec95b7c2e20cd7be33bae6adba20bf89c5f3e86

I thought about exactly this: just free in eval_op().  But there are a
few problems -- for example:

1) re() allocates a string for the match
2) You free it in eval_op() free(v->s).  You still have a struct
value, but it has an invalid pointer.
3) You use the returned string for another op

Basically something like:

expr foo : '\(.+\)'  == foo

There are other possible bugs with that scheme too, e.g. involving
coercion.  ASAN easily flags the memory leaks and regex tests fail.
With the strategy I sent, it doesn't detect any memory leaks.

(I will send a patch for adding make asantest, make asantest_expr,
etc. which is on top of my test harness fixes.)

-----

Also, while your mind is on expr, and since you mentioned $(()) in
bash ... I happened to look at the mksh source code (Android shell),
and it uses the same algorithm for their $(()).  This is in contrast
coreutils and busybox expr which use recursive descent.

https://github.com/MirBSD/mksh/blob/master/expr.c#L404  (loop that
breaks on precedence)

I guess the main difference is that there are unary operators there
like -- and ++, and a ternary ? : operator.  I guess it's not much
info but I thought it was interesting! :)

Andy


On Wed, Mar 16, 2016 at 6:20 PM, Andy Chu <andychup at gmail.com> wrote:
>> I applied your two patches and I've done one round of cleanup on top of
>> them so far. Then I started replacing "TT.tok = toys.optargs++" with
>> directly incrementing TT.tok and I introduced a bug with parentheses
>> management and got myself confused and went to do something else for a
>> while and haven't gotten back to it yet.
>
> Yeah it's a little fiddly, but elegant when you get used to the
> algorithm's logic.  To fix the misleading error message on expr \( \),
> try this.
>
> +   if (!strcmp(TT.tok, ")")) {      // an atom can't start with )
> +     error_exit("Unexpected )");
> -   if (!strcmp(TT.tok, "(")) { // parenthesized expression
> +   } else if (!strcmp(TT.tok, "(")) { // parenthesized expression
>
> The entry into eval_expr() expects an "atom", which is either
> something like "1" or "a" or a parenthesized expression like ( 1 + 2
> ).
>
> Note that expr \* is valid -- it treats it like a literal string.  So
> expr \) was ALSO valid.
>
> expr \( \) \) was valid  -- it is a string inside parentheses.  Thus
> expr \( \)  was being parsed as OPEN PAREN, STRING, and then ERROR,
> expected closing \).  So it was technically correct, though very
> confusing  This tiny patch will disallow things like expr \) and expr
> \( \) \).
>
> I actually had this check in my original version, but removed it
> because of minimalism, but didn't realize that confusing case.
>
>>> I wanted to bring up the one TODO in the patches... syntax_error()
>>> defaults to printing "syntax error", but the caller passes a more
>>> detailed error message, which can be enabled with "if (1)".
>>
>> This morning's cleanup pass replaced the calls to syntax_error() with
>> calls to error_exit(), making it always print the more detailed error
>> message.
>
> OK awesome!  Makes sense.
>
>>> That would make it clearer that + is both a unary and binary operator,
>>> while - is only a binary operator.
>>
>> So my help text is wrong and not _all_ operators are infix.
>>
>> Sigh...
>
> No your help is correct!  I didn't implement what GNU expr does (nor
> did existing code).  It "puns" + as a unary escaping operator (in
> addition to infix addition).
>
> I wrote about this in my message about find / expr / test.
>
> It's because GNU has "match" / "substr" operators, which we don't
> have.  This is the problem:
>
> $ expr foo = foo
> 1
>
> $ expr match = match
> expr: syntax error  # it's expecting arguments to the operator "match"
>
> $ expr + match = + match  # escape match operator as string
> 1
>
> I think the command is OK as is -- it's POSIX conformant... It's
> possible we could run into something in Aboriginal LInux that uses the
> match / substr operators, in which case we could add them and the
> annoying + unary escape.
>
> Andy

 1458455534.0


More information about the Toybox mailing list