[Toybox] [landley/toybox] expr: fix memory management (#148)

enh enh at google.com
Tue Nov 12 08:07:28 PST 2019


at the same time, this patch does fix a demonstrable bug in what's
already checked in.

(though it was a mistake not to include a new test, _especially_ if
this is likely to be rewritten anyway.)

fwiw, i don't think expr is all that similar to the shell stuff. it's
a pretty small subset on the one hand, but extended on the other.
these in particular are not shell-like:

  match STRING REGEXP        same as STRING : REGEXP
  substr STRING POS LENGTH   substring of STRING, POS counted from 1
  index STRING CHARS         index in STRING where any CHARS is found, or 0
  length STRING              length of STRING
  + TOKEN                    interpret TOKEN as a string, even if it is a
                               keyword like 'match' or an operator like '/'

(and the reason i know these exist is because some of them are on my
"stuff i'd need to move AOSP to toybox expr" list :-( )

On Mon, Nov 11, 2019 at 9:33 PM Rob Landley <rob at landley.net> wrote:
>
> On 11/11/19 9:41 AM, enh-google wrote:
> > *@enh-google* commented on this pull request.
> >
> > --------------------------------------------------------------------------------
> >
> > In toys/pending/expr.c
> > <https://github.com/landley/toybox/pull/148#discussion_r344771665>:
> >
> >>
> > -  if (TT.refree) free(TT.refree);
> > +  toys.exitval = status;
> >
> > why not just set |toys.exitval| directly on L266?
>
> The reason I haven't applied this patch yet is I need 95% of this plumbing for
> toysh to implement $(( )) and (1<3) && echo hello, and what's there won't do
> that, so if I have to write a new one anyway the new one can probably also
> implement expr the command.
>
> The shell $(( )) and ( ) versions need the ability to assign environment
> variables because it turns out:
>
>   $ echo $((x=3)); echo $x
>   3
>   3
>
> And THEN there's this fun tidbit:
>
>   $ y=-4; echo $((x=y)); echo $y
>   -4
>   -4
>
> And you'll notice I didn't export y, meaning it needs to understand the
> difference between local and exported variables and thus must read shell
> context. And here's a fun one:
>
>   $ z=z
>   $ echo $((z=z))
>   bash: z: expression recursion level exceeded (error token is "z")
>
> Plus the evaluation includes shell internal variables like $_:
>
>   $ expr 1 + 2x
>   expr: non-integer argument
>   $ echo $((1+_))
>   bash: 2x: value too great for base (error token is "2x")
>
> Which means expr needs to be able to run NOFORK from toysh (not as a seperate
> process), so error_exit("non-integer argument") is out.
>
> As for the original expr.c implementation, I continue to wonder why it coverts
> "=" into enum EQ so it can have a case EQ because having the same information in
> three places that need to be kept in sync when you could just have an if
> (strcmp(s, "=")) { do the thing } staircase directly consuming non-integer
> tokens so each operator exists in one place and it just checks them in
> precedence order? And then there's the question of why have a struct with two
> representations of the same data (integer and string) when we have a function to
> convert strings to integer and back at will? Again, maybe there's a reason it
> can't do that either, but I _did_ implement a string equation parser in Java for
> quest multimedia back when I was fresh out of college and it wasn't _that_ hard.
>
> And THEN there's the bash man page's "arithmetic expression" section having x++
> and --x and ** exponentiation and bit shift operators and ? : conditionals and
> 11 different assignment operators in the %= and >>= genre...
>
> The main reason I hadn't opened the expr.c can of worms before (other than posix
> having broken the expr spec) was wondering if it should unlimited length math
> instead of 64 bit back before bc.c showed up, and every time I looked at things
> like libtommath it got brain-hurty and undocumented fast.
>
> Anyway, expr.c is in my bucket of "I don't expect I can salvage this, I may need
> to write a new one from scratch". Especially since the _logcal_ thing to do is
> put a function in lib/lib.c that consumes argv/argc inputs (because expr "1 + 2"
> prints "1 + 2" output as a string while $(( )) will parse it) which means moving
> struct sh_arg to lib/lib.h as struct argvc or something, and changing sh.c to
> use that. (Or moving the expr implementation into sh.c but I haven't done that
> with test.c and don't want another ps.c on my hands...)
>
> Every time an email about this wanders by I go "I should bump expr.c closer to
> the top of my list and just _do_ it to moot this patch", but it's _probably_
> about 3 full days to design and implement the infrastructure I actually _want_
> here and I've been kinda busy (toybox ain't my day job, nobody wants to pay for
> that) and am not scheduled to return to Austin until the 22nd...
>
> Sorry,
>
> Rob
>
> P.S. There's a nonzero chance I've written basically this email about expr.c at
> least twice before. Yup, http://landley.net/notes-2016.html#19-04-2016 is one.
> And I'm aware it's been in pending forever
> (http://lists.landley.net/pipermail/toybox-landley.net/2013-June/005975.html is
> from 2013). See David Graeber's example of the guy whose job is to apologize for
> the carpenter not having gotten to your todo item yet, I'm aware it indicates a
> structural problem with toybox development, but this isn't "paint my fence" this
> is halfway to https://en.wikipedia.org/wiki/Mechanical_watch territory and
> involves designing the pieces that need to be fit into the space, and
> _meanwhile_ I'm off working on j-core and gps stuff again because they pay my
> bills and this doesn't so I haven't even looked at toysh for a couple 2 weeks now.
> _______________________________________________
> Toybox mailing list
> Toybox at lists.landley.net
> http://lists.landley.net/listinfo.cgi/toybox-landley.net



More information about the Toybox mailing list