[Toybox] Graff and bc

Gavin Howard gavin.d.howard at gmail.com
Thu Mar 15 09:47:18 PDT 2018


Before I answer your questions, I have good news: in the midst of
fixing bugs, I was still able to reduce the line count, even though I
had to add quite a bit to fix bugs.

Non-comprehensive list of bugs fixed:

* Number printing
* String printing
* Various math bugs
* Recursion stomping on local variables
* Exit was not happening correctly

Non-comprehensive list of improvements to reduce loc:

* Removed useless BcVar and BcArray typedefs.
* Removed the split between params and autos. (I just store the number
of params now.)
* Put bc_vm_free() under CFG_TOYBOX_FREE. That also puts
bc_program_free(), bc_parse_free(), and bc_lex_free() under it, since
all of those are called in bc_vm_free().

I also improved error messages according to your internationalization section
on your website.

On Wed, Mar 14, 2018 at 8:43 PM, Rob Landley <rob at landley.net> wrote:
> On 03/13/2018 03:24 PM, Gavin Howard wrote:
>> I am about to try to implement your changes to bc_exec (and maybe
>> bc_vm_init) in my release script. While doing so, if you would like,
>> it would make sense to try to remove the BcStatus enum and replace it
>> with either #defines (chars, so I can assign right to toys.retval) or
>> something else of your choice. What would you like me to do?
>
> Do we ever care about a bc command return status other than "nonzero"? Because
> if so, you might as well just have error strings?

Technically, POSIX has no requirements here, other than returning
non-zero, but more on this later.

> Let's see...
>
> grep -o 'BC_STATUS_[_A-Z]*' toys/*/bc.c | sort | uniq -c | sort -n
>
> 6 of them have an occurrence count of 1, which means they're never used (in the
> enum, never referred to elsewhere.)

Some of those were supposed to be used. I fixed that. One was useless
for toybox. I changed the release script to remove it.

> 32 of them have an occurrence count of 2: used exactly once.
>
> 5 occur 3 times, 4 occur 4 times, 5 occur 5 times, and so on up. So several
> occur repeatedly, and I'd have to look at why...

The biggest reason there are so many is because I want better error
messages than the GNU bc, which can be cryptic.

The bc spec defines a lot of places where errors could occur, but then
it says that only two or three errors are necessary. The GNU bc seems
to have taken it at its word, but I put in an error for every single
one, in an attempt to make it easier on users to write and debug
scripts, as well as use the bc as a REPL.

> I'm still not sure the difference between LEX_EOF and PARSE_EOF, one is used 17
> times and the other 11 but you're converting one into the other in some places
> here...

I have already removed PARSE_EOF in my PR. It was legacy.

> Of course STATUS_SUCCESS is used 91 times. MALLOC_FAIL is 19 (see virtual vs
> physical memory rant)...
>
> tl;dr: I need more context and code review before forming a strong opinion about
> the right thing to do here, and if I start I'll just start chipping away at
> low-hanging fruit again. Waiting for you to come to a good stopping point first...
>
> But I _suspect_ you can probably deal with the errors without having a lookup
> table of signals between error producer and error consumer that's your own code
> on both sides. That's the kind of thing that becomes clear once enough
> overgrowth is cleaned up that the pieces get closer together.

You may be right, but I do not think so. Yet.

I chose to use a table lookup because of the requirement to print
error messages. The reason was that I wanted only one place where
errors were handled, or at least, few places. That is why I wrote the
print wrappers for error messages. And there was another reason: bc
has to choose to exit on error if not interactive, or to just display
the error message.

If I had not done that, I would have had to write something like:

if (error) bc_error("Some error message")

And in bc_error,

void bc_error(char *msg) {
  printf(msg);
  if (!bc_interactive) exit(1)
}

I mean, that's more than feasible, I admit it. And it would virtually
eliminate the 202 instances of

if (status) return status;

 But it would sprinkle error messages throughout the code. Putting
them all in one place, corresponding to a specific labeled error is
clearer in the source code, at least to me.

The other reason I used a lookup table with a error producers and an
error consumer (BcVm) was so that I could clean up properly on exit.
No you don't want to do that (see above, putting bc_vm_free() under
CFG_TOYBOX_FREE), but there are cases when it *is* necessary, as
evidenced by the very existence of CFG_TOYBOX_FREE.

Of course, I could still clean up on exit with a bc_error() like above
if I could make the one instance of BcVm global, but I don't want to
do that, and I don't think you would want me to do it either. If I am
wrong, let me know, along with your reasons. I am open to changing it
for the right reasons. One of those might be the fact that you want to
eliminate

if (status) return status;

instances. However, even in that case, I think that I would like to
still use a lookup table, to avoid sprinkling them throughout the
code.

But who knows?

Gavin Howard



More information about the Toybox mailing list