[Toybox] Graff and bc

Rob Landley rob at landley.net
Mon Mar 12 15:21:47 PDT 2018


On 03/12/2018 03:43 PM, Gavin Howard wrote:
> Rob and all,
> 
> I am the gdh that Christopher Graff was referring to
> here: http://lists.landley.net/pipermail/toybox-landley.net/2018-March/009413.html.
> I have a few notes.

I am sympathetic to the drama happening off-list, but I have plenty of my own
thanks. I got a code submission that looks workable. My choices are to take it
or not to take it.

> First, Graff's library was incompatible with the POSIX bc spec in subtle ways,
> and I was getting frustrated trying to work with him. So on a day that I was
> frustrated, I attempted to write an implementation to parse, print, and add.
> After discovering that I could do it, I wrote my own.
> 
> Second, by the time you had sent me the link to the library, I had already
> completed my own. Also, POSIX requires decimal, while that library is for
> binary. The rounding modes are also incompatible. That's why I cannot use it.
> 
> Third, in the middle of your message to Graff, you said that, because it's so
> big, you would change my submission to be unrecognizable before accepting it.

It's already in, merged and pushed, from your first pull request. (Merged it
yesterday, didn't push until today.)

> I understand that it's big, but please consider the fact that I implemented a
> complete programming language and virtual machine in 9000 lines of code. And I
> did it with comprehensive error checking. Asking for much less is (in my
> opinion) not entirely fair.

I intend to do a series of cleanup passes, as I generally do to every command in
pending. I intend to check them in one at a time, in reasonably bite-sized
chunks, and write up a description of what I did each time. Just like the series
linked to on https://landley.net/toybox/cleanup.html

Some of it's coding style issues:

$ grep typedef toys/pending/bc.c | wc
     34     135    1129
$ grep typedef toys/other/*.c | wc
     0      0      0

And some is because I expect I honestly can make it smaller and simpler.

> Now, there are some blank lines that I could reduce. (For example, I could
> remove blank lines between function calls and the line that checks for errors.)
> However, I have been very careful to not put in any more functional loc than was
> absolutely required. A big portion of what was required was to implement GNU
> extensions, the vast majority of which are required for the timeconst.bc script
> in the Linux kernel.

I'm not complaining! It's a good bc submission. It'll take me a while just ot
review it. However, when I did "make bc" I got a lot of:

toys/pending/bc.c: In function 'bc_vm_execStdin':
toys/pending/bc.c:8945:7: error: 'for' loop initial declarations are only
allowed in C99 mode
       for (uint32_t i = 0; i < len; ++i) {

So I already locally have changes in my tree just to get it to compile, because
the compiler on ubuntu 14.04 won't let it declare variables in a for loop by
default with the set of flags I'm feeding it, and although c99 allows variable
declarations anywhere the code style has 'em at the top of blocks so rather than
fix the build break with a command line flag I fixed it by changing the code to
adhere to the toybox coding style.

Similarly, toybox uses xmalloc() and xrealloc() and xstrdup() and such because
if malloc() and friends every return NULL it means you've exhausted _virtual_
address space. Physical memory exhaustion manifests as either the OOM killer
getting called asynchronously and killing your process, or the system swap
thrashing itself to death and freezing hard. (There are things like strict
overcommit and nommu that can lead to allocation failures long before the system
is anywhere near actually out of memory, but neither is the common case and
nommu fragmentation case isn't particularly recoverable.) So toybox's policy is
to use an xmalloc() wrapper that exits with an error message if malloc fails,
because it's an unrecoverable situation. Your code instead does a lot of:

#define BC_PROGRAM_BUF_SIZE (1024)

  buffer = malloc(BC_PROGRAM_BUF_SIZE + 1);

  if (!buffer) return BC_STATUS_MALLOC_FAIL;

Which isn't how toybox does stuff anywhere else.

It's possible some allocations should be considered recoverable, perhaps
allocating a vector of an unreasonably large size should fail instead of dying.
But _malloc_ isn't going to reliably tell you there's a problem. Unless you're
allocating multiple gigabytes on 32 bit (and basically _never_ on 64 bit),
malloc isn't what will fail. Running bc on a system with 512 megs of ram after
allocating 1.5 gigs of vector will demonstrate a problem when you dirty the
pages and it runs out of freeable pages to fault in, long after malloc's
returned. In order to tell that an allocation will cause a problem later, you
basically have to query free memory yourself. (Or enable strict overcommit,
which will basically false positive the system to death. Android's approach
seems to be a cross between the two, killing apps from their own userspace oom
manager to prevent the system from coming _near_ memory exhaustion so things
like video recording and audio playback never drop frames. They may suddenly
_stop_, but they won't have subtle dropouts. But that's from an ELC presentation
a few years ago and may have changed in newer versions...)

Also, toybox has a page sized buffer (char toybuf[4096];) it can use as scratch
space for short-lived allocations. This never uses that, and instead has short
lived allocations.

And some things it does are just gratuitously "this doesn't look/work like any
other toybox command", ala:

#define bcg (TT)

in lib/lib.h toybox has:

#define minof(a, b) ({typeof(a) aa = (a); typeof(b) bb = (b); aa<bb ? aa : bb;})
#define maxof(a, b) ({typeof(a) aa = (a); typeof(b) bb = (b); aa>bb ? aa : bb;})

And your bc has:

#define BC_MAX(a, b) ((a) > (b) ? (a) : (b))
#define BC_MIN(a, b) ((a) < (b) ? (a) : (b))

(Which doesn't protect against multiple evaluation of the macro argument like
the first does.)

You have your own hand-rolled flag macros:

#define BC_FLAG_WARN (1<<0)
#define BC_FLAG_STANDARD (1<<1)
#define BC_FLAG_QUIET (1<<2)
#define BC_FLAG_MATHLIB (1<<3)
#define BC_FLAG_INTERACTIVE (1<<4)
#define BC_FLAG_CODE (1<<5)

When toybox is generating FLAG_w and FLAG_s and so on already.

You have a fairly standard "success is zero" semantic but use a
BC_STATUS_SUCCESS enum for the zero. This means callers could use if (!thingy())
but it's not obvious that they can because you hide it, and make it seem like
this semantic could change in future?

You have a lot of things only used once or twice, like bc_func_insertAuto() that
might as well just be inlined.

This is just the stuff obvious at a quick glance. I haven't done proper code
review yet. The whole destructor thing looks like it could be redesigned out,
but that's a can of worms. There's snippets like:

      case BC_INST_PUSH_OBASE:
      {
        result.type = BC_RESULT_OBASE;

Where I go "why are those two different values? Could the code be simplified if
they were the same value?" But again, haven't traced down that path yet. I need
to read and follow the logic first, and at 9000 lines that's an undertaking.

> If you would still like to change my submission, please work with me to do so. I
> am entirely willing to do so (I have already written the bc in toybox style and
> will continue to improve it if possible). If you do, I can maintain into the
> future because I already have a script to cut releases of bc for toybox. If you
> do not work with me, and change it yourself, it would very difficult for me to
> maintain it for you.
> 
> I look forward to working with you.

So instead of applying a series of cleanup patches, you want me to send them to
you and then pull them back from you?

*shrug* I can do that.

But what I'd really like is regression tests...

> Gavin Howard

Rob



More information about the Toybox mailing list