[Toybox] [PATCH 1/2] sh: fix memory managment (e.g. for HERE documents)

Rob Landley rob at landley.net
Sun Nov 6 07:12:42 PST 2022


On 11/6/22 03:13, Alexander Holler wrote:
> The memory allocation of arg_add() should be in sync with what is used e.g.
> in parse_line().

No, it allocates in larger chunks to reduce the number of calls to realloc() and
potential calls to memmove(), mmap(), and cache polluting heap list traversals.

> The allocation In parse_line() just reserved on arg, but
> arg_add() assumed 32 args have been reserved. The result was a memory
> corruption.

Test case please?

That will realloc() every call instead of every 32 calls. I try to avoid that
for the same reason I avoid write(1) byte-at-a-time transfers.

The first call should be at arg->c = 0 so it should allocate 32+1 entries (to
simplify fencepost checks of last entry). Then when we try to add entry 32 it
should extend it to 64+1. The size of the expansion is irrelevant-ish (the
larger it is the fewer times it's called, the smaller the less memory we're
potentially wasting at the end, 32*8=256 bytes which seemed a reasonable step
size while big enough 95% of command lines should get away with just one
allocation.)

Where did you see a memory corruption? The code in parse_line should never
increment arg->c directly. There's a couple places it negates arg->c for
signaling purposes (means we're returning to request another line of input which
will be appended to the current transaction because of unterminated quotes or
trailing && or some such), and this bit's slightly magic:

    // Extend/resume quoted block
    if (arg->c<0) {
      delete = start = xmprintf("%s%s", arg->v[arg->c = (-arg->c)-1], start);
      free(arg->v[arg->c]);
      arg->v[arg->c] = 0;

Which undoes this:

      // Save unparsed bit of this line, we'll need to re-parse it.
      arg_add(arg, xstrndup(start, strlen(start)));
      arg->c = -arg->c;
      free(delete);

The second saves the unparsed portion of the line, the first pops the unparsed
bit off and glues the new line to the end of it in one go. (The line that's
passed _in_ to us can be lost because it's the caller's job to free it.)

Rob


More information about the Toybox mailing list