[Toybox] [PATCH] xargs: avoid "Argument list too long".

Rob Landley rob at landley.net
Thu Oct 24 17:29:51 PDT 2019


On 10/22/19 3:31 PM, enh via Toybox wrote:
> okay, i've also sent out the alternative patch (search your inbox for
> "proper")

I'm looking; I'm trying to remember what failure it was trying to fix. (When I
have a test, coming up with a fix for it is generally easy. When I have a fix
without a test I have to try to figure out what _might_ be going wrong, which
takes forever and is error prone.)

Ah, there's a test in the patch. Ok, what's going on here...

We're currently doing:

  // POSIX requires that we never hit the ARG_MAX limit, even if we try to
  // with -s. POSIX also says we have to reserve 2048 bytes "to guarantee
  // that the invoked utility has room to modify its environment variables
  // and command line arguments and still be able to invoke another utility",
  // though obviously that's not really something you can guarantee.
  bytes = sysconf(_SC_ARG_MAX) - environ_bytes() - 2048;
  if (!TT.s || TT.s > bytes) TT.s = bytes;

That sysconf is, in theory, 128k. (We had that argument on the kernel list with
Linus; I think I yanked Rich Felker into it too. It _should_ have an #if/else
test on kernel version and currently say 10 megs, but I'd rather not do that in
toybox.) Meaning if you don't specify a -s, you get 128k, minus however much
environment space you've currently used, minus 2048. (Which could easily make it
negative? I have no idea why we're doing that.)
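
Quick throwaway to see what that works out to on a given box (the
environ_bytes() here is my own rough stand-in for the toybox helper, counting
string plus NUL plus a pointer per entry, not the real one); among other things
it shows how close to zero the subtraction actually gets:

  #include <stdio.h>
  #include <string.h>
  #include <unistd.h>

  extern char **environ;

  // Rough stand-in for toybox's environ_bytes(): string bytes + NUL + one
  // char * slot per entry, plus the trailing NULL pointer in envp[].
  static long environ_bytes(void)
  {
    char **ss;
    long bytes = sizeof(char *);

    for (ss = environ; *ss; ss++) bytes += sizeof(char *)+strlen(*ss)+1;

    return bytes;
  }

  int main(void)
  {
    long max = sysconf(_SC_ARG_MAX), env = environ_bytes();

    printf("ARG_MAX=%ld environ=%ld default -s=%ld\n", max, env, max-env-2048);

    return 0;
  }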

The actual limit on current kernels is 10 megabytes. The limit on kernels before
that (going back to 2007) was "unlimited", so 10 megabytes works for them too.
The pathological case is single-character arguments, each of which takes 10
bytes on a 64-bit system (char+null+sizeof(pointer)), so 1 million of those
would fill your 10 megs, meaning you'd want your default -s value to be at most
1 megabyte if you've removed the sizeof(pointer) accounting.
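
Spelling that arithmetic out (illustration only, not toybox code):

  #include <stdio.h>

  int main(void)
  {
    // Worst case on a 64-bit kernel: each single-character argument costs
    // one byte of string, one NUL terminator, and one char * slot in argv[].
    long per_arg = 1+1+sizeof(char *);  // 10 bytes on LP64
    long budget = 10L<<20;              // current kernel cap, 10 megabytes

    // About a million such arguments fit; counting only the characters
    // themselves (no pointer accounting) that's roughly a megabyte of -s.
    printf("max single-char args = %ld\n", budget/per_arg);

    return 0;
  }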

If sysconf(_SC_ARG_MAX) is returning 128k, that's comfortably under 1 megabyte
and all modern kernels should be able to handle it?

(Goes to check what the sysconf is returning...)

And glibc raised it to 2 megabytes. Of course they did. But that still shouldn't
cause a failure on find /usr unless the average filename length returned is
tiny? (An average filename length of 9 would roughly double the footprint to 4
megabytes once you add the pointers, and we'd still have 6 left?)

Aha! The kernel changed behavior again. (4 separate sets of semantics, why not?)
They're enforcing exec size at 1/4 of ulimit -s, so the amount glibc is
returning is probed and accurate... modulo the 10 meg limit on top of that. :P

So we do need to account for all the bytes.

Ok, new rule: if FLAG(s) isn't set, we accurately count the bytes. And when it
is, it's #%*(#&% pilot error. (And I'll let the silly posix -2048 cover the
sizeof(char *) for the trailing NULL on argv[], because adding it to the
initialization would drive the for loop over 80 chars.)
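
Something in the spirit of this sketch of the rule (not the actual patch):

  #include <stdio.h>
  #include <string.h>

  // What handing argv[] to exec costs the kernel: string bytes + NUL + a
  // char * slot per argument, plus the trailing NULL pointer.
  static long argv_bytes(char **argv)
  {
    long bytes = sizeof(char *);

    while (*argv) bytes += strlen(*argv++)+1+sizeof(char *);

    return bytes;
  }

  int main(int argc, char **argv)
  {
    printf("%d args, %ld bytes\n", argc, argv_bytes(argv));

    return 0;
  }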

> if you want to compare. i don't personally have a strong
> opinion between the two. macOS seems to inherit BSD's "proper"
> implementation, and findutils seems to just assume the 128KiB limit.

I don't want to check in a test with external dependencies like that, which
basically eliminate reproducibility. (I don't even use toybox source files for
this because of version skew.)

If you want an arbitrarily long run of characters here's a trick:

  head -c 100000 /dev/zero | tr '\0' x

> a few new things i learned though:
> 
> * the 128KiB limit is actually motivated somewhat by being the
> kernel's minimum limit; even if you set the stack so that you'd get a
> smaller limit, you still get 128KiB (strictly, 32 pages, but all my
> hardware has 4KiB pages, so...).

The behavior before 2.6.22 was hardwired 128k, lowering it would have been a
compatibility break.

> * the findutils implementation is a bit cleverer than i realized. if
> you dick about with strace and `xargs -s`, you'll see that if you run
> it over the kernel limit it catches the E2BIG and retries with fewer
> arguments!

I explicitly asked Linus if there was a better way to do it than that, and he
said no. Dealing with this area gives me a headache now.
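
For reference, the shape of that recovery is roughly this (my sketch of the
idea, not the findutils source): exec the batch, and if the kernel answers
E2BIG, come back with fewer arguments.

  #include <errno.h>
  #include <sys/wait.h>
  #include <unistd.h>

  // Run the first "count" entries of argv as one command. Returns 1 if exec
  // was rejected with E2BIG (so the caller should retry with fewer
  // arguments), 0 otherwise. A real version would distinguish other
  // failures too.
  static int run_batch(char **argv, int count)
  {
    int status;
    pid_t pid = fork();

    if (pid < 0) return 0;
    if (!pid) {
      argv[count] = 0;                    // only affects the child's copy
      execvp(argv[0], argv);
      _exit(errno == E2BIG ? 126 : 127);  // report E2BIG back to the parent
    }
    waitpid(pid, &status, 0);

    return WIFEXITED(status) && WEXITSTATUS(status) == 126;
  }

  int main(int argc, char **argv)
  {
    int count = argc-1;

    // Keep halving the batch until the kernel stops rejecting it.
    while (count > 0 && run_batch(argv+1, count)) count /= 2;

    return 0;
  }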

> * the kernel also has separate limits on the maximum length of any
> individual string,

Last I checked it was 128k.

> and on the count of strings (separate from the
> space taken by those strings, or their pointers).

Last I checked it was 10 megs.

> but i'm assuming YAGNI.

You can't query either limit except by checking your kernel version, and they've
changed before. I'm calling it libc's responsibility to feed me a sane
sysconf(_SC_ARG_MAX), which means libc should have those boundary checks after
the getrlimit().
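
I.e. something along these lines inside libc's sysconf(), sketched with the
128k floor and 10 meg ceiling from this thread as assumptions rather than
queryable constants:

  #include <stdio.h>
  #include <sys/resource.h>

  // Clamp the probed stack-rlimit/4 value to the 128k floor and 10 meg
  // ceiling discussed in this thread (neither is queryable at runtime).
  static long sane_arg_max(void)
  {
    struct rlimit rl;
    unsigned long long probed = 10*1024*1024;

    if (!getrlimit(RLIMIT_STACK, &rl) && rl.rlim_cur != RLIM_INFINITY)
      probed = rl.rlim_cur/4;
    if (probed < 128*1024) probed = 128*1024;
    if (probed > 10*1024*1024) probed = 10*1024*1024;

    return probed;
  }

  int main(void)
  {
    printf("%ld\n", sane_arg_max());

    return 0;
  }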

> we can add those limits if anyone ever manages to hit them in
> real life. (i don't think there's anything we can do in the former
> case anyway, so the default E2BIG error message isn't terrible.)

I asked the kernel guys and they confirmed it's Underpants Gnomes Step 2. They
did the smile and everything. Since then libc has updated to give the _actual_
value, requiring us to do the accounting. My response is "wheee", with the slow
dour single finger twirl.

> i do wonder who uses -s in practice, and whether they really should be
> doing so. -n, sure, that totally makes sense. but -s seems like a
> mistake to me.

You get to keep the pieces.

> (note that unlike the 128KiB hack, i haven't tested this patch in the
> actual mac SDK or kernel build scenarios that have been failing. let
> me know if you'd rather go this way, and i'll explicitly test those
> cases too. but this patch doesn't break existing tests and does fix my
> /usr proxy test.)

I'm trying to add a test suite entry now and...

$ head -c $((131071)) /dev/zero | tr '\0' x | ./xargs | wc
      1       1  131072
$ head -c $((131071)) /dev/zero | tr '\0' x | xargs | wc
xargs: argument line too long
      0       0       0

I hate the FSF and everything it stands for. Andrew Fluegelman coined the term
"Freeware" a year before the GNU manifesto shipped, and that's only because
calling it "public domain software" was going out of fashion as proprietary FUD
scaled up. (I'd tweet about it if twitter hadn't deleted my account.)

The kernel can handle a single argument of 131072 bytes minus the null
terminator, but the biggest one findutils xargs can handle is 131066 because
they're doing the math wrong. If I want it to reliably pass TEST_HOST, I need
to use the broken value. Wheee.

Let's see...

$ X=$(head -c 131066 /dev/zero | tr '\0' x)
$ for i in $(seq 1 200); do echo $X; done | xargs | wc
    200     200 26213400
$ for i in $(seq 1 200); do echo $X; done | ./toybox xargs | wc
     14     200 26213400

Sigh, what does success look like here... I guess "didn't error out" rather
than any specific number of lines?

Rob
