[Toybox] [PATCH] xargs: avoid "Argument list too long".

enh enh at google.com
Tue Oct 22 13:31:03 PDT 2019


okay, i've also sent out the alternative patch (search your inbox for
"proper") if you want to compare. i don't personally have a strong
opinion between the two. macOS seems to inherit BSD's "proper"
implementation, and findutils seems to just assume the 128KiB limit.

a few new things i learned though:

* the 128KiB limit is at least partly motivated by being the
kernel's minimum limit; even if you shrink the stack rlimit so that
you'd otherwise get a smaller limit, you still get 128KiB (strictly,
32 pages, but all my hardware has 4KiB pages, so...).

* the findutils implementation is a bit cleverer than i realized. if
you dick about with strace and `xargs -s`, you'll see that if you run
it over the kernel limit it catches the E2BIG and retries with fewer
arguments!

* the kernel also has separate limits on the maximum length of any
individual string, and on the count of strings (separate from the
space taken by those strings, or their pointers). but i'm assuming
YAGNI. we can add those limits if anyone ever manages to hit them in
real life. (i don't think there's anything we can do in the former
case anyway, so the default E2BIG error message isn't terrible.)
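
for illustration, the catch-E2BIG-and-retry behavior from the second
bullet could be sketched roughly as below. this is not findutils'
code: the names (`run_fn`, `run_with_retry`, `fake_run`) are
hypothetical, halving the batch is just an assumed way of picking
"fewer arguments", and `fake_run` is a stand-in for fork()+execvp()
with a made-up 16-byte limit so the sketch runs without spawning
anything.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: runs the command with the first n args.
 * Returns 0 on success, or -1 with errno set (as execve() would). */
typedef int (*run_fn)(char **args, size_t n);

/* Try to run as many of the n args as possible in one command.
 * On E2BIG, retry with half as many (an assumed policy).
 * Returns how many args were consumed, or 0 on failure. */
size_t run_with_retry(char **args, size_t n, run_fn run)
{
  while (n) {
    if (!run(args, n)) return n;
    if (errno != E2BIG) return 0;  /* some other error: give up */
    n /= 2;                        /* a single oversized arg ends the loop */
  }
  return 0;
}

/* Stand-in for fork()+execvp(): pretends the kernel accepts at most
 * 16 bytes of strings (counting each NUL terminator). */
int fake_run(char **args, size_t n)
{
  size_t bytes = 0, i;

  for (i = 0; i < n; i++) bytes += strlen(args[i]) + 1;
  if (bytes > 16) {
    errno = E2BIG;
    return -1;
  }
  return 0;
}
```

the caller would then advance past however many args were consumed and
call `run_with_retry` again for the rest. a return of 0 covers the
case from the third bullet where one string is too long on its own, so
no amount of retrying helps.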

i do wonder who uses -s in practice, and whether they really should be
doing so. -n, sure, that totally makes sense. but -s seems like a
mistake to me.

(note that unlike the 128KiB hack, i haven't tested this patch in the
actual mac SDK or kernel build scenarios that have been failing. let
me know if you'd rather go this way, and i'll explicitly test those
cases too. but this patch doesn't break existing tests and does fix my
/usr proxy test.)
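
for comparison, the accounting described in the quoted commit message
below (an explicit -s counts only the strings, while the system limit
also counts the (char *)s) might look something like this. it's a
sketch under assumptions, not the code from either patch: `struct
budget` and `budget_add` are hypothetical names, and the space taken by
the environment and by the command itself is ignored.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical running totals for the command line being built. */
struct budget {
  size_t str_bytes;  /* bytes of strings, including NUL terminators */
  size_t count;      /* number of strings accepted so far */
  size_t s_limit;    /* explicit -s value, strings only (0 = not given) */
  size_t sys_limit;  /* system limit, strings plus pointers */
};

/* Returns 1 and updates the totals if arg fits under both limits,
 * or 0 if the current batch should be run before adding arg. */
int budget_add(struct budget *b, const char *arg)
{
  size_t bytes = b->str_bytes + strlen(arg) + 1;
  /* +2: one pointer for this arg, one for argv's NULL terminator */
  size_t ptrs = (b->count + 2) * sizeof(char *);

  if (b->s_limit && bytes > b->s_limit) return 0;
  if (bytes + ptrs > b->sys_limit) return 0;
  b->str_bytes = bytes;
  b->count++;
  return 1;
}
```

keeping the two limits separate means a user's `-s 10` is compared
against string bytes alone, while the pointer overhead can still only
ever make a batch smaller than the system limit, never larger.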

On Mon, Oct 21, 2019 at 3:59 PM enh <enh at google.com> wrote:
>
> ping. this is the missing one. let me know if you'd rather fix this
> the "right" way, even though findutils really is this stupid. i don't
> have a strong preference between "obviously right, but apparently
> unnecessarily complicated" versus "obviously terrible, but easy and
> no-one seems to care in practice" :-)
>
> On Sun, Oct 20, 2019 at 10:39 PM enh <enh at google.com> wrote:
> >
> > It turns out that findutils xargs almost always uses an artificially low
> > limit of 128KiB. You can observe this with --show-limits (which I
> > refrained from adding to toybox since I believe it's only useful if
> > you're debugging a broken xargs).
> >
> > I think an alternative fix for all this would be to go back to counting
> > the cost of the (char *)s, but separate the -s limit from the "system"
> > limit --- that way we could have the same behavior as findutils xargs
> > for explicit values of -s (which we all seem to agree should *not*
> > include the cost of the (char *)s), but also not accidentally overrun
> > the actual system limits when we do count the (char *)s. That's more
> > complicated though, and findutils' "128KiB is enough for anyone"
> > behavior is demonstrably "good enough", so let's go with that for now.
> >
> > Tested by building an Android common kernel with toybox xargs, which
> > failed before.
> >
> > Also tested with `find /usr | xargs > /dev/null`, which fails with
> > toybox xargs even on my laptop.
> >
> >   ~$ find /usr | wc
> >    262823  262835 15132761
> >
> > (On my desktop, even `find /proc` is sufficient to hit this!)
> >
> > Bug: http://b/140269206
> > ---
> >  toys/posix/xargs.c | 8 +++++++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)

