[Toybox] xargs

Rob Landley rob at landley.net
Fri Nov 15 17:59:04 PST 2019


On 11/15/19 3:29 PM, enh wrote:
> On Thu, Nov 14, 2019 at 8:29 PM Rob Landley <rob at landley.net> wrote:
>>
>> On 11/14/19 3:15 PM, enh via Toybox wrote:
>>> okay, finally found time to work out why the new xargs tests fail on
>>> Android. (as you can probably guess from the patch to differentiate
>>> the error messages again!)
>>
>> So the build is working and it was just the test failing?
> 
> TL;DR: yes.
> 
> this has definitely been confusing because there have been so many
> issues. the status at the beginning of the week (with a ToT toybox,
> not what was actually checked in to AOSP, which was still a few weeks
> behind) was:

I'd like to clear up the issues preventing you from moving forward. If yanking
that test does it (or at least commenting it out for a while to give you a
window), I can do that.

>> The kernel can run a command with a single argument that's 128k-1 bytes long, and
>> when _SC_ARG_MAX returns 128k xargs can't, and there's no "split the line"
>> fallback: it just can't do it. That does kind of bother me.
> 
> being more risk averse than you, i was happy with coreutils' general
> "128KiB is enough for anyone", but in the specific case of a single
> argument, i see your point :-)

I want to do the right thing. I am sometimes unclear on what the right thing is.
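
(For anyone playing along at home, here's a minimal sketch of that boundary,
assuming 4k pages since the kernel's per-string limit MAX_ARG_STRLEN is
32*PAGE_SIZE counting the NUL terminator:)

  // biggest-arg.c: can exec() pass a single 128k-1 byte argument?
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  #include <unistd.h>
  #include <sys/wait.h>

  int main(void)
  {
    long len = 32*sysconf(_SC_PAGESIZE)-1; // largest single argument
    char *big = malloc(len+1);
    int status = 0;

    memset(big, 'x', len);
    big[len] = 0;
    if (!fork()) {
      execlp("true", "true", big, (char *)0);
      _exit(127); // exec failed (E2BIG if the string was too long)
    }
    wait(&status);
    printf("%ld byte argument %s\n", len, status ? "failed" : "worked");

    return 0;
  }

An xargs that budgets exactly 128k of total space can never pass that argument
plus the command name, which is why the test flagged it.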

>> Right now, rlimit(stack)/4 with a ceiling at 6 megs is what the kernel is
>> actually doing. The 6 megs is from kernel commit da029c11e6b1 adding 3/4
>> _STK_LIM and _STK_LIM is 8*1024*1024 in include/uapi/linux/resource.h
>>
>> The 1/4 stack limit was in the original 2007 commit, Kees Cook added a
>> gratuitous "How about 6 megs! There are no embedded systems with less than 6
>> megs! I define what 'sane' is!" extra limit that made NO sense and triggered the
>> above thread with Linus (it's not a fraction of the memory the system has, not a
>> fraction of available memory, merely a number that Felt Right To Some Guy), but
>> it hasn't moved for 2 years now and older kernels wouldn't really be negatively
>> impacted by the 6 megabyte ceiling. And that covers our 7 year time horizon (12>7).
> 
> (according to the kernel source, _STK_LIM is the default stack size
> [8MiB], so 6MiB is 3/4 of the default stack size.)
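
(Sketched as code, my reading of what fs/exec.c enforces since that commit:)

  // Effective space for argv+envp strings: a quarter of the stack
  // rlimit, capped at 3/4 of _STK_LIM (8 megs), i.e. 6 megs.
  unsigned long kernel_arg_space(unsigned long stack_rlim)
  {
    unsigned long cap = 3*(8UL<<20)/4; // 6 megs
    unsigned long limit = stack_rlim/4;

    return limit < cap ? limit : cap;
  }

With the default "ulimit -s" of 8 megs that works out to 2 megs of argument
space; the 6 meg ceiling only bites once the stack rlimit passes 24 megs.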

8 megs was arbitrary. 3/4 of 8 megs is arbitrarier. Linux 0.12-ish ran in 2 megs
and in the embedded world still does. (There was a lovely video about running
Linux in 256k of SRAM via ROM XIP, including an uncompressed romfs initrd, but
alas it was a victim of the Great Linux Foundation YouTube Oops. They've
recovered the 2016 videos, and _say_ they found a backup of the 2015 videos
https://lists.celinuxforum.org/g/Celinux-dev/message/1202 but have yet to
republish them. Still: it can be done.)

The embedded space is this weird "enormous but invisible" thing. There are an
order of magnitude more nommu systems than with-mmu systems, the way a human
body has more symbiotic bacteria than human cells. But the people doing it lost
their taste for interacting with linux-kernel a decade before the rest of the
world did, so you don't hear about it much, and then you go "why did this thing
ship a 2.6.16 kernel in 2018? Sigh..." But that still means the intersection of
"wants to run current stuff" and "embedded linux" is both large and terrible at
communicating with Red Hat's Big Iron/Enterprise Linux and Ubuntu's developer
workstation niches, because ew.

>> I think the fix here is to change what Bionic returns? If they break it again,
>> we can change it again. You can't future proof against gratuitous nonsensical
>> changes coming out of the blue. (A process calling exec() can trigger the OOM
>> killer, sure. A process filling up its stack can trigger the OOM killer. Why the
>> special case here?)
> 
> my main reservation about duplicating the kernel logic was just that
> it's difficult to track changes. but it's been stable enough for long
> enough that i'm not particularly concerned.

At this point, if they break us again, we can complain: the current constraint
has 2 years of stability and 12 years of compatibility behind it.

> i've reverted bionic to where it was, and am now back in sync with ToT
> toybox. (i've added the 3/4 _STK_LIM special case too, which bionic
> didn't have before we switched to just returning ARG_MAX.)

Yay!
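
(Guessing at the shape -- this is a sketch, not bionic's actual source -- that
sysconf(_SC_ARG_MAX) path presumably now works out to something like:)

  #include <limits.h>
  #include <sys/resource.h>

  // max(min(rlimit(stack)/4, 3/4 of _STK_LIM), ARG_MAX): the kernel
  // also floors at the historical 128k even for tiny stack rlimits.
  long arg_max_guess(void)
  {
    struct rlimit rl;
    unsigned long limit, cap = 3*(8UL<<20)/4; // 6 megs

    if (getrlimit(RLIMIT_STACK, &rl)) return ARG_MAX;
    limit = rl.rlim_cur/4;
    if (limit > cap) limit = cap; // RLIM_INFINITY lands here too
    if (limit < ARG_MAX) limit = ARG_MAX; // 128k floor
    return limit;
  }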

>> That said, I can yank the test if that's easier. I don't want to adjust the test
>> because the _point_ of the test was to see if we can handle The Largest Possible
>> Argument, and when we can't it did its job and found an issue. If we want to
>> accept not handling the largest possible single argument the kernel can take,
>> the answer is to yank the test, not test something smaller that isn't a real
>> boundary condition.
> 
> like i said, for the general limit i like being conservative -- i care
> about not breaking random builds that we haven't tried yet more than i
> care about squeezing out every last byte -- but for this specific
> single-argument limit i get your point.
> 
> (i'll carefully not mention that this same test [but not the other new
> limit test --- that's fine] doesn't pass on darwin. all the toybox
> tests are so far from passing on darwin that it's not even worth
> talking about. the host tools are so terrible that we can't rely on
> them while testing a toy --- we'd really need to run in an all-toybox
> environment.)

I'm working on getting you one. I got dragged into all-$DAYJOB-all-the-time
(even weekends) mode for a month, but I hope to come up for air the week of the
25th. (That said, I really want to get the next round of toysh checked in. But
it sounds like I should redo expr.c because of the interest, and that's half of
$(( )) and for ((;;)) and if ((a+7<3)) shell logic anyway...)

Is there a "make airlock" style list of what darwin would need that's not in
defconfig yet? (Or needs fixing on darwin?)

Rob


