[Toybox] Test suite gripe du jour.

enh enh at google.com
Mon Aug 7 09:06:56 PDT 2023


On Sat, Aug 5, 2023 at 2:50 PM Rob Landley <rob at landley.net> wrote:
>
> So the failing MacOS test was:
>
> FAIL: tail -F
> echo -ne '' | tail -s .1 -F walrus 2>/dev/null & sleep .2; echo hello > walrus;
> sleep .2; truncate -s 0 walrus; sleep .2; echo potato >> walrus; sleep .2;
> echo hello >> walrus; sleep .2; rm walrus; sleep .2; echo done > walrus;
>   sleep .5; kill %1
> --- expected    2023-08-01 04:09:53.000000000 +0000
> +++ actual      2023-08-01 04:09:55.000000000 +0000
> @@ -1,4 +1,3 @@
> -hello
>  potato
>  hello
>  done
>
> Which means we ran this background process:
>
>   tail -s .1 -F walrus
>
> And then ran:
>
>   sleep .2; echo hello > walrus; sleep .2; truncate -s0 walrus
>
> Meaning during the 2/10 of a second sleep between the echo and the truncate,
> tail's 1/10 of a second of sleep did not finish and resume running.
>
> Tenth of a second sleeps should be an ENORMOUS amount of time for modern
> hardware, where "modern" includes the first generation of raspberry pi going for
> $35 in 2012 with the filesystem on an sd card. I need to keep the sleeps short
> because a lot of tests use them and they add up.
>
> Unfortunately, if you run such tests on hardware that's outright thrashing its
> resources, introducing funky latency spikes, then in theory even a 2 second sleep
> isn't necessarily long enough. (Thunderbird has gone to lunch for 8 seconds at a
> time even with an SSD. When that and chrome fight to see which can bloat larger,
> it gets ugly.)
>
> Sigh. Maybe on MacOS I should run failing tests a second time to see if they fail
> again? That does not seem right. I could also have every sleep be a full second
> on MacOS, but I'm not convinced that's long enough.
>
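One alternative to lengthening every sleep is to poll for the expected state with an upper bound, so a fast machine moves on immediately and a thrashing one gets more time. This is only a sketch: the helper name wait_until and its arguments are made up for illustration and are not part of toybox's test suite, and fractional sleep values are assumed to work, as they do in the test quoted above.

```shell
# Hypothetical helper, not part of toybox's test suite.
# usage: wait_until TRIES DELAY COMMAND [ARGS...]
# Runs COMMAND up to TRIES times, sleeping DELAY seconds after each failure.
# Returns 0 as soon as COMMAND succeeds, 1 if it never does.
wait_until() {
  _tries=$1 _delay=$2
  shift 2
  while [ "$_tries" -gt 0 ]; do
    "$@" && return 0
    sleep "$_delay"
    _tries=$((_tries - 1))
  done
  return 1
}
```

With tail's output captured in a file (the name "out" is an assumption), the first step of the test could wait until tail has actually reported the line before truncating, instead of hoping .2 seconds was enough:

  tail -s .1 -F walrus > out 2>/dev/null &
  echo hello > walrus; wait_until 50 .1 grep -q hello out; truncate -s 0 walrus

This bounds the wait at roughly 5 seconds in the worst case, and the test still fails if tail never sees the write.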
> Rob
>
> P.S. I didn't respond to Elliott's last email about testing because I didn't
> know what to say. "I want absolutely everything because I have a dedicated staff
> to weed out false positives" is not my use case.

unless you mean "you and me", there's no "staff" here :-)

> I want to know if I broke
> toybox. In a lot of the github test failures, _toybox_ isn't what broke. I'm
> aware that Posix and the Linux Test Project and so on aren't ideal, but I can't
> do their jobs and mine.

that wasn't my point --- my point was "you'll be doing that
regardless". you would (i assume) never include btrfs in your qemu
setup, but you'll still get bug reports from folks using it. and
you'll never be certain that your testing is thorough enough that you
can just dismiss bug reports as "can't be toybox; must be your
kernel/fs/whatever", so you probably shouldn't put too much effort
into qemu _in the hope of being free_ of them. it's useful _anyway_,
in the same way a "works on my machine" datapoint is always useful.

> Way back when, toybox commit d6f8c41e2542 shrank Divya's
> initial chmod.tests submission way down because the initial submission was
> mostly testing the syscalls, not toybox. Which is nice but not what _this_ test
> suite is trying to accomplish.

sure, but you'll never really escape that. i have an "ndk" bug right
now where (apparently) readlinkat() sometimes returns a bad result, on
some devices. but not reproducibly enough. without getting to the
bottom of that (and proving "bad kernel" or "bad security layer" or
"bad vendor hack to libc" or whatever), it'll stay on the books as a
possible bionic bug. (because it _could_ be, even if it's really hard
to imagine how.)

> I need to get the linux from scratch build
> reproduced under mkroot because that was my big real world dataset. Ideally I'd
> then build either debootstrap or alpine's package repository under the result.
> (Red Hat's gone full vogon, SuSE's business model is offering a second source to
> Red Hat's customers, and Gentoo turned out to be nuts under the surface where
> every ebuild file in the portage tree has a list of every architecture it's
> allowed to build on so you can NEVER just build "for this architecture" and
> adding a new architecture requires touching every file in the tree, and don't
> get me started on the insane ebuild #include stack...) And then I'd love to get
> AOSP working under that result because that's Elliott's big real world dataset.

eh, like i've said before --- AOSP is the one place you can be sure
someone else will be testing. (though not in CTS, which does mean
there's potential for vendor breakage, including them deciding to ship
btrfs :-) )

> But I really really really need to disentangle AOSP into layers to make that
> tractable, and am not even _looking_ at that can of worms yet.

