[Toybox] Release 0.8.10

Rob Landley rob at landley.net
Tue Aug 1 21:04:41 PDT 2023


On 8/1/23 19:56, enh wrote:
> On Mon, Jul 31, 2023 at 6:52 PM Rob Landley <rob at landley.net> wrote:
>>
>> On 7/31/23 09:31, enh wrote:
>> > seems like this release is in a pretty bad state?
>>
>> The tests passed locally! All of 'em! With glibc and musl! Sigh...
>>
>> > github CI is failing for both
>> > linux and macOS... linux seems to have some tar failures
>>
>> Yeah, it's those darn sparse failures again. On ext4 writing a sparse file
>> behaves deterministically-ish, but butterfly-effect-fs not so much.
> 
> yeah, that's one thing that's really weird --- sometimes the tests
> pass in github's CI anyway.

Microsoft Github.

>> Admittedly it's only user visible if you _ask_ for it, and I'm kind of tempted
>> to teach tar that "--sparse means any sufficient run of zeroes becomes #*%(#&
>> sparse whether or not the filesystem knows about it". Where "sufficient" would
>> logically be 512 byte aligned 512 byte blocks, because that's how tar thinks.
>> (It's a space savings. I don't THINK there's a maximum number of sparse extents?
>> I've even got a realloc every 512 entries in the existing loop! And a note to
>> figure out how to test that properly. Don't ask me what gnu/dammit does with a
>> multiblock sparse table, it _probably_ works?)
>>
>> *shrug* If nothing else it would eliminate the filesystem dependency...

Sigh, the gnu/dammit tar has --hole-detection=seek/raw and of course the man
page does not explain what they DO, but I'm assuming "raw" makes it sparse
whenever the data is all zeroes and seek detects the existing sparseness?
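
If "seek" is just walking lseek(SEEK_DATA)/lseek(SEEK_HOLE) over the file, the
core of it would presumably be something like this (untested sketch, not what's
in tar.c, and it only finds holes the filesystem admits to having):

  #define _GNU_SOURCE
  #include <stdio.h>
  #include <unistd.h>

  // Print the data extents the filesystem reports, which is presumably
  // what --hole-detection=seek boils down to.
  static void walk_extents(int fd, off_t len)
  {
    off_t data = 0, hole;

    for (;;) {
      data = lseek(fd, data, SEEK_DATA); // next data at/after offset
      if (data < 0) break;               // ENXIO: hole runs to EOF
      hole = lseek(fd, data, SEEK_HOLE); // EOF counts as a hole, so
      if (hole < 0) hole = len;          // this shouldn't fail, but...
      printf("data %lld..%lld\n", (long long)data, (long long)hole);
      data = hole;
    }
  }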

I'm leaning towards just having --sparse make it be sparse whenever it can be,
especially if filesystems you extract it into DON'T RETAIN THE INFO.

I don't THINK being more aggressive about sparsifying files when given --sparse
should break things? (Modulo loopback mounting filesystem images or swapon
files?)
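
The detection side of "be sparse whenever it can be" is cheap enough, something
along the lines of (sketch at 512 byte granularity because that's the unit the
tar sparse table thinks in, not the loop that's actually in tar.c):

  #include <string.h>

  // Count aligned 512-byte blocks that are entirely zeroes, i.e.
  // candidates for the sparse extent table whether or not the
  // filesystem stored them as holes. Assumes len is a multiple of 512
  // (tar pads the last block anyway).
  static long count_zero_blocks(char *buf, long len)
  {
    long ii, count = 0;
    char zero[512] = {0};

    for (ii = 0; ii+512 <= len; ii += 512)
      if (!memcmp(buf+ii, zero, 512)) count++;

    return count;
  }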

The problem here is:

1) The toybox code is currently doing this right.

2) The build/run environment doesn't allow it to work right.

Is the point of the test to find environment problems, or to find toybox
regressions? Should the tests have an --aggressive flag of some kind? I'm
already planning a "run as root under a special 'mkroot.sh tests' image that has
known stuff in places"...

This is a design issue. There isn't a right answer, it's a question of what we
want to test.

>> > FAIL: tar sparse without overflow
>> > echo -ne '' | tar c --owner root --group sys --mtime @1234567890 --sparse fweep
>> > | SUM 3
>> > --- expected 2023-07-29 01:27:20.471064281 +0000
>> > +++ actual 2023-07-29 01:27:20.475064343 +0000
>> > @@ -1 +1 @@
>> > -50dc56c3c7eed163f0f37c0cfc2562852a612ad0
>> > +4b0cf135987e8330d8a62433471faddccfabac75
>>
>> In order for this to be happening the sparse test I added at the start has to
>> pass, but then the larger saving-of-sparseness does not match the file we just
>> created on the previous line.
>>
>> I.E. the microsoft github behavior has to be INCONSISTENT within the same run to
>> trigger this. Wheee...

Although I may be premature in blaming btrfs, because Microsoft Github is probably
migrating infrastructure over to Windows the way they did with hotmail (it's
like Sun migrating Looking Glass to the Solaris kernel, or an alcoholic taking a
drink, they can't _NOT_ do it even knowing the consequences), so this "ubuntu"
container could actually be Windows Subsystem for Linux or using a samba mount
as its filesystem or some such. (I can't ssh into it to poke around, so...)

But I'm still not convinced btrfs is ready for primetime after the whole
"getdents() is never guaranteed to terminate" thing. (How is that NOT a denial
of service attack waiting to happen?)

Sigh, I want the commands to be portable but there's only so much I can _test_
with "same syscall returns different results". (And this isn't even the
TEST_HOST=1 version skew can of worms...)
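
About the best the test plumbing could do is probe the environment up front and
skip the sparse tests when the filesystem won't cooperate. A rough sketch (a
hypothetical standalone probe, not anything in scripts/test.sh, and st_blocks is
only a heuristic):

  #include <fcntl.h>
  #include <sys/stat.h>
  #include <unistd.h>

  // Write one byte at the end of a 1 megabyte hole, then check whether
  // st_blocks says the filesystem actually kept the hole. Exit 0 if it
  // did, 1 if it filled the hole in, so a wrapper could skip the sparse
  // tests on nonzero exit.
  int main(void)
  {
    struct stat st;
    int fd = open("sparse.probe", O_CREAT|O_TRUNC|O_RDWR, 0600);

    if (fd < 0 || pwrite(fd, "x", 1, (1<<20)-1) != 1 || fstat(fd, &st))
      return 1;
    close(fd);
    unlink("sparse.probe");

    return !(st.st_blocks*512 < st.st_size/2);
  }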

>> Which works on both glibc and musl, with ASAN on the glibc build and when I
>> enable ASAN on the musl build the cross compiler goes "x86_64-linux-musl-cc:
>> fatal error: cannot read spec file 'libsanitizer.spec': No such file or
>> directory" so that's nice...
>>
>> > linux also dies in the sed timeout test; that seems to be a pathological case
>> > for asan because increasing the timeout to 60s also didn't pass. (though
>> > weirdly, that test is fine -- finishing almost instantly, just like non-asan
>> > -- on macOS.)
>>
>> Didn't see it on debian's gcc+glibc ASAN, but most likely that has fewer checks.
> 
> (to be fair, i actually have no idea of the state of the gcc asan; but
> all the people _i_ know who work on asan-type stuff for a living work
> on the llvm one.)

I've more or less integrated testing with ASAN into my workflow now, but "not
ASAN enough" is likely to take a little longer...

>> > not sure whether that's a bsd/glibc difference or a linux-only asan
>> > bug. the latter seems less likely, but i'll mention it to the asan folks anyway...)
>>
>> I remind you of:
>>
>> commit c0dca293c1301a6315684703706597db07a8dbe1
>> Author: Rob Landley <rob at landley.net>
>> Date:   Sat Jun 27 03:14:49 2020 -0500
>>
>>     The bionic/clang asan plumbing slows the test down >10x, so expand timeout.
>>
>> That test is ping-ponging between a bunch of different segments (the source
>> buffer, the destination buffer, the parsed regex struct, and the stack, global
>> variables, the toybox text segment, and glibc's library text segment) and it's
>> entirely possible whatever virtual TLB setup ASAN does to catch weirdness is
>> getting thrashed. Worse now than when the 20 second timeout was enough...
> 
> /me wonders if the reason i think this is fine "on macOS" is because i
> actually mean "on an M1 because it has truly insane memory bandwidth
> [at the cost of non-upgradeable memory, of course]".

Or it has a bigger TLB or a different cache eviction strategy? Something ASAN is
doing is making memory access pathological, but it's probably just triggering
it. If Microsoft Github _is_ using WSL or WSL2 under those ubuntu images, then
the windows kernel is just about guaranteed to be doing something really stupid:
it's windows. (And that whole "Azure" cloud nonsense is running its vm with
Windows under the covers at the best of times.)

All I know is my 10 year old laptop _without_ ASAN takes 1/4 of a second to run
the test. It's got a Core i5 from 2013 and memory from a store called "Discount
Electronics". I suspect my Pixel 3a is slightly faster than this laptop.

> i'd report the results of running the asan tests on an x86-64 mac here
> ... except that it crashes immediately the first time it starts
> toybox, somewhere deep in libclang_rt.asan_osx_dynamic.dylib (their
> fault, not yours).

I plead the third.

> but, yeah, my M1 mac is passing everything quickly right now.
> 
>> Meanwhile, without ASAN wrapping date +%s.%N around the test says it takes a
>> quarter of a second on my 10 year old laptop:
>>
>> 1690850299.108588387
>> PASS: sed megabyte s/x/y/g (20 sec timeout)
>> 1690850299.342395058
>>
>> A reasonable chunk of which is the shell test plumbing. (Just two consecutive
>> "date +%s.%N; date+%s.%N" calls from the shell are .007 seconds apart on this
>> machine, nontrivial chunk of that 250 milliseconds. I think the original 10
>> second timeout was to make it reliably pass on my 66mhz Turtle board.)
>>
>> I can't think of a fix here other than disabling the test...
> 
> yeah, or skipping if $ASAN is set? :-(
> 
> for now, though, Android's CI doesn't care as long as *hwasan* is fast
> enough, and a quick test on an aosp_cheetah_hwasan-userdebug device
> says ... "can't create /expected: Read-only file system". oh. hmm.
> looks like https://github.com/landley/toybox/commit/03e1cc1e45b67ad65e5ad0ae47b7a54e68d929d5
> broke things. not sure why $TESTDIR isn't set for me? oh, because
> that's set by scripts/test.sh which we don't use --- we call
> scripts/runtest.sh directly.

Sorry, I should have emailed you specifically about that one...

> too late for that to be today's problem though... i'll look further tomorrow!
> 
> ah, fuck it, i'll only spend the evening wondering...
> 
> yes, with the obvious line added to run-tests-on-android.sh, all the
> tests pass on my hwasan build (and the sed test only takes a couple of
> seconds). (for reference, my linux/x86-64 hardware that timed out was
> a work amd threadripper box, not my personal 10 year old laptop!)

Hmmm... The test went in because a change went in because a build script was
very slow. If somebody does a build with an ASAN toybox, the slow comes back.

We _can_ remove the test, but I don't know if that's the right call? The test is
sort of doing its job? It didn't exactly find an issue with toybox, but it found
an issue that _hits_ toybox...

I'm open to suggestions.

Rob

