[Toybox] [PATCH] Fix truncate.test for macOS.

Mon Jul 4 03:21:21 PDT 2022

On 6/27/22 14:40, enh wrote:
> On Sat, Jun 25, 2022 at 3:45 PM Rob Landley <rob at landley.net> wrote:
> 
>     On 6/24/22 19:35, enh wrote:
>     > On Thu, Jun 23, 2022 at 11:48 PM Rob Landley <rob at landley.net> wrote:
>     >
>     >     On 6/22/22 20:02, enh wrote:
>     >     > On Wed, Jun 22, 2022 at 1:52 PM Rob Landley <rob at landley.net> wrote:
>     >
>     >     The problem with the mac tar test is even though it's easy enough to
>     >     find what /etc/passwd calls UID 0:
>     >
>     >     ROOT="$(sed -n '/[^:]*:[^:]*:0:/s/:.*//p' /etc/passwd)"
>     >
>     >     That doesn't change the fact it'll be putting a different string into the
>     >     tarball, with different sha1sums. Um. (I was using "root" as the one known
>     >     constant account that didn't vary across distros. Possibly I need a way
>     >     to tell it to use an alternate /etc/passwd file to look up usernames.
>     >     This is why I've been poking at mkroot, but making that work on a mac is
>     >     just... ow.)
> 
>     FYI, I committed your patch shortly after sending that message.
> 
> thanks. interestingly, i realized that i think we also wouldn't get a red cross
> in the github ui if we broke the _linux_ tests?

Hmmm... looks like I broke VERBOSE=all's exit code when I moved the tests into a
subshell (commit e00b4c26553b) and de474ba03950 wasn't a full enough fix...

Try now?

> it's only a build failure that
> counts? not obvious to me from the .yaml syntax why that is/what we could do
> about it.

Wasn't the yaml, it was my fault.

(I semi-regularly do "make distclean defconfig toybox tests" but that stops at
first failure with a visible result, so I hadn't noticed the return code...)
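
A minimal sketch of the general failure mode, not the actual toybox test
scripts (run_lossy and run_tracking are made-up names): a subshell's exit
status is that of its last command, so an earlier failure vanishes unless
something records it and exits with it.

```shell
# Hypothetical stand-ins for a test runner; not toybox's real scripts.
run_lossy() {
  # The subshell's status is that of the LAST command, so the failure
  # of "false" is lost.
  ( false; true )
}

run_tracking() {
  # Record any failure and exit the subshell with it.
  (
    rc=0
    false || rc=1
    true || rc=1
    exit $rc
  )
}

lossy_status=0;    run_lossy    || lossy_status=$?
tracking_status=0; run_tracking || tracking_status=$?
echo "lossy=$lossy_status tracking=$tracking_status"
```

The first runner reports success even though one of its commands failed; the
second carries the failure out to the caller.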

That said, I'm not sure the Mac tests failing is really something we're ready to
call a failure yet?

>     I can't immediately think of a better short-term fix, with the possible
>     exception of tagging tests as "known to fail on macos because macos is buggy".
>     (And "we extended a zero length file three times with truncate() and along the
>     way it allocated a megabyte of storage to store LITERALLY NO DATA" sounds like a
>     bug to me. I am neither interested in fixing nor reporting MacOS bugs because
>     they're 100% proprietary with 0% open source input, and they ain't paying me to
>     make them richer thanks. For the same reason, I don't want to put a lot more
>     cycles into _thinking_ about macos either.)
> 
>     The mkroot stuff is all about "I can mount ext2 or tmpfs to run this test on and
>     have exactly known behavior". I understand "somebody ran the test on xfs and it
>     behaved differently than any other filesystem so far", but I think this is a bug
>     in the VFS layer in a test environment I haven't got. When porting tests into
>     mkroot, I'd presumably do some annotation for "this test runs in the
>     known/mkroot environment" anyway, and logically I'd tag the ones that have known
>     problems outside that environment, whatever those problems may be...
> 
> or we could have a more specific fs-specific "skipnot", since "what fs is this?"
> seems to be one of the most common problems.

Is there a portable way to determine filesystem type, though? df . doesn't say,
I have to look in /proc/mounts and I doubt mac has that?

$ grep -w "^$(df . | tail -n 1 | toybox cut -DF 1)" /proc/mounts | toybox cut -DF 3

(I THINK that if the device has a space in it df will output the escaped
version, which should match for grep... But again: Linux.)
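
A sketch of the same lookup as a function, assuming a Linux-style mounts table
(device, mountpoint, type as the first three fields). fstype_of and its
optional file argument are made-up names so it can be tried against a sample
file, it uses awk rather than toybox cut, and it does nothing to answer the
Mac question.

```shell
# fstype_of DEVICE [MOUNTS_FILE]: print the filesystem type recorded for
# DEVICE in a /proc/mounts style table (fields: device mountpoint type ...).
# Hypothetical helper; the table format is a Linux-only assumption.
fstype_of() {
  # Pass the device through the environment so awk compares it verbatim.
  dev="$1" awk '$1 == ENVIRON["dev"] { print $3; exit }' "${2:-/proc/mounts}"
}

# Try it on a made-up sample table instead of the live one.
sample="$(mktemp)"
printf '%s\n' \
  '/dev/sda1 / ext4 rw,relatime 0 0' \
  'tmpfs /tmp tmpfs rw,nosuid 0 0' > "$sample"
fstype_of /dev/sda1 "$sample"    # prints ext4
fstype_of tmpfs "$sample"        # prints tmpfs
rm -f "$sample"
```

Against the live table the device would come from df, something like
fstype_of "$(df . | tail -n 1 | awk '{print $1}')".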

>     Another TODO item is packing up debootstrap and alpine root filesystems to test
>     under mkroot as more rigorous "TEST_HOST=1" runs. With the kernel
>     config/version, mountpoint selection, and qemu board emulation parameters.
>     Presumably running my init script instead of theirs to do the setup and start
>     the test, but using their $PATH of binaries (gnu/fsf and busybox, respectively).
>     But that's after I get the base mkroot testing well...
> 
>     > note that it's /etc/*group* that's weird, not *passwd*. uid 0 is root, but
>     > group 0 is wheel. (i think that's true of all bsds?)

I suspect the easy solution is "skipnot grep -qw root /etc/group".

Because it's not gonna make the same tarball otherwise, and this is testing data
fetched out of /etc/passwd. (We have --owner=NAME:ID tests: this isn't that.)
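
A sketch of that check as a function, runnable against sample files.
has_root_group is a made-up name, the sample group files are invented, and
anchoring on the name field is a variation rather than the command above: a
plain grep -w root also matches a line that merely lists root as a member.

```shell
# has_root_group [GROUP_FILE]: succeed if a group literally named "root"
# exists. Hypothetical helper; anchors on the name field instead of using
# grep -w, which would also match "root" in a member list.
has_root_group() {
  grep -q '^root:' "${1:-/etc/group}"
}

# Invented sample files, one in each style.
linux_style="$(mktemp)"; bsd_style="$(mktemp)"
printf '%s\n' 'root:x:0:' 'daemon:x:1:' > "$linux_style"
printf '%s\n' 'wheel:*:0:root' 'daemon:*:1:root' > "$bsd_style"

has_root_group "$linux_style" && echo "linux-style: has root group"
has_root_group "$bsd_style" || echo "bsd-style: no root group"
grep -qw root "$bsd_style" && echo "bsd-style: grep -w still matches"
rm -f "$linux_style" "$bsd_style"
```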

>     I'm not spotting any negative gids in /etc/group on devuan. And I think that
>     violates posix?
> 
>     https://pubs.opengroup.org/onlinepubs/9699919799/functions/chown.html
> 
>     chown(-1) means "don't change". So you can't set it to -1 through the posix
>     specified API.
> 
> they're just taking advantage of a scanf("%u") somewhere else. the interesting
> part is that this means those _aren't_ actually the 64Ki and 64Ki-1 i was expecting:

Wait, scanf("%u") will accept an input starting with a minus sign?

Yes, posix says so, and the boilerplate says they got that from C99. They defer
to strtoul() which says "If the subject sequence begins with a <hyphen-minus>,
the value resulting from the conversion shall be negated" but says nothing about
what strtoul() should then do with the result...

(I'm interested because the C++ loons who hijacked gcc development declared
signed integer wrapping to be "undefined behavior", a thing I noticed a few
years back when the compiler "optimized out" code that did that with a constant
at compile time. I had to typecast it to unsigned and then back again to do the
math...)

> ~$ id nobody
> uid=4294967294(nobody) gid=4294967294(nobody)
> groups=4294967294(nobody),12(everyone),61(localaccounts),100(_lpoperator)
> 
> but the small positive ones look okay?

Linux uses 65534 for nobody, and it's the OTHER magic UID in Linux hardwired
into the kernel. Well, these days they give you a gratuitous sysctl to change
it, but I've never seen it used. And they don't have an equivalent sysctl to
move root off of UID 0.

include/linux/highuid.h:#define DEFAULT_OVERFLOWUID	65534
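
Arithmetically, both numbers are -2 seen through different widths. A quick
sketch, assuming shell arithmetic wider than 32 bits (true of bash on 64-bit
hosts); as_unsigned is a made-up helper:

```shell
# as_unsigned VALUE BITS: VALUE reinterpreted as an unsigned number of the
# given width, by masking off the higher bits. Hypothetical helper.
as_unsigned() {
  echo $(( $1 & ((1 << $2) - 1) ))
}

as_unsigned -2 32   # 4294967294, the uid "id nobody" printed above
as_unsigned -2 16   # 65534, the value of DEFAULT_OVERFLOWUID
```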

It's kind of like the new ping API needing you to set
/proc/sys/net/ipv4/ping_group_range before it can be used. Linux kernel
development collapsed into a pile of bureaucracy some years back. Kind of sad.

They just broke my "build with gelf" fix for a third time:

/home/landley/toybox/cleanser/root/build/x86_64-tmp/linux/tools/objtool/include/objtool/elf.h:10:10:
fatal error: gelf.h: No such file or directory

No other architecture requires you to install a magic extra elf package on your
host, but the x86 maintainers are insane:

https://lore.kernel.org/lkml/20211024192742.uo62mbqb6hmhafjs@treble/

(No other target needs this. I'm building kernels to run in a dedicated
environment. I don't need spectre/meltdown mitigation, and Linux has THREE
COPIES of elf plumbing in the tree already before this! Pulling in an external
package to do ELF stuff is just SAD here...)

>     $ rm -f empty; for i in 1k 1m 1g; do truncate -s $i empty; stat -c %b empty;
>     done; ls -l empty
>     0
>     0
>     0
>     -rw-r--r-- 1 landley landley 1073741824 Jun 25 17:15 empty
> 
>     I'm guessing it's not gonna say 0.
> 
> ~/toybox$ rm -f empty; for i in 1k 1m 1g; do ./toybox truncate -s $i empty;
> ./toybox stat -c %b empty; done; ls -l empty
> 0
> 2048
> 2048
> -rw-r--r--  1 enh  primarygroup  1073741824 Jun 27 12:38 empty
> ~/toybox$

So "truncate -s 1k empty; truncate -s 1m empty" allocates a megabyte of disk space.

The second truncate makes the file NOT SPARSE.

> the weird part for me was that it wasn't obvious _what_ the non-zero number was
> going to be. 

The bug seems to be if you extend a sparse file the result is not sparse. I'm
guessing if you go:

truncate -s 1k empty; truncate -s 2m empty; stat -c %b empty

You'll get 4096. (Because 2048*512=1m and that's what we asked it to expand the
sparse file to.)
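
The arithmetic behind that guess, as a sketch: %b counts 512-byte units here,
so a file with every byte backed by storage should show size/512 blocks.
blocks_for is a made-up helper, and this only computes the expectation, it
doesn't ask a Mac.

```shell
# blocks_for BYTES: number of 512-byte blocks needed to hold BYTES,
# rounded up. Hypothetical helper for the expectation only.
blocks_for() {
  echo $(( ($1 + 511) / 512 ))
}

blocks_for $((1024 * 1024))       # 2048, matching the 1m result above
blocks_for $((2 * 1024 * 1024))   # 4096, the guess for 2m
```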

>     >     They swapped to zsh three years ago:
>     >
>     >     https://www.theverge.com/2019/6/4/18651872/apple-macos-catalina-zsh-bash-shell-replacement-features
>     >
>     > oh, yeah, good point. my reaction to that was similar to your reaction to
>     > dash. _i'll_ be using bash on macos until they remove it.
> 
>     All my stuff says #!/bin/bash at the top, but I dunno how github is running
>     what...
> 
> just `VERBOSE=all make tests`

Which might be fixed now, for a definition of "fixed" that means "notices it's
failing".

Lateral progress!

>     >     I'm tempted to borrow my wife's mac for a bit, but I have no idea how to
>     >     set up a development environment on a mac. The first google hit is
>     >     https://sourabhbajaj.com/mac-setup/Xcode/ which looks... more elaborate
>     >     than I want to do on a borrowed machine.
>     >
>     > iirc it's a bit simpler than that (if you don't have some company policy that
>     > says you can only install binaries from their servers) --- you just run "make"
>     > and it pops up a window saying "you want to install all that shit?" and you
>     > say "it's a unix system; i know this", bish bash bosh, job done.
> 
>     Except I can't easily _undo_ it afterwards and don't want to eat I dunno how
>     many gigs on my wife's machine with Apple's soldered-in ssd.
> 
> yeah, it's not small.

It's a pity there isn't a mac dev environment I can ssh into. I can think of
three Linux ones off the top of my head that are still up which I probably have
credentials for. (Make that five. Probably more if I thought about it.) But then
Fabrice Bellard's jslinux runs a Linux vm in a web page, so "lemme try this out
really quick" was never a high bar for Linux.

Mac, not so much...

Rob