[Toybox] [PATCH] file: basic Mach-O universal binary support.

Rob Landley rob at landley.net
Wed Sep 1 15:54:01 PDT 2021


On 9/1/21 3:20 PM, enh wrote:
> On Wed, Sep 1, 2021 at 1:04 PM Rob Landley <rob at landley.net> wrote:
> 
>     On 8/31/21 6:23 PM, enh via Toybox wrote:
>     > PowerPC may be dead and gone, but arm64 is the new x86-64, and
>     > x86-64 the new PowerPC :-)
>     > ---
>     >  tests/file.test   |  4 ++++
>     >  toys/posix/file.c | 36 +++++++++++++++++++++++++++++++-----
> 
>     Grrr, clashes a lot with my tests/file.test local changes. (My tree accumulating
>     unfinished changes is yet another tabsplosion-style accumulation of minor
>     technical debt.) Anyway, backed my changes out to apply yours, then applied them
>     again.
> 
>     It's a pity you don't use bash to run the test suite, because $'\x0a\xbc\xde' in
>     the "input" argument would be a more concise way of doing those tests. (Yes, I
>     need to find time to work on toysh again...)
> 
> speaking of which... your use of fancy bash stuff upsets both Android and macOS.
> here's the complaint from macOS, but iirc Android is similar:
> 
> ~/toybox$ make test_file
> scripts/test.sh file
> scripts/runtest.sh: line 217: syntax error near unexpected token `;'
> scripts/runtest.sh: line 217: `      R) LEN=0; B=1; ;&'
> 
> seems not to actually cause trouble in either case, though, so i'd been ignoring
> it for now.

It's a terminator that falls through. Without a terminator it presumably won't
evaluate the next pattern as a jump target, and this is the one that doesn't
"continue".

I'm not sure how to do that otherwise? (I suppose I can if/else staircase it,
which is what I _usually_ do, but I thought I'd be "proper" and use
switch/case...)
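
For reference, the ;& terminator only arrived in bash 4.0, and macOS still ships
bash 3.2, which would explain the syntax error there. One portable way to get the
same effect, short of a full if/else staircase, is two consecutive case
statements: the first does the extra setup, the second runs the shared body for
both patterns. This is only a sketch. The classify function, the X pattern, and
the output strings are made up for illustration; only the LEN=0; B=1 assignments
come from the runtest.sh line quoted above.

```shell
# Hypothetical sketch: emulate ";&" fall-through using only ";;".
# classify(), the X pattern, and the output text are illustrative,
# not taken from scripts/runtest.sh.
classify()
{
  LEN=1 B=0 OUT=

  # First case: the setup that used to precede the fall-through.
  case "$1" in
    R) LEN=0; B=1 ;;
  esac

  # Second case: the shared body, now listing both patterns.
  case "$1" in
    R|X) OUT="shared LEN=$LEN B=$B" ;;
    *) OUT="other LEN=$LEN B=$B" ;;
  esac

  echo "$OUT"
}
```

The cost is that the R pattern is spelled twice, but every terminator is a plain
;; that older bash versions accept.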

>     Your TEST_HOST output didn't match my TEST_HOST output on devuan. Is this
>     version skew in the "file" command or did you not run TEST_HOST? I get:
> 
>       universal: Mach-O universal binary with 2 architectures: [x86_64] [arm64]
> 
>     Which has [name] instead of commas, and uses x86_64 instead of x86-64.
> 
> oh, interesting. *my* host file wasn't giving output that useful. (and macOS
> file is different from linux file here.)

I'm not trying to match the macos TEST_HOST output. :)

> yeah, i'm not particularly wedded to the specific format i used, and macOS used
> [] too, but included more info:
> 
> ~/toybox$ file /bin/sh
> /bin/sh: Mach-O universal binary with 2 architectures: [x86_64:Mach-O 64-bit
> executable x86_64] [arm64e:Mach-O 64-bit executable arm64e]
> /bin/sh (for architecture x86_64): Mach-O 64-bit executable x86_64
> /bin/sh (for architecture arm64e): Mach-O 64-bit executable arm64e

Once upon a time the output of unix commands was designed to be consumed by
other programs.

I miss those days. Before the dark times. Before the <strike>empire</strike> gnu
project.

> (note that that's really three lines for one file, which seems quite a major
> script-breaker!)
> 
> anyway, my choice of x86-64 was just because that's what we've used for
> everything else; ELF, Windows, and even regular single-architecture Mach-O
> files. personally i tend towards internal consistency rather than outward
> consistency when they differ, but the opposite argument is obviously equally
> valid. you can't really win. but if we change mach-o universal binaries, we
> should probably change regular mach-o binaries too?

I can always do horrible regex things to make the test accept both. (egrep -o
"(one|two|three)" and check that I get "one two three" as output, for example.)
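
A rough sketch of that idea applied to the universal binary test: pull out just
the architecture names, fold the x86_64 spelling into x86-64, and compare the
joined result. The archs helper name and the exact pattern are assumptions for
illustration, not what tests/file.test does, and the comma-separated sample
format in the comment is a guess.

```shell
# Hypothetical helper: reduce a line of "file" output to its architecture
# names so that "[x86_64] [arm64]" and an assumed "x86-64, arm64" style
# compare equal. Not taken from tests/file.test.
archs()
{
  grep -Eo 'x86[-_]64|arm64' |  # one matched architecture name per line
    tr '_' '-' |                # normalize x86_64 to x86-64
    paste -sd ' ' -             # join the lines with single spaces
}
```

A test could then pipe either implementation's output through archs and check
for the single string "x86-64 arm64", ignoring brackets and commas.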

>     Hmmm, I have a big list of tests I need to add to file.test and some of them
>     are...
> 
>       $ echo hello > test; chmod 000 test; file test
>       test: regular file, no read permission
>       $ toybox file test
>       file: test: Permission denied
>       test: unknown
> 
>     Eh, how much do we care about matching exactly in the corner cases? Hmmm...
> 
> yeah, there's also fancy stuff like UTF-8 vs ASCII, and guessing the language if
> it's UTF-8, but although they're kind of cool, they do cost time as well as
> code, and i haven't personally had a use for them yet, so i've been ignoring all
> the fancy bits.

Whatever we do is going to be a subset of what the one driven by a config
database does, and I'm fine with that. The question is WHEN we have a test, how
closely should the output match?

And "perfect" seems to be off the table because of version skew even WITHIN the
conventional Linux "file" implementation. Sigh. (This is using file-5.35 from
2018, I'm guessing you're using newer?)

> (the specific motivating case for adding mach-o universal binary support was a
> script that's just checking for "universal binary", so they'd be fine if you
> want to change the output format.)

I have a pending cleanup to go with the updated test file, but I've checked in
yours as is for the moment and don't intend to wander too far from it.

Rob
