[Toybox] [PATCH] Distinguish 32- and 64-bit variants in file(1) for x32.

enh enh at google.com
Sun Feb 28 08:18:21 PST 2016


On Sun, Feb 28, 2016 at 7:48 AM, enh <enh at google.com> wrote:
> On Sat, Feb 27, 2016 at 10:51 PM, Rob Landley <rob at landley.net> wrote:
>> On 02/27/2016 03:02 PM, enh wrote:
>>> On Sun, Feb 21, 2016 at 8:30 PM, Rich Felker <dalias at libc.org> wrote:
>>>> On Sun, Feb 21, 2016 at 08:42:06PM -0600, Rob Landley wrote:
>>>>> If the script wants to match "Intel 80386" explicitly, then do I have to
>>>>> say that for i686?
>>>>
>>>> I would think it makes sense to preserve the "Intel 80386" convention
>>>> here. There's not even a reliable way to detect that a binary is for
>>>> "i686" anyway.
>>>
>>> i also think it makes sense to use the same names as the GNU file(1),
>>> because for lack of a real standard, "what the file(1) that everybody
>>> is running does" is about as close as we'll get to a standard.
>>
>> build/root-filesystem-powerpc/bin/toybox: ELF 32-bit MSB  executable,
>> PowerPC or cisco 4500, version 1 (SYSV), statically linked, stripped
>>
>>
>> So the name there is "PowerPC or cisco 4500"? PowerPC doesn't have a
>> vendor but cisco does? It's not even capitalized consistently.
>>
>>> yes, their names are an awful, inconsistent historical mess,
>>> but that's how real standards tend to turn out anyway :-)
>>
>> And when Posix does that I document my deviations from the standard. :)
>>
>> At $DAYJOB I'm working on our clean room reimplementation of superh,
>> which we started after Renesas abandoned the platform. I admit having
>> that be identified as "Renesas" rankles a bit. (Especially since it was
>> originally developed by Hitachi, and part of the reason Renesas
>> abandoned it was some "not invented here" politics about stuff they'd
>> inherited before the spin-off vs technology developed afterwards...)
>>
>>> if we think those names are too big a mess to stomach (and i can
>>> certainly understand that POV, at least until using a cleaner set is
>>> proven to cause trouble for actual scripts), then i think using the
>>> constant names from the kernel's uapi/linux/elf-em.h is fine too.
>>
>> Newly introduced platforms tend to have EM_MANUFACTURER_ARCH and then
>> later switch it to EM_ARCH. Here's the commit that did that for
>> Microblaze, for example:
>>
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=69515f8b957a
>>
>> I was going with the arch directories because they don't change as much.
>> I've now refined that with two additional rules:
>>
>> 1) If there were different 32/64 bit directories that were later merged,
>> stick with the old names. This gives us the x86-64 and 386 you wanted,
>> and ppc and ppc64.
>
> and arm and aarch64? as long as the ARM and Intel names are
> distinguishable, i don't care about S/390 or any of the other junk.
>
>> 2) Every numerical ID should have a unique name, which leads to some...
>> "interesting" cases. ("thing" vs "thing-old" is easier, high 0xbeef
>> style numbers vs the low numbers Linux accepts. But some are like tilegx
>> and tilepro, which one is "tile"? I went with the numerically lower one
>> for that. And sparc has three. And mips _still_ says that the 10 value
>> isn't used, which seems to be the case, I checked a mips64 binary and
>> it's also using 0x08...)
>>
>>> (they're at least fairly logical.) i am aware that they say 386 and
>>> PPC, but if we're aiming for full compatibility with everyone else's
>>> file(1) we don't want to go this route anyway!
>>
>> Is there more than one implementation here, or are we just saying
>> "everybody else uses darwinsys.com/file"?
>
> as far as i know, there's only one. i've never come across another,
> even on Mac OS.
>
>>> i don't think that any
>>> _human_ sophisticated enough to be looking at file(1)'s output for an
>>> ELF file is going to be confused by "386" vs "Intel 80386" or "PPC"
>>> for "PowerPC" :-)
>>>
>>> on the other hand i definitely _don't_ think the world needs a _third_
>>> "standard".
>>
>> Posix doesn't standardize this! (Neither does ELF!)
>
> i meant de facto standard.
>
>>> i'm happy to provide a patch for either of "file(1) names" or "kernel
>>> elf-em.h names" if we can agree on which...
>>
>> I actually locally fixed this up several days ago (not necessarily a
>> final thing but the next iteration at least), I just haven't finished
>> going through the rest of Isaac's objections yet.
>>
>> (I'm juggling a half-dozen things again. Trying to catch up this
>> weekend, but today wasn't the day.)
>>
>>> (i can also supply hello world ELF binaries for all six architectures
>>> Android supports, which -- even if you do set up your qemu instances
>>> -- might still be mildly interesting because they have some slightly
>>> different ELF notes than one sees in desktop linux ELF binaries.)
>>
>> That would be very helpful, thank you.
>
> will do.

what exact subset would you like? it's ~2MiB stripped for static and
dynamic, 32 and 64, for the three families. it's ~14MiB for the same
but unstripped instead. and what's the best way to deliver them?
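
for reference, the distinction in the subject line comes down to two
header fields: EI_CLASS (byte 4 of e_ident) says 32 vs 64 bit, and
e_machine (the 16-bit field at offset 18) says which architecture. an
x32 binary is EM_X86_64 with a 32-bit class. a minimal sketch, not the
toybox code: describe_elf, fake_header and the short names in EM_NAMES
are made up for illustration, and the table is only a small subset of
the e_machine values.

```python
import struct

# Illustrative subset of e_machine values. The short names are
# placeholders for this sketch, not toybox's or file(1)'s actual table.
EM_NAMES = {
    3: "386",
    8: "mips",
    20: "ppc",
    21: "ppc64",
    40: "arm",
    62: "x86-64",
    183: "aarch64",
}


def describe_elf(header: bytes) -> str:
    """Describe an ELF file from (at least) its first 20 bytes."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return "not ELF"
    # EI_CLASS: 1 = 32-bit, 2 = 64-bit.
    bits = {1: "32-bit", 2: "64-bit"}.get(header[4], "unknown-class")
    # EI_DATA: 1 = little endian, 2 = big endian.
    endian = {1: "<", 2: ">"}.get(header[5])
    if endian is None:
        return f"ELF {bits} unknown-endian"
    order = "LSB" if endian == "<" else "MSB"
    # e_machine sits at offset 18 in both the 32- and 64-bit layouts.
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    name = EM_NAMES.get(machine, f"unknown machine {machine}")
    return f"ELF {bits} {order}, {name}"


def fake_header(ei_class: int, ei_data: int, machine: int) -> bytes:
    """Build a 20-byte fake ELF header prefix for experimenting (hypothetical helper)."""
    endian = "<" if ei_data == 1 else ">"
    ident = b"\x7fELF" + bytes([ei_class, ei_data, 1]) + bytes(9)  # 16-byte e_ident
    return ident + struct.pack(endian + "HH", 2, machine)  # e_type, e_machine


if __name__ == "__main__":
    print(describe_elf(fake_header(1, 1, 62)))  # x32: 32-bit class, x86-64 machine
    print(describe_elf(fake_header(2, 1, 62)))  # ordinary x86-64
```

reporting the class separately from the machine name is what keeps x32
and x86-64 distinguishable even though they share one e_machine value.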

>> Rob
>
>
>
> --
> Elliott Hughes - http://who/enh - http://jessies.org/~enh/
> Android native code/tools questions? Mail me/drop by/add me as a reviewer.



-- 
Elliott Hughes - http://who/enh - http://jessies.org/~enh/
Android native code/tools questions? Mail me/drop by/add me as a reviewer.
