Rob Landley rob at landley.net
Fri Jan 19 10:20:00 PST 2024

On 1/16/24 19:22, enh wrote:
> On Sat, Jan 13, 2024 at 12:38 PM Rob Landley <rob at landley.net> wrote:
>> On 1/12/24 14:25, enh via Toybox wrote:
>> > thanks for keeping the uncompressed path!
>> Always the plan. Besides, if the binary lives on something like squashfs,
>> decompressing it twice just wastes CPU. And the standalone command builds
>> wouldn't really benefit either without a shared lib/lib.so.
>> (Sigh. You're going to want a shared toylib.so, aren't you...)
> why? we use the multicall binary with symlinks.
> (the most interesting question i have in that area is "should we have
> _two_ binaries with different selinux labels, so we can differentiate
> 'available to apps' and 'available to adb shell'?", but that's a bunch
> of work i'm not sure anyone will ever have time to do,

Creating the binaries isn't a big deal, it's just two .config files. I couldn't
speak to the selinux labels and whatever $PATH changes pull in the second
directory of symlinks on the android side.

I'm assuming the problem here is Android's policy of snapshotting the
"generated" directory instead of allowing a shell script to call sed to
regenerate the files. What is the actual policy/objection?

My _theory_ is you don't want to compile external C code and run it on your
build server for security reasons. Which is understandable if so, but does that
mean you patched all the "HOSTCC" calls out of the linux kernel build?

My android toybox checkout is a bit stale (November 10) and this airport hasn't
got internet, but the android/linux/generated I've got lying around has:

  config.h  flags.h  globals.h  help.h  newtoys.h  tags.h

(Query: why does your .gitignore not have "generated" instead of 10 different
things under generated? Also, why "change/" but "/kconfig"?)

Two of those headers, flags.h and help.h, are washed through C code. The rest
(config.h, globals.h, newtoys.h, tags.h) are all created by echo and sed.

I note that config.h is _always_ rebuilt from .config by scripts/make.sh
(presumably overwriting your snapshot version) because the dependency is
commented out:

  #TODO: "make $SED && make" doesn't regenerate config.h because diff .config
  if true #isnewer config.h "$KCONFIG_CONFIG"

I.E. config.h doesn't record _which_ .config file it was produced from, so
switching between single and full builds confused it: "this file is newer
than that file" doesn't help when "this file" is a moving target. So I just
commented out the isnewer and put "true" in there, with a TODO to do more design
work here.
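
A minimal sketch of the failure mode described above, under stated assumptions: the isnewer below is a hypothetical stand-in (not the helper from the toybox scripts), and single.config/full.config are made-up file names. It shows a timestamp-only check deciding to skip a rebuild even though the header was generated from a different .config:

```shell
# Hypothetical helper: succeeds when $1 is missing or $2 is newer than $1.
isnewer()
{
  [ ! -e "$1" ] || [ "$2" -nt "$1" ]
}

dir="$(mktemp -d)"
echo "CONFIG_LS=y" > "$dir/single.config"
echo "CONFIG_TOYBOX=y" > "$dir/full.config"

# Both .config files are older than the header...
touch -t 202401010000 "$dir/single.config" "$dir/full.config"
# ...and the header was (notionally) generated from single.config.
touch -t 202401020000 "$dir/config.h"

# Now switch to full.config: the header content is stale, but the
# timestamp comparison has no way to notice that.
if isnewer "$dir/config.h" "$dir/full.config"
then result=rebuild
else result=skip
fi
echo "$result"
rm -rf "$dir"
```

Always regenerating sidesteps the question of which .config the existing header came from.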

While I could add a comment to config.h to say which file it was from and teach
the script to parse that comment... that's brittle and ugly. An elegant fix
_removes_ complexity, and my todo item here is actually "try to parallelize
header file creation and always do it". Which I haven't because you snapshot
headers, and I need to find out why.

> and the app
> compat issues of trying to make that split would be a lot of trouble.

This I couldn't speak to. (Presumably the issue is weaning apps that
inappropriately use commands off of them, since they're no longer in the $PATH?)

What would the two pools be, anyway? It seems reminiscent of the /bin vs /sbin

> i assume. i don't actually have any idea, or any good way of knowing,
> what apps are calling what toys.

I've done this already for system bootstrapping, mkroot/record-commands is a bit
overkill for this, but the technique could presumably be scaled down to set a
bit in a scoreboard or something. (I needed to know the command line so I could
reproduce/debug behavior divergences, if you just want to know which files got

Or if I get the strong/weak symbol changes in, a wrapper around toy_singleinit()
or similar could live in lib/portability.c and do extra setup before/after
calling the original. Although the more logical thing to do THERE might be to
have bionic's dynamic linker do it so you could log ALL executable launches.
(Fire off a thread to record it and it shouldn't add measurable latency on an
SMP system, plus exec isn't _that_ common and already fairly expensive as
operations go. You zygote everything already to avoid it coming up much...)
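
The general "log the command line, then run the real thing" idea can be sketched in a few lines of shell. This is only an illustration: log_and_run and LOGFILE are hypothetical names, and this is not how mkroot/record-commands is implemented.

```shell
# Hypothetical wrapper: append the command line to $LOGFILE, then run it.
# With LOGFILE unset, the log line is discarded.
log_and_run()
{
  printf '%s\n' "$*" >> "${LOGFILE:-/dev/null}"
  "$@"
}

# Usage: LOGFILE=/tmp/commands.log; log_and_run ls -l /
```

A scoreboard variant would record only the command name ("$1") instead of the full argument list.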

> if i had my time again, i'd be
> tempted to make everything in /bin only accessible to the shell,
> because tbh most of what i've seen apps do is very stupid! although
> there's selection bias there: "why would i even be looking at what an
> app's doing if it isn't doing something wrong/stupid?".)

A more posix-like programming environment doesn't strike me as a bad thing, but
I'm biased. :)

Debian not having /sbin in non-root users' $PATH is something I find personally
annoying, but also a reasonably strong precedent for saying "these commands
normal users are not expected to touch".

>> Which admittedly has a giant "apple(tm) version skew" warning in the middle but
>> I honestly have no idea how to fix that: mknodat() is a posix-2008 function:
>> https://pubs.opengroup.org/onlinepubs/9699919799.2008edition/functions/mknodat.html
>> Which Apple is now claiming it only added October of 2022?
>> https://en.wikipedia.org/wiki/MacOS_Ventura
>> I mean... Really? They didn't catch up to posix-2008 for FIFTEEN YEARS? Steve
>> Jobs was still alive for almost four years after that came out...
> if that surprises you ... "obviously you're not a golfer".
> don't get me started on how long we had to wait for clock_gettime().
> that alone has to be responsible for half the macos #ifdefery on the
> entire internet!

The seven year time horizon does not apply to mac, because I haven't got the
domain expertise.

>> > ```
>> > In file included from toys/posix/who.c:26:
>> > In file included from ./toys.h:8:
>> > ./generated/config.h:695:9: warning: 'CFG_TOYBOX_ZHELP' macro
>> > redefined [-Wmacro-redefined]
>> > #define CFG_TOYBOX_ZHELP 0
>> >         ^
>> > ./generated/config.h:691:9: note: previous definition is here
>> > #define CFG_TOYBOX_ZHELP 1
>> >         ^
>> Emitted into config.h twice. Odd.
>> The mac build I just did has just:
>> #define CFG_TOYBOX_ZHELP 0
>> #define USE_TOYBOX_ZHELP(...)
>> The first of which is line 693.
>> This is generated from the .config file via the giant "legacy compatible" sed
>> invocation on line 161 of scripts/make.sh, looking for "CONFIG_BLAH=y" and
>> "# CONFIG_BLAH is not set" lines, to produce the 0 and blank defines, or the 1
>> and __VA_ARGS__ defines.
>> Unless the sed hiccuped (unlikely), that says your .config has two instances of
>> the same symbol. And given that they're emitted in pairs (CFG and USE macros for
>> each symbol), there's another symbol between the redundant ones you've got. So
>> you might have something like
>> # CONFIG_ZHELP is not set
>> In your .config? (Which I don't think the old 2.6.12 kconfig plumbing is
>> _capable_ of emitting, but you might get if you manually patched your .config file?)
> yeah, that's almost certainly it. and, yes, luckily my /tmp is still
> intact, so i can confirm that:
> ```
> /tmp/toybox-help$ grep ZHELP .config
> # CONFIG_TOYBOX_ZHELP is not set
> /tmp/toybox-help$
> ```

See "always generating the headers", above.
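
For reference, the .config to config.h mapping described in the quoted text (CFG_ and USE_ macro pairs per symbol) amounts to roughly the following. This is a simplified sketch, not the actual sed invocation from scripts/make.sh; config_to_header is a hypothetical name, and it only handles "=y" and "is not set" lines.

```shell
# Hypothetical, simplified stand-in for the real conversion.
# Reads .config lines on stdin, writes macro pairs on stdout.
config_to_header()
{
  sed -n \
    -e 's/^# CONFIG_\([A-Za-z0-9_]*\) is not set$/#define CFG_\1 0\
#define USE_\1(...)/p' \
    -e 's/^CONFIG_\([A-Za-z0-9_]*\)=y$/#define CFG_\1 1\
#define USE_\1(...) __VA_ARGS__/p'
}

# Usage: config_to_header < .config > generated/config.h
```

Because each matching input line produces its own pair of defines, a symbol that appears twice in the input would be defined twice in the output.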

If the objection _is_ to compiling and running C code on the build server, I've
already been poking at moving help.h to sed because of the pending bug reports
about that.

While I'm there I'm tempted to strip the repeated "usage: $COMMAND " text out of
the start of every single help entry, saving a dozen or so bytes times 200+
commands, but then it doesn't show up right in the kconfig help.

I wound up doing the gzip compression instead, because repeated text is the
definition of compressible, but I still have the issue that shared
implementations with the same help text (ala md5sum/sha1sum or chgrp/chown) have
the same usage: line despite having different command names.
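
A quick way to see why compression covers most of the repeated-prefix cost, as a sketch with made-up help text (the cmdN names and the option string are placeholders, not real toybox help entries):

```shell
# Build 200 fake help entries that all start with the same "usage: " prefix.
raw="$(
  i=0
  while [ "$i" -lt 200 ]
  do
    printf 'usage: cmd%s [OPTION]... [FILE]...\n' "$i"
    i=$((i+1))
  done
)"

# $(( )) normalizes any padding some wc implementations emit.
raw_size=$(( $(printf '%s' "$raw" | wc -c) ))
gz_size=$(( $(printf '%s' "$raw" | gzip -c | wc -c) ))

echo "raw: $raw_size bytes, gzipped: $gz_size bytes"
```

The exact numbers depend on the gzip implementation, so none are quoted here.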

Properly fixing this involves replacing kconfig, which is on the todo list
anyway but WAY TOO BIG a digression right now. (Finish shell, build LFS, THEN
worry about it.)

