<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Jan 19, 2024 at 5:49 PM enh <<a href="mailto:enh@google.com">enh@google.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Fri, Jan 19, 2024 at 10:13 AM Rob Landley <<a href="mailto:rob@landley.net" target="_blank">rob@landley.net</a>> wrote:<br>
><br>
> On 1/16/24 19:22, enh wrote:<br>
> > On Sat, Jan 13, 2024 at 12:38 PM Rob Landley <<a href="mailto:rob@landley.net" target="_blank">rob@landley.net</a>> wrote:<br>
> >> On 1/12/24 14:25, enh via Toybox wrote:<br>
> >> > thanks for keeping the uncompressed path!<br>
> >><br>
> >> Always the plan. Besides, if the binary lives on something like squashfs,<br>
> >> decompressing it twice just wastes CPU. And the standalone command builds<br>
> >> wouldn't really benefit either without a shared lib/lib.so.<br>
> >><br>
> >> (Sigh. You're going to want a shared toylib.so, aren't you...)<br>
> ><br>
> > why? we use the multicall binary with symlinks.<br>
> ><br>
> > (the most interesting question i have in that area is "should we have<br>
> > _two_ binaries with different selinux labels, so we can differentiate<br>
> > 'available to apps' and 'available to adb shell'?", but that's a bunch<br>
> > of work i'm not sure anyone will ever have time to do,<br>
><br>
> Creating the binaries isn't a big deal, it's just two .config files. I couldn't<br>
> speak to the selinux labels and whatever $PATH changes pull in the second<br>
> directory of symlinks on the android side.<br>
><br>
> I'm assuming the problem here is Android's policy of snapshotting the<br>
> "generated" directory instead of allowing a shell script to call sed to<br>
> regenerate the files. What is the actual policy/objection?<br>
<br>
it's that i don't want to duplicate your build system in our build<br>
system. i _can_ write a "genrule" that calls things (including c<br>
programs we've compiled), and do so in cases like<br>
<a href="https://cs.android.com/android/platform/superproject/main/+/main:external/one-true-awk/Android.bp" rel="noreferrer" target="_blank">https://cs.android.com/android/platform/superproject/main/+/main:external/one-true-awk/Android.bp</a><br>
say, but the toybox stuff is orders of magnitude more complicated than<br>
that.<br>
<br>
> My _theory_ is you don't want to compile external C code and run it on your<br>
> build server for security reasons.<br>
<br>
no, there are rules against checking in "blobs" of any kind, but the<br>
assumption is that imported source has been reviewed anyway. (that's<br>
why two googlers have to +2 a change written and uploaded by an<br>
outsider now --- you'd have to corrupt me _and_ someone else to get<br>
your dodgy change in. it's also why only googlers can kick off a<br>
presubmit build.)<br>
<br>
> Which is understandable if so, but does that<br>
> mean you patched all the "HOSTCC" calls out of the linux kernel build?<br>
<br>
unlike Android proper, which is no longer investigating bazel, the<br>
kernel build fully switched to bazel, and doesn't use the upstream<br>
build at all. (but there's a whole team working on the kernel, whereas<br>
toybox is just me on friday afternoons, but only when i don't have<br>
anything more urgent.)<br>
<br>
> My android toybox checkout is a bit stale (November 10) and this airport hasn't<br>
> got internet, but the android/linux/generated I've got lying around has:<br>
><br>
> config.h flags.h globals.h help.h newtoys.h tags.h<br>
><br>
> (Query: why does your .gitignore not have "generated" instead of 10 different<br>
> things under generated? Also, why "change/" but "/kconfig"?)<br>
<br>
no idea. <a href="https://android-review.googlesource.com/c/platform/external/toybox/+/2919134" rel="noreferrer" target="_blank">https://android-review.googlesource.com/c/platform/external/toybox/+/2919134</a><br>
brings it mostly back in line. changing how we do the .config-$OS<br>
stuff is a bit ambitious for 17:23 on a friday evening, but that<br>
simpler change should remind me to look at that too next week...<br></blockquote><div><br></div><div>...or the week after that. done: <a href="https://android-review.googlesource.com/c/platform/external/toybox/+/2938299">https://android-review.googlesource.com/c/platform/external/toybox/+/2938299</a> means we no longer have local .gitignore changes (but does mean that you'll break things if you ever add files called config-device, config-linux, or config-mac :-) ).</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
> Two of those headers, flags.h and help.h, are washed through C code. The rest<br>
> (config.h, globals.h, newtoys.h, tags.h) are all created by echo and sed.<br>
><br>
> I note that config.h is _always_ rebuilt from .config by scripts/make.sh<br>
> (presumably overwriting your snapshot version)<br>
<br>
we don't run scripts/make.sh --- we build everything directly via soong.<br>
<br>
> because the dependency is<br>
> commented out:<br>
><br>
> #TODO: "make $SED && make" doesn't regenerate config.h because diff .config<br>
> if true #isnewer config.h "$KCONFIG_CONFIG"<br>
><br>
> I.E. config.h doesn't record _which_ .config file it was produced from, so<br>
> switching between single and full builds confused it because "this file is newer<br>
> than that file" doesn't help when "this file" is a moving target, so I just<br>
> commented out the isnewer and put "true" in there, with a TODO to do more design<br>
> work here.<br>
><br>
> While I could add a comment to config.h to say which file it was from and teach<br>
> the script to parse that comment... that's brittle and ugly. An elegant fix<br>
> _removes_ complexity, and my todo item here is actually "try to parallelize<br>
> header file creation and always do it". Which I haven't because you snapshot<br>
> headers, and I need to find out why.<br>
><br>
> > and the app<br>
> > compat issues of trying to make that split would be a lot of trouble.<br>
><br>
> This I couldn't speak to. (Presumably the issue is weaning apps that<br>
> inappropriately use commands off of them, since they're no longer in the $PATH?)<br>
<br>
exactly.<br>
<br>
> What would the two pools be, anyway? It seems reminiscent of the /bin vs /sbin<br>
> split.<br>
<br>
yeah, that's been the point where i've always been unconvinced. (the<br>
canonical example is people whining that having /bin/netcat makes life<br>
easier for bad guys, which i've never really believed. saves the<br>
_white hat_ folks a tiny amount of effort when writing their PoCs,<br>
maybe, but the black hats? (a) i'd like to see the specific example<br>
where one did that, please and (b) i'd like some evidence that<br>
_that's_ an expensive part of a modern exploit chain, rather than<br>
something the malware factories just have on the shelf anyway.)<br>
<br>
personally, i'd be happier with the "apps get _no_ /bin, shell gets<br>
everything" option, but that probably requires a time machine given<br>
the app compat issues.<br>
<br>
> > i assume. i don't actually have any idea, or any good way of knowing,<br>
> > what apps are calling what toys.<br>
><br>
> I've done this already for system bootstrapping, mkroot/record-commands is a bit<br>
> overkill for this, but the technique could presumably be scaled down to set a<br>
> bit in a scoreboard or something. (I needed to know the command line so I could<br>
> reproduce/debug behavior divergences, if you just want to know which files got<br>
> execed...)<br>
<br>
oh, it's perfectly doable. but -- as you'd imagine and hope -- there's<br>
a _lot_ of paperwork and legal signoff for anything like that, and i<br>
don't think anyone's interested enough in the results to do that work.<br>
<br>
> Or if I get the strong/weak symbol changes in, a wrapper around toy_singleinit()<br>
> or similar could live in lib/portability.c and do extra setup before/after<br>
> calling the original. Although the more logical thing to do THERE might be to<br>
> have bionic's dynamic linker do it so you could log ALL executable launches.<br>
> (Fire off a thread to record it and it shouldn't add measurable latency on an<br>
> SMP system, plus exec isn't _that_ common and already fairly expensive as<br>
> operations go. You zygote everything already to avoid it coming up much...)<br>
><br>
> > if i had my time again, i'd be<br>
> > tempted to make everything in /bin only accessible to the shell,<br>
> > because tbh most of what i've seen apps do is very stupid! although<br>
> > there's selection bias there: "why would i even be looking at what an<br>
> > app's doing if it isn't doing something wrong/stupid?".)<br>
><br>
> A more posix-like programming environment doesn't strike me as a bad thing, but<br>
> I'm biased. :)<br>
<br>
me too. my only interest in "apps get nothing" would be minimizing the<br>
app compat issues of _toybox_ changes. though "luckily" most of the<br>
uses i've seen are stupid enough to be unlikely to be affected by any<br>
plausible toybox behavioral/syntax changes.<br>
<br>
> Debian not having /sbin in non-root users' $PATH is something I find personally<br>
> annoying, but also a reasonably strong precedent for saying "these commands<br>
> normal users are not expected to touch".<br>
><br>
> >> Which admittedly has a giant "apple(tm) version skew" warning in the middle but<br>
> >> I honestly have no idea how to fix that: mknodat() is a posix-2008 function:<br>
> >><br>
> >> <a href="https://pubs.opengroup.org/onlinepubs/9699919799.2008edition/functions/mknodat.html" rel="noreferrer" target="_blank">https://pubs.opengroup.org/onlinepubs/9699919799.2008edition/functions/mknodat.html</a><br>
> >><br>
> >> Which Apple is now claiming it only added October of 2022?<br>
> >><br>
> >> <a href="https://en.wikipedia.org/wiki/MacOS_Ventura" rel="noreferrer" target="_blank">https://en.wikipedia.org/wiki/MacOS_Ventura</a><br>
> >><br>
> >> I mean... Really? They didn't catch up to posix-2008 for FIFTEEN YEARS? Steve<br>
> >> Jobs was still alive for almost four years after that came out...<br>
> ><br>
> > if that surprises you ... "obviously you're not a golfer".<br>
> ><br>
> > don't get me started on how long we had to wait for clock_gettime().<br>
> > that alone has to be responsible for half the macos #ifdefery on the<br>
> > entire internet!<br>
><br>
> The seven year time horizon does not apply to mac, because I haven't got the<br>
> domain expertise.<br>
<br>
well, the nice thing about mac users is that 90% of them will be<br>
running the shiniest thing within 6 months. (and not just OS versions:<br>
i've been amazed how quickly they upgraded to arm64 machines too.) the<br>
trouble is how long it takes before Apple adds a thing.<br>
<br>
> >> > ```<br>
> >> > In file included from toys/posix/who.c:26:<br>
> >> > In file included from ./toys.h:8:<br>
> >> > ./generated/config.h:695:9: warning: 'CFG_TOYBOX_ZHELP' macro<br>
> >> > redefined [-Wmacro-redefined]<br>
> >> > #define CFG_TOYBOX_ZHELP 0<br>
> >> > ^<br>
> >> > ./generated/config.h:691:9: note: previous definition is here<br>
> >> > #define CFG_TOYBOX_ZHELP 1<br>
> >> > ^<br>
> >><br>
> >> Emitted into config.h twice. Odd.<br>
> >><br>
> >> The mac build I just did has just:<br>
> >><br>
> >> #define CFG_TOYBOX_ZHELP 0<br>
> >> #define USE_TOYBOX_ZHELP(...)<br>
> >><br>
> >> The first of which is line 693.<br>
> >><br>
> >> This is generated from the .config file via the giant "legacy compatible" sed<br>
> >> invocation on line 161 of scripts/make.sh, looking for "CONFIG_BLAH=y" and<br>
> >> "# CONFIG_BLAH is not set" lines, to produce the 0 and blank defines, or the 1<br>
> >> and _VA_ARGS_ defines.<br>
> >><br>
> >> Unless the sed hiccuped (unlikely), that says your .config has two instances of<br>
> >> the same symbol. And given that they're emitted in pairs (CFG and USE macros for<br>
> >> each symbol), there's another symbol between the redundant ones you've got. So<br>
> >> you might have something like<br>
> >><br>
> >> CONFIG_ZHELP=y<br>
> >> CONFIG_WALRUS=y<br>
> >> # CONFIG_ZHELP is not set<br>
> >><br>
> >> In your .config? (Which I don't think the old 2.6.12 kconfig plumbing is<br>
> >> _capable_ of emitting, but you might get if you manually patched your .config file?)<br>
> ><br>
> > yeah, that's almost certainly it. and, yes, luckily my /tmp is still<br>
> > intact, so i can confirm that:<br>
> > ```<br>
> > /tmp/toybox-help$ grep ZHELP .config<br>
> > CONFIG_TOYBOX_ZHELP=y<br>
> > # CONFIG_TOYBOX_ZHELP is not set<br>
> > /tmp/toybox-help$<br>
> > ```<br>
><br>
> See "always generating the headers", above.<br>
><br>
> If the objection _is_ to compiling and running C code on the target, I've<br>
> already been poking at moving help.h to sed because of the pending bug reports<br>
> about that.<br>
><br>
> While I'm there I'm tempted to strip the repeated "usage: $COMMAND " text out of<br>
> the start of every single help entry, saving a dozen or so bytes times 200+<br>
> commands, but then it doesn't show up right in the kconfig help.<br>
<br>
(i'd be more interested in fixing the issues with repeated usage<br>
lines/multiple toys that should share the same text --- they're much<br>
more user-visible.)<br>
<br>
> I wound up doing the gzip compression instead, because repeated text is the<br>
> definition of compressible, but I still have the issue that shared<br>
> implementations with the same help text (ala md5sum/sha1sum or chgrp/chown) have<br>
> the same usage: line despite having different command names.<br>
<br>
(heh, yeah, you beat me to it :-) )<br>
<br>
> Properly fixing this involves replacing kconfig, which is on the todo list<br>
> anyway but WAY TOO BIG a digression right now. (Finish shell, build LFS, THEN<br>
> worry about it.)<br>
><br>
> Rob<br>
</blockquote></div></div>