[Toybox] CONFIG_TOYBOX_ZHELP
Rob Landley
rob at landley.net
Tue Jan 30 16:32:53 PST 2024
On 1/30/24 16:02, enh wrote:
> Would it help if I pulled out "mkconfig.sh", "mkflags.sh", "mkglobals.sh",
> "mkhelp.sh", and "mknewtoys.sh" from make.sh and had the top level script call
> those?
>
> tbh, it's the fact that stuff keeps moving around that makes it easier for me to
> just check in generated files. if/when it gets to the point where you haven't
> touched this stuff in a couple of years, _that's_ when it might make sense to
> move over :-)
Understandable, but from my point of view lots of it _hasn't_ moved in forever.
The sed invocation I just replaced was still the version I'd added to busybox
back in like 2006.
But the file being one big hairball means when I change any part of it, yes it
changes.
There were 5 commits to it since the start of 2023, the most recent of which was
me rewriting that sed invocation on sunday, and the one before on January 10
added the new gzip help text header.
Before that, we jump back nine months to last March: two commits fixing macos
performance issues (one because their bash doesn't have -n, the other switching
a loop that called sed several times to one big "sed */*.c" because homebrew
command startup latency is enormous).
The change before that (January a year ago) was adding || true at the end and a
comment "# Ensure make wrapper sees success return code".
And then we're into 2022.
Only the most recent change affected how any of those 5 headers were generated.
(One commit did add a sixth, but you don't use it.)
That said, I have some pending changes _now_ (like that flags.h thing and
increased parallelism), because you said you weren't using it. :)
> Four of the 6 headers are honestly just echo+sed invocations. And given even the
> config.h generation is using $SED (I.E. gsed) instead of "sed", I should just:
>
> sed 's/^# CONFIG_\(.*\) is not set.*/#define CFG_\1 0\n#define
> USE_\1(...)/;T;s/CONFIG_\(.*\)=y.*/#define CFG_\1 1\n#define USE_\1(...)
> __VA_ARGS__\n/;T;d' .config
>
> (despite having looked it up last time i tried to understand this stuff, i still
> don't remember what T means.
I find toybox's "sed --help" to be WAY easier to find stuff in than any of the
gnu crap, or posix. (How the gnu man pages manage to be a teaser advertisement
for their info pages... HOW DO YOU REIMPLEMENT CAPITALISM WHILE FIGHTING AGAINST
IT? You have become what you fought against. You can stop now.)
The two t/T commands are "test and jump". As with all sed jumps, if you don't
give it a label it jumps to the end of the script (advancing to the next line of
input). Lower case is "test+jump if last s/// matched" and upper case is
"test+jump if last s/// did NOT match".
The above script was written off the top of my head and hadn't been tested, and
I got the test backwards (should have been lower case t rather than upper case),
but it was just an unnecessary optimization and the one I checked in didn't have
the T at all, just:
  # Rebuild config.h from .config
  $SED -En $KCONFIG_CONFIG > "$GENDIR"/config.h \
  -e 's/^# CONFIG_(.*) is not set.*/#define CFG_\1 0\n#define USE_\1(...)/p' \
  -e 's/^CONFIG_(.*)=y.*/#define CFG_\1 1\n#define USE_\1(...) __VA_ARGS__\n/p'\
  || exit 1
Two s/// search-and-replace commands, one to match "^# CONFIG..." and replace
with two #define lines, and one to match "^CONFIG..." and replace with two
#define lines. (I don't need a 't' because the second search will never match on
the first search's output.)
That phrasing even more-or-less stays in 80 columns. :)
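To see what those two s/// commands emit, here is a small self-contained run against a made-up .config. The FOO/BAR/BAZ symbols are hypothetical (not real toybox options), and this assumes a sed that understands -E and \n in the replacement, such as GNU sed or toybox sed:

```shell
# Hypothetical three-line .config; the symbol names are invented for the demo.
sample_config='# CONFIG_FOO is not set
CONFIG_BAR=y
CONFIG_BAZ="some string"'

# Same two expressions as the checked-in version, reading stdin instead of
# $KCONFIG_CONFIG and capturing the result instead of writing config.h.
config_h=$(printf '%s\n' "$sample_config" | sed -En \
  -e 's/^# CONFIG_(.*) is not set.*/#define CFG_\1 0\n#define USE_\1(...)/p' \
  -e 's/^CONFIG_(.*)=y.*/#define CFG_\1 1\n#define USE_\1(...) __VA_ARGS__\n/p')

printf '%s\n' "$config_h"
```

The disabled symbol becomes a 0 define plus an empty USE_ macro, the enabled one becomes a 1 define plus a pass-through USE_ macro, and the string-valued line matches neither pattern so -n drops it.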
> that's probably something mostly known to people
> who've implemented their own sed twice. i mean, BSD/macOS seds don't even know
> what it means :-) )
Lower case t is posix. Upper case T is a gnu/dammit extension. You can always
phrase your script with JUST the test-if-matched from posix, but you wind up
sticking in extra labels ala:
  s/potato/aardvark/
  t skip
  b
  :skip
  s/walrus/ocelot/
b means "branch", it's an unconditional jump-to-label. Again with no label, it
jumps to the end of the script.
Yes, that kind of nonsense was why the original sed script was so long. Using
ONLY posix features you can still do everything, but it's really verbose.
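A quick way to check that the T form and the posix-only t/b/label form really do the same thing is to run both over the same input. This is a minimal sketch with made-up input lines; the first command assumes a sed that implements T (GNU sed or toybox sed), the second should work on any posix sed:

```shell
sample='potato walrus
walrus only'

# Extension form: T ends the script for lines where s/potato/ did NOT match.
ext_out=$(printf '%s\n' "$sample" |
  sed -e 's/potato/aardvark/' -e T -e 's/walrus/ocelot/')

# posix form: t jumps to :skip when s/potato/ matched, otherwise the bare b
# ends the script, so only matching lines reach the second s///.
posix_out=$(printf '%s\n' "$sample" |
  sed -e 's/potato/aardvark/' -e 't skip' -e b -e ':skip' \
      -e 's/walrus/ocelot/')

printf '%s\n' "$ext_out"
printf '%s\n' "$posix_out"
```

Both print "aardvark ocelot" for the first line and leave "walrus only" untouched, because the walrus substitution only runs on lines where the potato one matched.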
> Weaning help.h off of C is something I've been working towards, because the
> design idea behind the current version was sub-options need to be stitched
> together, and now I'm going "maybe some sort of ${SUBOPT}" escape syntax to let
> one command know when/where to block copy in another command's help text?
> (Because the -Z aren't going away. They're not SELECTABLE, but they're THERE.)
>
> yeah, i'd wondered about that exact same idea. seems like it would help with the
> md5sum-type duplication too, if you could just "#include" another command's help
> in all the same-interface-different-name commands' help.
Not another command's help, but another config option's help.
The main thing stopping me from trying to generate usage: lines from NEWTOY()'s
optstr is that it doesn't know what any of the named arguments are called, ala:
  usage: sed [-inrszE] [-e SCRIPT]...|SCRIPT [-f SCRIPT_FILE]... [FILE...]
It doesn't know that -e is "SCRIPT" or that the varargs are "FILE".
And in that instance, it doesn't know that the first argument is an implicit -e
argument if none are specified. (Although that's a common enough pattern I could
add a marker for it, grep does it too...)
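The missing-names problem shows up even in a toy generator. The sketch below is hypothetical and deliberately simplified: it takes a getopt-style string where ':' just means "takes an argument", which is NOT the full optstr syntax lib/args.c parses, and the function name usage_from_optstr is invented for illustration:

```shell
# Hypothetical helper: build a usage line from a simplified option string.
# Only handles single-letter options and a trailing ':' for "has argument".
usage_from_optstr() {
  cmd=$1 rest=$2 flags= args=
  while [ -n "$rest" ]; do
    opt=${rest%"${rest#?}"}   # first character of what's left
    rest=${rest#?}
    case $rest in
      :*) args="$args [-$opt ARG]"; rest=${rest#:} ;;  # name unknown: "ARG"
      *)  flags="$flags$opt" ;;
    esac
  done
  echo "usage: $cmd${flags:+ [-$flags]}$args"
}

usage_line=$(usage_from_optstr sed 'inrszEe:f:')
printf '%s\n' "$usage_line"
```

That prints "usage: sed [-inrszE] [-e ARG] [-f ARG]": the flags come out fine, but there is nothing in the string to say -e wants SCRIPT, -f wants SCRIPT_FILE, or that FILE... follows.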
I've got a bunch of todo items like this where I should go through and try to do
a thing, but I just spent multiple days rewriting grep -w because I'd missed
some corner cases. :)
> (fwiw, unless you're really anal about every last help byte -- which i don't
> think you are, plus you have compression now -- i personally quite like the
> coreutils option of just having -Z all the time, but on some systems it just
> prints an error message. similar to the old
> https://en.wikipedia.org/wiki/Bruce_Tognazzini advice for GUIs about not
> hiding or even disabling invalid options --- have everything "doable" all the
> time, and explain to the user why it's not currently valid if they use it when
> [in most GUIs] it would have been greyed out.)
Currently the ls --help _does_ have -Z all the time.
In this case "greyed out" would be something like:
    -p  put '/' after dir names     -q  unprintable chars as '?'
    -R  recursively list in subdirs -s  storage used (in --block-size)
   [-Z  security context]
    output formats:
    -1  list one file per line      -C  columns (sorted vertically)
But that's probably more trouble than it's worth (and not obvious enough).
> The design questions of what the escapes should look like
>
> heh, the reason i don't think i'd mentioned this idea to you was that i thought
> it would be less likely to end up a bikeshed ... i'm happy to pretend to have a
> strong opinion if it gets you out of the
> https://en.wikipedia.org/wiki/Buridan%27s_ass problem :-)
Butterfly effects get us out of that.
I have a zillion things like this on my todo heap, which aren't necessarily
hard, the problem is closing tabs and getting to 1.0. I have too many currently
open cans of worms, need to clear work space...
> Redoing mkflags with shell script is unlikely to happen soon, in part because
> that one DOES vary by .config. Although...
>
> #define FLAG_x ((FORCED_FLAG|CFG_COMMAND)<<1)
Although some of these seem self-contained-ish enough I'm tempted to just take a
swipe at them from time to time.
Which is how I wind up with more tabs...
> Making the OPTSTR invariant is also tricksy, but I can pull out a very OLD trick
> which is that when I first added USE() macros to busybox I also had SKIP()
Question of whether it's worth it for just that use. I could also have runtime
"jump over this" annotations in the string inside a USE() macro to be parsed by
lib/args.c, ala
  NEWTOY(blah, "abc" USE_POTATO("\003")"def" "ghi", FLAGS)
But... ew? Maybe an encapsulating macro with wax-on wax-off indicators before
and after a substring? (NO THEY WOULD NOT NEST. Ew.)
Sigh, I know it's possible to get a macro to expand to another macro that's
evaluated in sequence, but I have to relearn how each time. I _think_ the
abc##def glue together syntax can expand to a macro that gets evaluated? I
_think_ I've done it before...
> Ahem, AFTER a release.
What he said.
> Alas, when SMP was invented, the compiler did NOT get extended to automatically
> fork off sub-processes for each .c so "cc *.c -o thingy" would naturally take
> advantage of SMP. Instead they taught MAKE to do it, which was just wrong.
>
> (i'm not sure what part of "do the easy thing" and "unix" you don't think go
> together. and to be fair, i've seen a lot of compilers for several different
> languages try to move parallelism into the compiler with relatively little
> success.
My objection is adding -j to make was not LESS work than adding -j to cc.
And it doesn't even require threading, it can be multiple processes. They
already have a "pipe the cc output to as" logic in existing compilers...
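The multiple-processes version is easy enough to sketch in shell. This is a hypothetical wrapper (parallel_cc is an invented name, not something any compiler ships), it assumes the compiler accepts the usual -c and -o arguments, and it has no job limit, so it starts one process per source file:

```shell
# Hypothetical helper: compile every source in its own background process,
# then link the objects serially. Usage: parallel_cc OUTPUT FILE.c...
# Assumes no whitespace in file names; set CC to pick the compiler.
parallel_cc() {
  out=$1; shift
  objs= pids=
  for src in "$@"; do
    obj=${src%.c}.o
    ${CC:-cc} -c "$src" -o "$obj" &   # one compiler process per .c file
    pids="$pids $!"
    objs="$objs $obj"
  done
  for pid in $pids; do
    wait "$pid" || return 1           # give up if any compile failed
  done
  ${CC:-cc} $objs -o "$out"           # the link step stays serial
}

# Example invocation (not run here): parallel_cc thingy *.c
```

A real -j would also cap the number of simultaneous jobs, which is where most of the extra logic would go.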
> it's harder than it sounds, especially if you're expecting speedups
> anywhere close to what you get from the external parallelism.)
Back when I had a wrapper to rewrite the gcc command line to use uclibc via
-nostdinc -nostdlib and then explicitly adding the various search paths and
implicit .o files for the type of build you were doing:
https://github.com/landley/aboriginal/blob/master/sources/toys/ccwrap.c
I was considering adding -j support to it. But existing builds didn't offer an
obvious cc *.c invocation, you had to reverse engineer it from a giant makefile
doing irrelevant crap. And their "what to include/exclude" logic was external
not #ifdef based.
It would not have been hard to go a different way back in the day. But alas, BSD
was allowed to implement networking so "everything is a file" became "networking
is a filehandle but there's no /dev/eth0 instead there's a magic syscall", and
the gnu clowns extended "make" in alarming ways because AT&T vs BSDi sidelined
both and the suits at Sun and AIX profoundly did not grok the scene...
*shrug* Water over the bridge. We wave bye-bye to the bridge as it breaks up and
washes downstream, and move on...
Rob