[Toybox] Thoughts on separating shell dependencies and MAYFORK commands?

Tue Feb 20 17:41:56 PST 2024

On Tue, Feb 20, 2024 at 10:41 AM Rob Landley <rob at landley.net> wrote:
>
> On 2/19/24 13:40, Oliver Webb via Toybox wrote:
> > When doing "make sh", scripts/single.sh looks for MAYFORK commands to pull in as builtins,
> > which means any command that is declared with MAYFORK is automatically included in the shell when doing "make sh".
> > TOYFLAG_MAYFORK is essentially "if we are calling this in the shell, don't fork/exec to save system resources"
>
> We already do recursively run commands in the same process. Any command can be
> called recursively when you don't disable that in the config.
>
> As long as you don't set CONFIG_TOYBOX_NORECURSE=y then any call to xexec() can
> recursively call toy_init() again and recursively call a new command_main
> out of the toy_init dispatch table within the same process, and stuff like
> xexec() and xrun() do that automatically. (It measures the stack to see if it's
> recursed too far, so "chroot env ionice linux32 nice nohup nsenter..." will
> eventually throw an exec() in there when the stack measuring logic in
> toy_exec_which() in main.c says we've come too far, currently 24k.)
>
> The SHELL won't do it, because the shell cares fairly deeply about what is and
> isn't a child process. In fact pipelines have implicit ( ) around each entry
> because each one is a subshell to avoid pipelines blocking on full pipe buffers.
> (And yes that includes the last one in the list, for consistency. And thus "echo
> | x=47; echo $x" isn't going to remember the 47, same as bash.)
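
A minimal sketch of the pipeline point above, assuming a shell that runs every pipeline stage in a subshell (bash with its default settings, or dash). The variable y is only there for contrast and is not from the original example. Shells that run the last stage in the current process (zsh, ksh, or bash with "shopt -s lastpipe" and job control off) will keep the 47.

```shell
# Assignment as the last stage of a pipeline: it runs in a subshell,
# so the parent shell never sees it.
unset x
echo | x=47
echo "after pipeline: x='${x}'"

# Same assignment outside a pipeline, for contrast (y is illustrative).
unset y
y=47
echo "after plain assignment: y='${y}'"
```

Under bash's defaults the first line reports an empty x and the second reports 47.
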
>
> But on an mmu system, fork() is like 5% of the expense and exec() is 95%.

(in case you ever get quoted out of context, i'll add "for a process
like toybox that doesn't have threads, large numbers of VMAs, or large
numbers of open fds" here. because those things do make fork()
expensive in large systems. but, no, not for toybox.)

> I
> actually benched this at one point, and although it was "oh goddess I'm old" many
> years ago and may have changed, the underlying design idea is if you just copy
> the memory mappings then everything is still cache hot and copy on write, so
> there isn't that much actual work that gets done, just copying some top-level
> metadata and incrementing reference counts. In fact linux famously forked()
> faster than slowaris could create threads, and that was back in 1996:
>
> https://landley.net/history/mirror/linux/kissedagirl.html
>
> And that was _before_ the O(1) fork and exit scalability work Ingo Molnar
> did around 2002:
>
> https://lwn.net/Articles/8000/
>
> No, fork() should be fast, it's exec that takes way more time loading/parsing
> file data, traversing ELF tables, setting up and populating memory mappings...
> (And then doing it multiple _more_ times for shared libraries: static linking
> $PATH sped up builds 20% in my testing. Again, a while ago, and part of that may
> have been a qemu dyngen artifact in my old test environment...)
>
> Both NOFORK and MAYFORK are for the shell. NOFORK means it can ONLY run in the
> shell's PID called as a function and with access to the shell's data structures.
> MAYFORK means it can run as its own process or within the shell (if toys.rebound
> is NULL we're standalone, if not we're running in the shell).
>
> NOFORK means:
>
> 1) Don't show up in the "toybox" command list, so install doesn't create a
> symlink for it. (Having "cd" or "export" as a standalone command would be
> pointless.) Conversely, ONLY show the nofork and mayfork commands when "help" is
> run with no arguments within the shell.
>
> 2) If you don't fork() and the command gets called as a function from within the
> shell's PID, then A) you can't ctrl-z suspend it, B) it has to clean up all its
> memory and filehandles EVEN IN ERROR PATHS or else it'll leak resources over
> time in shell scripts.
>
> Generally MAYFORK commands are commands available from the $PATH, but which ALSO
> have extra features when called from within the shell. For example, /bin/kill
> exists but "kill %1" only makes sense when you have access to the shell's job
> control structures to know what jobspec %1 is. So calling it as a builtin
> behaves slightly differently than calling it standalone.
>
> I haven't worked out how to make the two cases show different help text yet, but
> it's on the todo heap as part of the help plumbing and kconfig redo that needs
> to happen at some point.
>
> > There is a pretty large distinction between "I'd like this to be automatically
> > put in the shell when doing 'make sh'"
> > and "I'd like to have this be used by the shell instead of forking if it's
> > in the same toybox binary as it"
>
> I'm not seeing the importance of the distinction. Commands annotated like that
> have _extra_behavior_, it's just not a performance thing. They can do things
> they couldn't do if they didn't have access to the shell's data structures.
>
> There are a couple exceptions, like true/false which are SO cheap that the extra
> overhead from fork() is noticeable in "while true; do blah; done" loops. And
> "echo" is historically a bash builtin so a standalone shell on a system with no
> $PATH might want that. But as you've pointed out, even cat didn't get that
> annotation.

(fwiw, mksh _does_ have a builtin cat. that's something i disabled on
Android because it was confusing to users. for "fully toybox" systems
though, with just one binary and loads of symlinks, you wouldn't be
able to tell the difference as easily. though iirc it was weird
signal-related behavior that was the first time someone noticed cat
was a builtin?)

> (But mostly I didn't have to do any extra work to make sure they
> didn't leak resources in any of their error paths, so it was really cheap to
> slap MAYFORK on them. Even "cat" needs to make sure the input filehandle gets
> closed when do_cat() calls xputc() and it notices that stdout is a closed pipe
> and calls longjmp() from xexit() without ever returning. It would need to write
> the filehandle into GLOBALS() with a sigatexit() error handler that closed it,
> adding cleanup code with a nonzero size.)
>
> > Commands like cat, ls, mv, cp, du, find, rm, etc would benefit by being MAYFORK commands,
>
> Not really, once you're traversing directory entries the overhead of fork() is
> pretty thoroughly amortized.
>
> > But it also would not make sense to automatically include them into a single
> > command binary of the shell.
>
> You can include or exclude arbitrary binaries with "make menuconfig", except for
> the toysh builtins in sh.c which have USE_SH() around their OLDTOY() macros
> because being able to chop "exit", "source", or "exec" out of the shell is a
> fairly major API change.
>
> (One big design difference between busybox and toybox is I decided "how does the
> toybox command $BLAH behave" should have a consistent answer.)
>
> > The solution to this, that would give a multicommand binary with a shell the
> > ability to run faster by not forking off
> > processes.
>
> Seriously, benchmark it. The expense isn't clone(2), it's execve(2). (Now maybe
> selinux nonsense makes that go weird, couldn't say...)
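
One rough way to try the comparison yourself, as a sketch rather than a rigorous benchmark. The function names, the N variable, and the guess that an external "true" lives in /bin or /usr/bin are all assumptions made for illustration. The "time" keyword here is bash's; other shells may need their own timing method. "( true )" forks a subshell without exec in bash and dash, while the external binary costs a fork plus an exec.

```shell
# Assumption: an external "true" binary exists at one of these paths.
EXT_TRUE=/bin/true
[ -x "$EXT_TRUE" ] || EXT_TRUE=/usr/bin/true

# No fork, no exec: the shell's own builtin.
run_builtin() {
  rb_i=0
  while [ "$rb_i" -lt "$1" ]; do true; rb_i=$((rb_i + 1)); done
}

# Fork only: a subshell running the builtin.
run_fork() {
  rf_i=0
  while [ "$rf_i" -lt "$1" ]; do ( true ); rf_i=$((rf_i + 1)); done
}

# Fork plus exec: the external binary.
run_forkexec() {
  re_i=0
  while [ "$re_i" -lt "$1" ]; do "$EXT_TRUE"; re_i=$((re_i + 1)); done
}

N=${N:-200}
for variant in run_builtin run_fork run_forkexec
do
  echo "== $variant x $N"
  time "$variant" "$N"
done
```

The numbers will vary a lot with libc, dynamic vs static linking, and whatever security policy sits on execve(), so treat the ratio between run_fork and run_forkexec as the interesting part, not the absolute times.
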

(as a libc maintainer, "the expense is all in your ELF constructors"
:-) last i looked, most of the "wasted" time in a trivial toybox
invocation on Android is some front-loaded work in the networking code
that makes sense for the zygote but would be unfortunate for someone
running lots of random unix commands!)

> > And a single command binary of the shell the ability to have commands like ':'
> > without pulling in things like
> > find or rm,
>
> You can do that now? You're talking about reclaiming the current state?
>
> I note that "make sh" is calling scripts/single.sh which creates a temporary
> .config with sed trickery that has "sh" special cased:
>
>   if [ "$i" == sh ]
>   then
>     DEPENDS="$($SED -n 's/USE_\([^(]*\)(...TOY([^,]*,.*TOYFLAG_MAYFORK.*/\1/p' toys/*/*.c)"
>   else
>     MPDEL='s/CONFIG_TOYBOX=y/# CONFIG_TOYBOX is not set/;t'
>   fi
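
To see what that sed expression extracts, here is a small sketch that runs it over a few sample declarations instead of toys/*/*.c. The three input lines are illustrative, modeled on the USE_*(NEWTOY(...)) pattern and not copied from the tree, and the helper names sample_decls and mayfork_symbols are made up for this example.

```shell
# Illustrative declarations (assumed shapes, not verbatim toybox source).
sample_decls() {
  cat <<'EOF'
USE_TRUE(NEWTOY(true, NULL, TOYFLAG_BIN|TOYFLAG_NOHELP|TOYFLAG_MAYFORK))
USE_TTY(NEWTOY(tty, "s", TOYFLAG_USR|TOYFLAG_BIN))
USE_ECHO(NEWTOY(echo, "^?Een", TOYFLAG_BIN|TOYFLAG_MAYFORK))
EOF
}

# Same expression as in scripts/single.sh: print the config symbol
# (the part after USE_) for each line whose flags include TOYFLAG_MAYFORK.
mayfork_symbols() {
  sed -n 's/USE_\([^(]*\)(...TOY([^,]*,.*TOYFLAG_MAYFORK.*/\1/p'
}

sample_decls | mayfork_symbols
```

With these inputs it prints TRUE and ECHO, one per line, and skips TTY because that line has no TOYFLAG_MAYFORK.
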
>
> Note: sh is the only standalone command with the multiplexer enabled, which
> means it cares about its command name, which means:
>
> $ make sh
> ...
> $ mv sh toybox
> $ ./toybox
> [ bash echo false help kill printf pwd sh test time toysh true ts
>
> Whereas for all the other standalone commands:
>
> $ make tty
> ...
> $ mv tty toybox
> $ ./toybox
> /dev/pts/242
>
> > would be to create 2 flags with the same value, and only scan for one in scripts/single.sh.
> > I.e. changing the existing MAYFORK declarations to something else (TOYFLAG_SHELLDEP, maybe TOYFLAG_BUILTIN?).
> > Scanning for _that_ instead of MAYFORK in scripts/single.sh, and adding a declaration of it in lib/toyflag.h.
> >
> > Thoughts? I already have this working, there isn't any build infrastructure I know of that breaks when you do this,
> > and the only reason I am not sending a patch yet is because I dunno an actual good name for the flag
>
> There's some backstory here:
>
> http://lists.busybox.net/pipermail/busybox/2006-February/052626.html
> http://lists.busybox.net/pipermail/busybox/2006-March/053332.html
> http://lists.busybox.net/pipermail/busybox/2006-May/055203.html
> http://lists.busybox.net/pipermail/busybox/2009-January/068150.html
>
> re: that last one, kerneltrap went down but
> https://web.archive.org/web/20090615000000*/http://kerneltrap.org/node/517 has it.
>
> Rob