[Toybox] Shell Compatibility Reports from Oils - ~800 tests passing
Rob Landley
rob at landley.net
Sun Jun 29 15:18:48 PDT 2025
On 6/28/25 23:18, Andy Chu wrote:
> Hm I looked at the goals of toybox again:
>
>> Toybox's main goal is to make Android self-hosting by improving Android's command line utilities so it can build an installable Android Open Source Project image entirely from source under a stock Android system.
>
>> Toybox aims to provide one quarter of a theoretical "minimal native development environment"
>
>> In theory, this should only require four packages
>
> I don't know much about Android -- is this at all realistic for FIVE
> packages -- if you add mksh, which I believe is the Android system
> shell ?
Eh, define realistic. AOSP is built around git (kind of conceptually),
and their build infrastructure uses python 3. So you'd build a system to
build a system.
Aboriginal Linux had 7 packages: linux, busybox, uclibc, gcc, binutils,
make and bash, and could build Linux From Scratch under the result in a
fully automated target-independent fashion using
https://landley.net/aboriginal/control-images/
Ok,
https://github.com/landley/control-images/tree/master/images/lfs-bootstrap/mnt
cheated slightly with one extra package, as
https://github.com/landley/control-images/blob/master/images/lfs-bootstrap/download.sh
attests. But https://landley.net/aboriginal/mirror/gettext-stub-1.tar.gz
was a tiny little thing to stub out some gnu/stupid, stub versions of a
dozen internationalization functions that all either returned their
first argument, NULL, or "C". The header could have been a here document
and then an empty .a file to satisfy gnu builds that insisted on pulling
in the library.
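
A minimal sketch of what such a stub could look like, not the contents of the actual gettext-stub tarball. Which function returns which of the three values (first argument, NULL, or "C") is an assumption here. The names and signatures are the standard libintl ones.

```c
/* Hypothetical stand-in for libintl: every call is a no-op. */
#include <stddef.h>

/* Translation lookups hand back the untranslated message unchanged. */
char *gettext(const char *msgid)
{
  return (char *)msgid;
}

char *dgettext(const char *domain, const char *msgid)
{
  (void)domain;
  return (char *)msgid;
}

char *dcgettext(const char *domain, const char *msgid, int category)
{
  (void)domain;
  (void)category;
  return (char *)msgid;
}

/* Assumption: domain queries answer "C", as if no locale were configured. */
char *textdomain(const char *domain)
{
  (void)domain;
  return (char *)"C";
}

/* Assumption: there is no message catalog directory to report. */
char *bindtextdomain(const char *domain, const char *dirname)
{
  (void)domain;
  (void)dirname;
  return NULL;
}
```

A package that insists on linking against the library then gets its strings back untranslated, which is all a build environment needs.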
As for getting mkroot to do what aboriginal linux used to, I have no
interest in testing mksh beyond not breaking Android's use of the toybox
test suite (which runs it under mksh).
The AOSP build is large and has a lot of other dependencies, but
Elliott's been doing what he calls "hermetic builds" where AOSP tries to
provide a lot of its build prerequisites as shipped binaries, and Toybox
provides a lot of those. (Search for the word "hermetic" in toybox's
news.html page, it's been mentioned with links a few times.)
The https://landley.net/toybox/roadmap.html#dev_env section of the
toybox roadmap is my old dependency list that Aboriginal Linux needed to
rebuild itself under itself, and then build Linux From Scratch under the
result. But it's been a moving target. I regression test kernel builds
with mkroot each release. It uses the same "airlock step" that
aboriginal had, where the build $PATH is replaced with a single
directory with all the binaries the build needs before building the
packages:
https://github.com/landley/toybox/blob/master/mkroot/mkroot.sh#L54
The airlock is mostly set up by toybox's "make install_airlock" target
which uses a PENDING and TOOLCHAIN command list, the first being
commands that toybox should eventually provide (but doesn't yet) and the
second being commands the host needs to provide (mostly the compiler):
https://github.com/landley/toybox/blob/master/scripts/install.sh#L105
Currently PENDING has: expr git tr bash sh gzip awk bison flex make ar
(All but bison, flex, and make have semi-complete "pending" versions in
toybox.)
And TOOLCHAIN has: as cc ld objdump bc gcc
And the last two of those I have patches to remove the need for from the
kernel build,
https://landley.net/bin/mkroot/0.8.12/linux-patches/0004-Replace-timeconst.bc-with-mktimeconst.c.patch
and
https://landley.net/bin/mkroot/0.8.12/linux-patches/0001-try-generic-compiler-name-cc-before-falling-back-to-.patch
respectively.
My tool to instrument a build so I can see every command line called out
of the $PATH is currently mkroot/record-commands (which builds
toys/example/logpath.c), and descends from the "command logging wrapper"
described in https://landley.net/aboriginal/FAQ.html#debug_logging
(This doesn't catch the ones called from absolute paths, usually by
scripts with #!/usr/blah at the start. Also gmake will call /bin/sh
(instead of sh out of the $PATH) unless you set SHELL, see
https://www.gnu.org/software/make/manual/make.html#Choosing-the-Shell
for the gnu/stupid du jour.)
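
The two halves of a logging wrapper like this can be sketched in C as follows. This is an illustration under assumptions, not the code in toys/example/logpath.c: the function names find_in_path() and log_command() are hypothetical. The idea is to append the command line to a log, then look up the real binary in the original $PATH so it can be exec'd. Anything invoked by absolute path never reaches the wrapper, which is the blind spot mentioned above.

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Search the colon-separated list `path` for an executable regular file
 * called `name`. Empty entries are skipped. On success the full path is
 * written to `out` and 1 is returned, otherwise 0. */
int find_in_path(const char *path, const char *name, char *out, size_t len)
{
  struct stat st;

  while (path && *path) {
    size_t dirlen = strcspn(path, ":");

    if (dirlen) {
      int n = snprintf(out, len, "%.*s/%s", (int)dirlen, path, name);

      if (n > 0 && (size_t)n < len && !stat(out, &st)
          && S_ISREG(st.st_mode) && !access(out, X_OK)) return 1;
    }
    path += dirlen;
    if (*path == ':') path++;
  }

  return 0;
}

/* Append one line per invocation: the command name and its arguments,
 * separated by spaces. Returns 0 on success, nonzero on failure. */
int log_command(const char *logfile, int argc, char **argv)
{
  FILE *fp = fopen(logfile, "a");
  int i;

  if (!fp) return -1;
  for (i = 0; i < argc; i++) fprintf(fp, "%s%s", i ? " " : "", argv[i]);
  fputc('\n', fp);

  return fclose(fp);
}
```

A wrapper's main() would call log_command() with its own argv, then find_in_path() on the saved original $PATH with its own name, and execv() the result.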
> Can Android even be built on Android at all, with any number of
> packages? e.g. if you download all the dev tools onto an Android
> device ... I imagine it is a ton of tools, and not very fun.
That's an Elliott question, and they did some sort of container
infrastructure (which may or may not be related to
https://www.youtube.com/watch?v=Eu-rqMHqM6I ) in newer versions of
Android than my phone runs, which can presumably install arbitrary linux
distros in either containers or VMs, so it's a semi-philosophical
question? (But the countering trusting trust stuff still applies.)
I've been working _towards_ it since 2011, but... let's just say the
past decade has not provided my ideal work environment.
> Anyway, if there is something realistic we could do here with OSH,
> that may be of interest to our funders http://nlnet.nl
https://github.com/landley/toybox/blob/master/toys/pending/sh.c has most
of the infrastructure in place already. If I wanted to use bash or mksh
in another aboriginal linux style LFS build setup, I could. (And Alpine
Linux exists, which benefited from all the busybox work I did back in
the day.)
> e.g. testing that important packages can actually be built, and
> reducing real failures to reproducible test cases. That is a lot of
> real work
Which Alpine has presumably done. I'm not trying to patch packages, I'm
using them as test cases. Which is how you wind up with stuff like:
https://github.com/landley/toybox/commit/32b3587af261
Which is CLEARLY THEIR BUG, yet we must cope.
> From some viewpoints it could be theoretical, but proving that you can
> build a real system is important!
I've done it. The old "lfs-bootstrap" images in
https://landley.net/aboriginal/downloads/old/binaries/1.4.1/extras/ were
"here's the linux from scratch 6.x root filesystem that built under qemu
from the minimal native development environment system image this release".
I got FANCY back then. If you're wondering why the (current) airlock
scripts detect multiple instances of the same command in the $PATH and
symlink them into numbered fallback directories, it's for things like
distcc, which the old scripts used to move the heavy lifting of
compilation out of the emulated environment to run on the host machine:
https://landley.net/notes-2008.html#07-06-2008
I probably blathered about that at Ottawa Linux Symposium:
https://bootlin.com/pub/video/2008/ols/ols2008-rob-landley-linux-compiler.ogg
Still on the TODO list for the new stuff. Back in the day I could get
about -j3 usefully going before the emulator became the bottleneck. Well,
using SMP for the actual compile part; the configure stage was 100% the
bottleneck in all the gnu package builds. Still is. More totally
unnecessary gnu/stupid: the compiler sets a zillion builtin macros you
can see with:
$ :|cc -dM -E -
And between that, __has_include() (standardized in c23, but gcc and
clang have had it for years), and features.h you can
eliminate almost all configure time probes because it ALREADY KNOWS.
Just set your cross compiler and let your headers pick through the
symbols to figure out what to do.
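
A small sketch of that approach, assuming a gcc- or clang-style compiler: the answers come from predefined macros and __has_include() at compile time instead of from configure probes. The function names and the choice of sys/random.h as the probed header are illustrative only.

```c
#include <limits.h>
#include <stddef.h>

/* __has_include can itself be detected with defined(), so compilers
 * without it fall through to "header not available". The check must be
 * nested: an unsupporting compiler cannot parse it on the same line. */
#if defined(__has_include)
# if __has_include(<sys/random.h>)
#  define HAVE_SYS_RANDOM_H 1
# endif
#endif
#ifndef HAVE_SYS_RANDOM_H
# define HAVE_SYS_RANDOM_H 0
#endif

/* The (cross) compiler already knows which OS it targets. */
const char *target_os(void)
{
#if defined(__linux__)
  return "linux";
#elif defined(__APPLE__)
  return "darwin";
#elif defined(__FreeBSD__)
  return "freebsd";
#else
  return "unknown";
#endif
}

/* Pointer width from the builtin macro, with a portable fallback. */
int target_pointer_bits(void)
{
#ifdef __SIZEOF_POINTER__
  return __SIZEOF_POINTER__ * CHAR_BIT;
#else
  return (int)(sizeof(void *) * CHAR_BIT);
#endif
}

/* Endianness from the builtin macros, with a runtime fallback. */
int target_is_big_endian(void)
{
#if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__)
  return __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__;
#else
  const unsigned short probe = 1;

  return !*(const unsigned char *)&probe;
#endif
}

int have_sys_random_h(void)
{
  return HAVE_SYS_RANDOM_H;
}
```

Because everything is decided by the preprocessor, pointing the build at a different cross compiler changes the answers without running anything on the target.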
One of my many todo items is re-testing whether running ./configure with
static linked binaries is still 20% faster under QEMU these days:
https://landley.net/notes-2009.html#14-10-2009
I _think_ that was back before PLT and GOT were collated into arrays,
meaning the dynamic references were patched in-situ instead of
redirecting off an object table, so QEMU had to re-translate each
executable page every time it was written to (self modifying code REALLY
fscks with dynamic translation) meaning the overhead of dynamic linking
patching all the jumps in place was just pathological. Then there was
that terrible RTLD_LAZY nonsense which SOMEHOW MADE IT WORSE, and of
course SOME linking variations would always indirect off the PLT/GOT and
others would patch the relocation into the caller as part of the first
call... I think -fPIC or not was involved here somehow? (PIE is SORT of
nice, but static PIE not using the dynamic linker but STILL DYNAMIC
LINKING ITSELF means it has to be STATICALLY LINKING THE RUNTIME DYNAMIC
LINKER and that's about where I step away from the keyboard.)
Don't ask me how using dlopen messes with any of that. Sigh, I keep
thinking Rich Felker's dlopen() rant is on https://ewontfix.com/
somewhere but no, it's buried in the musl openwall list which Google
can't find anymore since
https://www.wheresyoured.at/the-men-who-killed-google/
Anyway, it's been a while since I last seriously dug into linking,
because it's a can of worms.
(https://landley.net/bin/mkroot/0.8.11/linux-patches/0002-sh4-fdpic.patch
doesn't count because I actually needed it for something.)
Sigh, everything has so much backstory. QEMU having to translate pages
is a thing I blathered about back when I was trying to do a "qemu weekly
news", I explained how/why dyngen worked:
https://landley.net/qemu/2008-01-15.html#Jan_17,_2008_-_[PATCH_0_5]_Enable_building_of_op.o_on_gcc4
Right before it got ripped out and replaced:
https://landley.net/qemu/2008-01-29.html#Feb_1,_2008_-_TCG
But the general principles still apply. (SO MUCH of computer science is
"we learned how the principles worked from some old obsolete thing
that's been replaced, and the new one still works fundamentally the same
way but it's a lot more complicated so you can't actually SEE that
unless you understand where it came from". It's a pedagogical disaster
leading to
https://www.landley.net/history/mirror/institutional_memory.html loss
and I dunno what to do about that, but what else is new?)
> Andy
Rob