[Toybox] Shell Compatibility Reports from Oils - ~800 tests passing

Fri Jun 27 09:05:21 PDT 2025

On 6/26/25 23:33, Andy Chu wrote:
> Hello!
> 
> I compared many shells for bash compatibility, including toysh:
> 
> https://pages.oils.pub/spec-compat/2025-06-26/renamed-tmp/spec/compat/TOP.html

Ooh, more tests! Hmmm, how would I reproduce that myself here...

https://github.com/oils-for-unix/oils/wiki/Spec-Tests

Not exactly a HOWTO.

> toysh passes 809 tests, out of ~2200:
> 
> https://pages.oils.pub/spec-compat/2025-06-26/renamed-tmp/spec/compat/PASSING.html

Currently toysh doesn't even pass its OWN test suite if you enable the 
$BROKEN tests, and my sh.txt file has another 400 lines of tests I need 
to add to that. (I should break down and add command line editing and 
history so I can start on !1 and friends, but if the user interface is 
polished people assume the engine is done, and it's very much not yet.)

I'm working on it. :)

> What's the motivation for this?  Well, two months ago, I thought that
> toysh and OSH (part of https://oils.pub/ ) were the only projects
> aiming for bash compatibility.

I guarantee you some rust zealot out there is doing a rust 
implementation for literally no other reason than to write it in rust.

> But then I learned there are TWO shells in Rust aiming at bash
> compatibility, both started in 2022 - in this thread
> https://news.ycombinator.com/item?id=43908368

Yup, there you go. Two of them.

> So I thought you may be interested in this.

Not really. I've never met a rust developer who had any argument for 
rust other than "I hate C" and "writing more C is a SIN you HEATHEN".

Which doesn't explain why to use _that_ language for kernels instead of 
go, swift, zig... And that's kernels, why not do userspace in any of the 
bytecode languages? (I personally like Lua.)

Bash exists. I'm trying to do a self-contained project so I can get a 
minimal system down to 4 packages (compiler, kernel, libc, cmdline) so 
you can build something like tinycc+linux+musl+toybox and then build 
Linux From Scratch under the result.

http://lists.landley.net/pipermail/toybox-landley.net/2020-July/011898.html

I've been working on that general idea for quite a while. My first go at 
it (that got me into busybox development in the first place) replaced 20 
packages with busybox:

https://landley.net/aboriginal/old/

And on my second attempt I got the whole self-bootstrapping system 
capable of building Linux From Scratch and Beyond Linux From Scratch 
under itself down to 7 packages (linux, busybox, uClibc, gcc, binutils, 
gmake, bash):

https://landley.net/aboriginal/about.html#design

(From which things like Alpine could build a busybox-based distro you 
could do real work in.)

If I'm going to get it down to 4 packages (with no other external 
dependencies, not even zlib or ncurses), then such a command line needs 
a shell, and if it's going to have a shell then bash is the logical 
model for that shell. But note how the logic starts with "toybox needs a 
shell" not "there's something wrong with the existing bash".

> A few years ago I transcribed some tests from toybox, so thanks for that:
> 
> https://pages.oils.pub/spec-compat/2025-06-26/renamed-tmp/spec/compat/toysh.html
> 
> https://pages.oils.pub/spec-compat/2025-06-26/renamed-tmp/spec/compat/toysh-posix.html

You're welcome. Glad you found them useful. :)

> ---
> 
> I've contacted the authors of the other shells as well -- there seems
> to be a lot of duplicated effort!

https://xkcd.com/927/

I've personally reimplemented basically the entire Linux command line 
TWICE. (When I left busybox I'd written about 1/3 of it.)

Duplicated effort is a FEATURE of Linux, that's why it made open source 
development work better than other projects. (I gave a talk on this at 
Flourish in 2010, "the prototype and the fan club".)

https://landley.net/talks/flourish-2010.txt

Linux being modular (interchangeable parts available from multiple 
sources, like old pre-laptop white box PCs) was the main reason* Linux 
beat out BSD. BSD was a big monolith that had userspace in the same SVN 
repo as the kernel. So you could mix and match parts in Linux in a way 
that BSD strongly resisted. When Linux switched from libc5->libc6 that 
was a distro choice. Busybox couldn't have happened under BSD, because 
you'd have to fork the kernel in order to replace "cat".

* Well, after about 1995. Before that BSD required a hardware FPU and 
Linux didn't so a lot of cheap PC hardware was unable to run BSD during 
the big rush when the NSF changed the internet AUP in 1993 to allow 
for-profit ISPs. Nobody had a budget for "web server" so they fished old 
discarded 386 PCs out of closets to be web servers with Linux+Apache, 
giving Linux a critical mass of users. Before that there was about 5 
years of the legal uncertainty around AT&T vs BSDi that gave Linux its 
first foot in the door. And of course before the internet properly hit 
critical mass BSD got harvested for talent to work on proprietary forks 
every 5 years (Sun hiring away Bill Joy, Bill Jolitz's work being taken 
away from him, Jordan Hubbard getting hired away from FreeBSD to work on 
MacOSX) costing it a LOT of momentum, something back in the day GPLv2 
did prevent. (But note how that harvesting/forking STOPPED when "my open 
source portfolio work is more important than my resume" turned into a 
thing around y2k, and it never really applied to Apache which was 
non-copyleft licensed all along, because its community was on the 
internet rather than usenet and cutting yourself off from that for a job 
wasn't the cultural norm. Part of the reason for such huge pushback 
against GPLv3 is that even GPLv2 mattered a lot less by that point, 
because license aside you'd never convince somebody like Alan Cox to cut 
himself off from the community to go work on a proprietary fork. The 
internet was not usenet.)

> The author of brush has expressed interest in using these spec tests,
> so I am thinking of turning into something more "cross project"

I'd be happy to have more tests, and we've needed a good test suite. 
You'll note that mine has "TEST_HOST=1 make test_sh" where bash passes 
all the tests.

That said, "I have _chosen_ not to pass this test" is a thing that comes 
up with external test suites, which I never quite know how to handle...

> I would even call it "Bashix" -- a superset of POSIX, that multiple
> shells could agree on -- so that users have a stable and well-defined
> language to write.

You may have seen the long threads on here with Chet Ramey, the bash 
maintainer. I've also had discussions about standards with Elliott 
Hughes, the Android base OS maintainer.

Alas, neither of us wants to maintain a standard because it's a TON of work.

> I've noticed for a long time that there is a LOT of behavior that
> multiple shells agree on, e.g. assignment builtins, other builtins,
> and arrays, that is not in POSIX.

See "why not just use posix for everything" in:

https://landley.net/toybox/roadmap.html#susv5

Posix is weirdly a good standard (or at least good at what it does) 
because it tries to be a minimal subset everyone can agree on. Which 
means it had holes big enough to drive Windows NT and OS/360 through. 
But even then: I did not implement sccs, ed, SUSv4's dd ebcdic 
conversions... Lots of my commands have "deviations from posix" sections 
in the source file's header comment block, because a plan is a frame of 
reference to diverge from.

> I'm not sure how much time / motivation various authors have, but it
> would probably be useful to coordinate.   e.g. I notice many questions
> about alias on the blog.

Because it's _INSANE_.

> OSH has mostly figured those things out,

Oh so have I now, it was just painful getting there.

And I still have a corner case I don't do right. I am WAY behind editing 
and posting my blog, but spoilers for June 21:

$ alias ee='echo ' def='abc xyz ' xyz='abc '
$ ee def xyz
abc xyz abc

That's just ANNOYING. (And no, it's not the recursion guard. The 
trailing space to retry thing seems to only apply at the top level AND 
it tracks when it's consumed all the input string provided BY the top 
level, even when the length of string is modified by further alias 
substitutions. I'm doing the string substitution and the parse loop 
non-recursively because I'm trying to support nommu with potentially 
very limited stack depth, they're probably doing something recursive.)
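
A minimal sketch for reproducing that outside an interactive prompt. The 
alias_demo wrapper name is made up for illustration, and it assumes a 
bash binary in $PATH. Non-interactive bash only expands aliases after 
"shopt -s expand_aliases", and feeding the commands through stdin means 
each alias is defined before the line that uses it gets parsed:

```shell
# alias_demo is a hypothetical wrapper name, not part of toybox or bash.
alias_demo() {
  # Feed the commands on stdin so bash parses them one line at a time:
  # an alias only affects lines read after its definition has run.
  bash <<'EOF'
shopt -s expand_aliases
alias ee='echo ' def='abc xyz ' xyz='abc '
ee def xyz
EOF
}

alias_demo
```

On the bash shown in the transcript this should print the same 
"abc xyz abc" line: the xyz inside def's replacement text stays literal, 
while the xyz typed on the command line still gets expanded.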

> although now that I look, it
> doesn't quite match bash, although it is tied for the #2 in terms of
> tests passing:
> 
> https://pages.oils.pub/spec-compat/2025-06-26/renamed-tmp/spec/compat/alias.html
> 
> (I don't think bash is always right, but often it is.)

I'm usually interested in more tests. And I never claimed 100% bash 
conformance: even BASH doesn't have 100% bash conformance. (Half my 
arguments with Chet _cause_ version skew, which I see as a bug and he 
sees as a feature. I ask him to explain a corner case and he FIXES it, 
but then what do I put in the test suite if I want to pass TEST_HOST on 
"bash"? Do I have to check bash VERSIONS?)
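
One possible way to gate a test on the host version, sketched under the 
assumption that the test script can shell out to the host bash. The 
host_bash_at_least helper is a made-up name, not something in the toybox 
test suite:

```shell
# Hypothetical helper: succeed if the host bash is at least MAJOR.MINOR.
host_bash_at_least() {
  want_major=$1
  want_minor=$2
  # BASH_VERSINFO is a bash-only array, so ask the host bash itself
  # rather than assuming the calling shell is bash.
  have=$(bash -c 'echo "${BASH_VERSINFO[0]} ${BASH_VERSINFO[1]}"') || return 2
  have_major=${have% *}
  have_minor=${have#* }
  # Newer major version wins outright, otherwise compare minor versions.
  [ "$have_major" -gt "$want_major" ] ||
    { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

# Example use: only run a version-sensitive test on a new enough host.
if host_bash_at_least 5 2; then
  echo "host bash is 5.2 or newer"
else
  echo "host bash is older than 5.2"
fi
```

That only answers "which version is this", it doesn't answer which 
version's behavior the test suite should treat as correct.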

Why isn't bash CONSISTENT?

$ alias potato='abc\def'
$ potato='abc\def'
$ alias potato
alias potato='abc\def'
$ declare -p potato
declare -- potato="abc\\def"

Pick a quoting style! (And the new universal one is $'blah' but see the 
xkcd on standards again...)
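
The same comparison as a script, for anyone who wants to run it against 
their own bash. The quoting_demo wrapper name is hypothetical and a bash 
binary in $PATH is assumed:

```shell
# quoting_demo is a hypothetical wrapper name used only for this sketch.
quoting_demo() {
  bash <<'EOF'
alias potato='abc\def'
potato='abc\def'
alias potato        # prints the alias in single-quoted form
declare -p potato   # prints the variable double-quoted, backslash doubled
EOF
}

quoting_demo
```

Both lines describe the same seven-character string, abc\def, just 
serialized with two different quoting conventions.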

> Andy

Rob