[Toybox] Would someone please explain what bash is doing here?

Mon Mar 9 13:10:33 PDT 2020

On 3/8/20 2:57 PM, Chet Ramey wrote:
> Remember the brouhaha (this was at least 15 years ago) about the standard
> saying that `set -e' only applied to simple commands and bash having the
> audacity to implement what the standard said? Good times.

Bit too inside baseball for me. Computer history is a hobby of mine, ala
https://landley.net/history/mirror, but my dreams of writing a book on the
subject got kinda back-burnered years ago because my todo list runneth over.

Instead I'm trying to hijack Android and turn it into a self-hosting development
environment so the PC can go the way of the mainframe and minicomputer before
it, as I explained on https://landley.net/toybox/about.html . This spun off of
my previous work creating the simplest possible Linux system that could rebuild
itself under itself, first at https://landley.net/aboriginal/about.html (7
packages) and now I've got it down to about 250 lines of Shell script ala
https://github.com/landley/toybox/blob/master/scripts/mkroot.sh

I want to convince Android to create a "posix container" based on "mkroot plus
stuff" within which AOSP image builds can run to completion on arbitrary android
devices, but first I need to get Linux From Scratch building under that again as
a proof of concept. (In Aboriginal Linux I had LFS building automatically on
multiple architectures starting from nothing but linux, busybox, uClibc, gcc,
binutils, make, and bash; the automation infrastructure was
http://landley.net/aboriginal/control-images/.) Ideally I can get that down to 4
packages if I finish the toybox roadmap (linux, qcc, musl, and toybox), but the
nearer-term goal is getting the android NDK to act as a full system build
compiler. (Elliott is the Android base OS layer maintainer, he doesn't maintain
the NDK but I've made a lot of puppy eyes at him over the past couple years to
get toybox to build reasonably well with it.)

The compiler I _want_ to use for a fully auditable base system that can defeat
Ken Thompson's trusting trust attack (https://arxiv.org/abs/1004.5534) is QCC,
which would combine Fabrice Bellard's tcc with qemu's TCG to produce qcc (ala
https://elinux.org/CELF_Project_Proposal/Combine_tcg_with_tcc and
http://landley.net/code/qcc/). I abandoned it for a while because cfront was
just too stale and jumping from C to C++ without that wasn't feasible, but then
I found https://github.com/JuliaComputing/llvm-cbe so it's back on...

That's what all this shell nonsense is for. The toybox about page I linked to
above has a video from 2013 where I laid out the whole scheme in an ELC talk...

(That's without even opening the https://j-core.org can of worms.)

>> The bash man page defines "IFS whitespace" as different from unicode whitespace.
>> (Space, tab, and newline only. Mine will in theory take the non-blank ogham
>> whitespace, although I haven't added that to tests/sh.test yet. :)
> 
> Bash will, too. If you want to put a non-breaking space into $IFS, it will
> be happy to split words on it. The business about "IFS whitespace" being
> space/tab/newline is to reconcile differences between historical behaviors
> that date back to an ASCII-only world. You have to live with those.

Mine is collating the other (nonbreakable) whitespaces. Haven't checked what
bash is doing, just noted that the man page makes a gratuitous distinction there.
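
For reference, here is a minimal sketch of where the "IFS whitespace"
distinction shows up in practice, using plain ASCII so it doesn't depend on the
locale. It assumes standard POSIX field splitting as bash implements it;
count_fields is a made-up helper for illustration, not something from bash or
toybox, and this says nothing about how either shell handles unicode whitespace.

```shell
# count_fields IFS_VALUE STRING
# Hypothetical helper: prints how many fields STRING splits into when IFS is
# set to IFS_VALUE. The function body runs in a subshell so the IFS and
# "set -f" changes don't leak into the caller.
count_fields() (
  IFS=$1
  set -f        # disable globbing so only word splitting is being measured
  set -- $2     # unquoted on purpose: this expansion is where splitting happens
  echo "$#"
)

# A run of IFS whitespace collapses into a single delimiter.
count_fields ' ' 'a  b'     # prints 2

# Each non-whitespace IFS character delimits a field by itself, so two
# adjacent colons produce an empty field between them.
count_fields ':' 'a::b'     # prints 3

# IFS whitespace next to a non-whitespace delimiter is absorbed into it.
count_fields ' :' 'a : b'   # prints 2
```

So the distinction isn't about what counts as whitespace typographically, it's
about which IFS characters collapse when repeated and which don't.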

>> Posix is in there, but what the linux command line in my host distro does is at
>> least as important. 
> 
> How do you reconcile the differences when bash and dash (as /bin/sh) do
> different things? Dash is most definitely a posix-and-little-else shell.

Ah, the Defective Annoying SHell. I have OPINIONS there.

  https://landley.net/notes-2010.html#28-10-2010

I've been railing against that since it went in.

In 1991 Linus Torvalds created Linux by extending a terminal program he'd
written to run Bash. Linus's term program booted from floppy disk because
minix's microkernel interrupt handling requiring two context switches for each
character couldn't keep up with a 2400 baud modem. He then taught it to read and
write the minix filesystem so he could upload/download files to his university
microvax (his usenet connection), then he wanted to be able to rm/mv/mkdir
without rebooting into minix so he implemented the system calls necessary to run
bash. First by reading the sun workstation man pages in the university library
and implementing those, then with a printk() to say what unimplemented system
call it had tried to run next. And then when he'd done that, he noticed he was
only a few system calls short of running gcc and making the thing self-hosting.
He informed the comp.os.minix community which had been maintaining a huge patch
stack to turn Minix into a real system (which Tanenbaum would never merge
because he wanted a simple teaching tool and had sold the rights to a textbook
company), and they ported their work over to his new kernel and he MERGED them
(which was jaw-on-floor stunning to them), and suddenly Linux inherited a large
active development community and went from 0.01 to 0.95 in a couple months,
and when Tanenbaum came back from summer vacation he kicked them off
comp.os.minix so they got their own list:

  http://landley.net/history/mirror/linux/1991.html
  http://landley.net/history/mirror/linux/1992.html

And had the "tanenbaum-torvalds debate":

  https://www.oreilly.com/openbook/opensources/book/appa.html

and the rest is history. (This is all in his autobiography "Just For Fun".)

I.E. bash was the first program Linux ever ran, and was THE shell of every Linux
distribution for the first 15 years of Linux, and nobody QUESTIONED this until
Ubuntu screwed up in 2006.

Right about when I was first doing serious study of how shells work
(http://landley.net/notes-2006.html#26-11-2006) Ubuntu decided that its init
scripts were running too slowly, but that changing the #!/bin/sh at the start of
each init script was too intrusive a change. No really, they DOCUMENTED THIS:

  https://wiki.ubuntu.com/DashAsBinSh

So they changed the #!/bin/sh symlink to point to a shell that broke the
kernel build, got tty handling wrong (fork a background task with &, ctrl-c
killed it), and so on. It was a HORRIBLE MISTAKE, and worst of all it didn't fix
the problem that had caused them to do it in the first place, their init scripts
STILL ran too slowly, so they had to parallelize the build system by creating
"upstart" shortly afterwards.

And they NEVER ADMITTED THEY MADE A MISTAKE. Instead what they did do is make
every account login shell default to #!/bin/bash so you only ever saw the
Defective Annoying SHell if your script said #!/bin/sh and you hadn't manually
fixed the symlink yourself. So everybody made their shell say #!/bin/bash at the
top because /bin/sh is broken on ubuntu systems. (And on debian systems, which
he pushed the change upstream into. Debian had nearly died due to its flamewars
escalating to the point where they couldn't get a release out for almost 5 years
until Canonical hired a couple full-time engineers to work on Debian and get
the engineering backlog under control because the distro Ubuntu was based on
dying out from under it was bad PR. During "debian stale" a bunch of Debian
developers fled the flamewars to Gentoo, where they outnumbered the original
Gentoo devs about 4 to 1 and suddenly it was flamewars as far as the eye could
see _there_ for a year or so, and although they went back to Debian as soon as
Canonical fixed it, this severely damaged the Gentoo development community in
ways it STILL hasn't fully recovered from. More or less a lost decade.)

No, I don't care what dash does. It can go hang.

> Chet

Rob