[Toybox] Imported toysh blog test cases into OSH

Sun Jun 28 23:06:18 PDT 2020

On 6/26/20 2:52 PM, Andy Chu wrote:
> (contributor from 4 years ago here, no longer on the list)
> 
> Hi Rob, I noticed that you've again been trying to figure out bash on
> the blog: http://landley.net/notes.html

If you mean I've made 60 commits to toys/pending/sh.c in the past year:

  $ git log 7fceed5f75..master toys/*/sh.c | grep '^commit' | wc -l
  60

Um, yes?

> I imported 28 shell snippets as test cases into my shell test framework:
> 
> http://travis-ci.oilshell.org/jobs/2020-06-26__18-40-53.wwz/_tmp/spec/survey/toysh.html
> (7 cases running against 3 shells)
> 
> http://travis-ci.oilshell.org/jobs/2020-06-26__18-40-53.wwz/_tmp/spec/survey/toysh-posix.html
> (21 cases running against 6 shells)
> 
> (these specific links will go away but they're always 2 clicks away
> from http://travis-ci.oilshell.org/jobs/  -- ovm- tarball  -> spec
> tests )

I've been adding a bunch myself:

  https://github.com/landley/toybox/blob/master/tests/sh.test

Although the past few days A) I've been distracted with other things, B) I've
been working on the actual code to implement stuff:

  https://github.com/landley/toybox/blob/master/toys/pending/sh.c

$ git diff toys/*/sh.c | diffstat
 sh.c |  426 +++++++++++++++++++++++++++++++++++++++++++++----------------------
 1 file changed, 290 insertions(+), 136 deletions(-)

Trying to get to a good stopping point to check the next round in...

(Speaking of distracted, my laptop BIOS has decided to randomly unsuspend
itself: 100% of the time when it's plugged into wall current, and apparently now
also when it was in my backpack getting _very_ hot. So I need to close all the
open windows and completely power down to see if that fixes it. Today is another
"do that instead of banging on sh.c" day, it seems, given 8 desktops of open
windows and so many email windows like this one where I hit "reply" but have a
lot of typing left before hitting "send". They kind of accumulate...)

> Summary:
> 
> - OSH passes 22 cases, more than any other shell

I'm trying to implement a bash clone. I've been having long threads with the
bash maintainer about the behavior of his shell:

  http://lists.landley.net/pipermail/toybox-landley.net/2020-June/011868.html

I haven't been looking at a lot of other shells recently, because bash is the
important one.

> - The cases found 3 divergences/bugs in OSH, 2 unimplemented features,
> and 3 (documented) known differences

That's nice.

I eventually need to compare my shell's behavior against mksh because that's
what Android's currently using and I want to minimize friction there. But pretty
much bash, mksh, and posix are the three frames of reference I care about for
this project.

(And last month $DAYJOB had me look at a 20 year old shell implementation
uclinux was using, which turned out to be _craptacular_ and I _think_ is an
ancient fork of the minix shell from the early 1990s? Still, at 6 small C files
it has the advantage of brevity. And it supported nommu.)

> Results: https://github.com/oilshell/oil/issues/780
> 
> So thanks for posting these cases.

You're welcome. It's all public...

> For background, OSH is able to run many real bash scripts -- in fact
> Aboriginal Linux and other Linux distro scripts in January 2018 was a
> significant milestone.

I replaced aboriginal linux with mkroot a couple years ago now, and these days
the build script I'm using is merged into toybox:

  https://landley.net/toybox/faq.html#mkroot

Building a system in 250 lines of bash, still kinda proud of that.

> While I'm grateful for the test cases, reading the blog is a bit
> painful because it's clear you will never finish your shell with the
> current strategy (and this is an informed opinion, after doing it
> myself).

If you say so...

> I think you're less than 10% done with 3K lines of code.

Personally, I think I'm well over halfway. (I'm trying to get the whole thing
done in 3500 lines of C, although I think I'm going to cheat and not include the
command history editing plumbing in that.)

> I think you will agree with my assessment

I operate in a different frame of reference than you do. This seems to bother you?

> if you scroll through this issue:
> 
> https://github.com/oilshell/oil/issues/653

Let's see, wget https://github.com/akinomyoga/ble.sh and... yeah all those
"doctype" html tags will confuse the shell, yup. Ok, navigate to the site... 7
subdirectories, and a makefile. Something claiming to be a shell script has a
makefile.

*shrug* I'll add it to the todo heap along with the zsh test cases I haven't had
a chance to look at and so on, but... that's not a test case. That's a "package
that doesn't work with this yet for X number of reasons". And it's not a package
anybody other than you has expressed interest in yet.

Right now I'm trying to implement what the bash man page says to do. There's a
lot of man page to get through before I worry about other stuff.

> Your test cases are good, but there are so many features you haven't
> even begun to think about.

You know what I've thought about? That's an amazing skill, some dude named James
Randi was offering a million dollars for proof of that. (I think he retired though.)

I read all of the posix spec (albeit susv3, not susv4), read all of the bash man
page (although again, the cover-to-cover pass was a while ago), and read large
chunks of pdksh and a very old version of the minix shell. Back when I
maintained busybox I read through the 4 shells there (mostly because they were
broken)... I _first_ started writing my own shell from scratch back in 2006:

  https://git.busybox.net/busybox/commit/?id=02add9e53a24

But sure, I haven't begun to think about stuff. I'm flattered.

> A couple things I would suggest:
> 
> 1) Learn about grammars and parsing, and look at the POSIX shell
> grammar, which all shells obey.

I got the word parsing and line continuation plumbing in last year. It's already
doing all that part just fine. (It may need some regression testing with the
changes I've made more recently, but nothing fundamental that I'm aware of. I'm
happy with my approach.)

> There are a few test cases that are
> completely non-controversial and answered by POSIX that you were
> confused by -- e.g. test case #12 on for loop parsing, #5 on
> pipelines, #21 on pipelines all run under 6 shells.  It's POSIX, and
> shells follow the POSIX grammar.  POSIX is necessary but not
> sufficient (I would say it covers ~25% of bash's syntax).

I've been building a mental model. One reading of posix and one go through the
bash man page did not result in 100% retention of all aspects in said mental
model. I admit to being imperfect.

Heck, I'll go so far as "not actually very good at this, just persistent". You
want to demonstrate you're better than me, go for it. I'm unlikely to object.
(Also unlikely to care.)

> Parsing and grammars are a compact way of reasoning about corner cases

I'm trying to get a small and simple result, so I'm not just banging out code to
do the thing but trying to figure out the MINIMUM I need to do to accomplish it,
which is a harder problem than just implementing it.

If I wanted to abandon nommu support this would be SO MUCH EASIER. Even stuff
like $! involves state that needs to be marshalled to child processes, and I am
NOT comfortable sending across a struct because I don't want to open strange
segfault territory, so it gets turned to text and back like the rest...
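
A minimal sketch of the "turn it to text and back" idea, not the actual sh.c
plumbing: the parent hands one piece of state to a freshly exec'd child as a
string, and the child reads it back. Passing it through the environment is an
assumption made for illustration, as are the names run_child_with_state and
SKETCH_LAST_BG.

```shell
# Hypothetical helper: pass one value to a newly exec'd child as plain text.
# The variable name SKETCH_LAST_BG is invented for this sketch.
run_child_with_state() {
  # The prefix assignment exports the value only into the child's environment.
  SKETCH_LAST_BG="$1" sh -c 'echo "child sees last_bg=$SKETCH_LAST_BG"'
}

run_child_with_state 12345
```

Nothing but a string crosses the process boundary here, so there is no struct
layout for the two sides to disagree about.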

> -- more compact than C code.

"Yes, but what is it DOING?"

I've encountered two main types of programmers: ex-mathematicians, and would-be
plumbers. There are a very small number of people who can do both equally well,
but I _tried_ to hire Fabrice Bellard for a project and he was BUSY.

I got a math minor in college for the same reason an acrophobic person would go
skydiving: I refused to be beaten by it, but it never spoke to me. I'm a
mechanic. I can build a horrible clockwork abomination to do X, and then clean
it up to be less ugly. I've often described my coding style as "debugging an
empty screen". My first instinct is to pile up heuristics until they cover the
problem domain. It all boils down to a series of operations the hardware will be
performing on registers and memory locations and such anyway.

In college I had a comparative programming languages survey course with a
section on prolog, and the first time I sat down at a prolog interpreter and ran
my first ~10 line program it locked the interpreter in an endless CPU-eating
loop, which the professor had just spent three days explaining to us was
impossible. He looked at it and told me I had to understand how the prolog
interpreter was _implemented_ in order to avoid doing that.

I know my way around a city when I've run out of ways to get lost. I feel
comfortable with a tool when I've run out of ways to break it.

>  Run say
> 
> osh -n -c 'echo ${@@Q}'
> 
> to see that in action.

I can just go:

  $ xx(){ echo ${@@Q};}; xx a b c
  'a' 'b' 'c'

and that shows me what it's doing.
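
A slightly fuller version of the same check, as a sketch: wrapping the call in
bash -c keeps it runnable from a prompt that isn't bash. The helper name
quote_args is made up for illustration, and ${@@Q} assumes bash 4.4 or newer is
installed as "bash".

```shell
# Hypothetical helper: show how bash's @Q operator quotes each argument.
# The "_" fills $0 so the real arguments start at $1 inside bash -c.
quote_args() {
  bash -c 'echo "${@@Q}"' _ "$@"
}

quote_args a b c            # prints: 'a' 'b' 'c'
quote_args "two words" c    # prints: 'two words' 'c'
```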

The hard part's never been getting it to work. The hard part's figuring out what
it should do. Implementing a behavior is easy; figuring out what the behavior A)
is and B) means is the hard part.

Also "what bash does" and "what the man page documents" have always had gaps
between them. And I have wasted HOURS by testing "./sh -c blah" vs "sh -c blah"
(because it's easy to cursor up and add/remove the ./ to the test) and going
"but how could bash get something so simple wrong?" and then realizing "duh,
/bin/sh is the defective annoying shell"...
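
One way to catch that mix-up before it eats an afternoon is to ask what the
name actually resolves to. A sketch, assuming a readlink that supports -f (GNU
coreutils, busybox and toybox do); the helper name resolve_shell is invented
here.

```shell
# Hypothetical helper: print the real file behind a command name,
# following any symlinks along the way.
resolve_shell() {
  readlink -f "$(command -v "$1")"
}

resolve_shell sh
```

On Debian-derived systems this typically points at dash rather than bash.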

> 2) Use some of the test cases and framework I developed and automated:
> http://travis-ci.oilshell.org/jobs/2020-06-26__18-40-53.wwz/_tmp/spec/survey/osh.html

See "todo heap after zsh and mksh"...

> This test suite includes knowledge from other people,

... so does bash? So does toybox? Isn't that sort of the point of open source
projects?

> e.g. the author
> of what I believe is the biggest bash program in the world (ble.sh at
> 30K+ lines).  Another shell semantic for you to ponder:
> 
> https://github.com/oilshell/oil/issues/706#issuecomment-615578349
> 
> Summary: bash is inconsistent with itself regarding "unset".

It's inconsistent with itself regarding a large number of things, yes. That's
why Elliott told me to ask its maintainer questions, as I've been doing here.
(It's what all these test cases are about.)

I am not to the point of looking for more to do yet. I have not started the
$((1+2*3)) math parser yet. I haven't implemented case statements yet. I haven't
started array variables yet. I haven't implemented function() support yet (the
hooks are there but there's recursion and assignment lifetimes and aliases and
marshalling function definitions into nommu subshells which bash has version
skew about by the way...). I've done maybe the first 1/3 of job control. I'm hip
deep in filling out variable resolution transformations.

  $ grep TODO toys/*/sh.c | wc -l
  73
  $ wc -l sh.todo
  842

I'm good for now, thanks.

> After
> much analysis, OSH chose a simpler behavior that can run the biggest
> bash programs in the world.

Good luck? I'd rather test the scripts directly...

(I'm familiar with the "I already reinvented this wheel, therefore you
shouldn't" impulse, but the only _polite_ approach I've found is all carrot, no
stick.)

> Andy

Rob