[Toybox] awk (Re: ps down, top to go)

Andy Chu andychup at gmail.com
Mon May 30 01:23:22 PDT 2016


> I wasn't planning to use yacc.
>
>> * awk: All implementations except busybox awk use yacc (bottom up).
>
> I wasn't planning to use yacc here either.

That isn't relevant; the point is that awk and sh are suited to
opposite parsing algorithms.  If you don't know the difference between
top down and bottom up parsing, you will quickly run into it when you
try to write the parsers, and again if you try to share "parser
infrastructure".
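
For anyone who hasn't seen the distinction in code: below is a minimal
sketch of a top down (recursive descent) parser, written in Python for
brevity.  The toy arithmetic grammar and the names tokenize, Parser and
parse are made up for illustration; this is not code from any awk or
shell implementation.  Each grammar rule becomes one function that
calls the functions for its sub-rules, whereas yacc generates a
table-driven bottom up parser from the grammar.

```python
import re


def tokenize(text):
    # Split into integers and single-character operators/parentheses.
    # Anything else (e.g. whitespace) is silently skipped in this sketch.
    return re.findall(r"\d+|[-+*/()]", text)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        # Look at the current token without consuming it.
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self):
        # Consume and return the current token (None at end of input).
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        # expr := term (('+' | '-') term)*
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.next()
            node = (op, node, self.term())
        return node

    def term(self):
        # term := factor (('*' | '/') factor)*
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.next()
            node = (op, node, self.factor())
        return node

    def factor(self):
        # factor := NUMBER | '(' expr ')'
        tok = self.next()
        if tok == "(":
            node = self.expr()
            if self.next() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok is None or not tok.isdigit():
            raise SyntaxError("expected number, got %r" % (tok,))
        return int(tok)


def parse(text):
    # Parse a whole string and reject trailing tokens.
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError("unexpected token %r" % (parser.peek(),))
    return tree
```

The parser starts at the top rule (expr) and works down toward the
tokens, so parse("1 + 2 * 3") builds the tree from the root rule
downward.  A bottom up parser goes the other way: it shifts tokens
onto a stack and reduces them into rules once a complete right-hand
side has been seen.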

>> http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_10
>
> No.
>
> Any time "bash is wrong but dash is correct", posix is wrong. Posix is
> saying that the de-facto Linux shell got this wrong for almost 20 years
> and nobody noticed, then a shell that I could trivially segfault when
> Ubuntu first swapped /bin/sh for it, and which "sleep 100 &" and then
> ctrl-c at the prompt would kill the backgrounded sleep... That was doing
> it "right".

I think you're being overly critical of dash here.  It's not like
toybox doesn't have segfaults.

mksh, a Korn shell derivative, also agrees with POSIX.

> No. No it wasn't. Posix was at _best_ irrelevant.

With regard to shell, this doesn't jibe with reality.  ALL shells I've
researched make great efforts to be POSIX compliant (with
compatibility flags, etc.), and *are* very POSIX compliant.  The
POSIX standard greatly influenced their development.  And lots of
shell application writers make a conscious effort to be POSIX
compliant.

The allowed function body for bash is actually the only deviation I've
found in terms of the grammar (which again doesn't cover all of the
language), and it's not a huge deviation.

This argument doesn't really matter to me though... I'm not trying to
convince you to follow POSIX for your shell.

> I never look at FSF code. On general principles. But the behavior of the
> standard Linux command line is what Linux developers (and the build
> systems they write) expect.

I think build systems are a significantly more constrained problem,
because that is the one place where people care whether their shell
scripts are portable.

Probably 75% to 95% of major packages are portable to BSD (without
installing bash on BSD).  If a package uses autoconf, then it will
use a very small subset of POSIX shell.  And even programs with
hand-written configure scripts like QEMU obviously are not using
every nook and cranny of bash in them.

>>> Keep in mind, over the years people have written a dozen different
>>> shells. It's really not that big a deal, I just want to do it _right_ so

The claim that there are dozens of shells didn't seem suspect to me at
first...  There are tons of Unixes that I've never used, both now and
in the past.

But now that I've done more research, I believe there have only been 4
open source Bourne/POSIX-compatible shells EVER.  And only 2 of them
were started with unpaid labor.  They are:

- Almquist shell (dash, NetBSD ash, busybox default sh)
- zsh (arguably not POSIX compatible, but it can be made POSIX
  compatible by setting lots of flags)
- bash
- Korn shell and derivatives (pdksh, mksh).  Solaris uses a derivative
  of Korn shell.

bash was started with the paid labor of Brian Fox, since apparently
Stallman was impatient with another volunteer effort.  And Korn shell
was open sourced in 2000, but its development was paid for by AT&T.

So my claim is that there is NO other distinct open source code
lineage which is even roughly POSIX compatible.  It takes 5,000-10,000
LOC to make a POSIX compatible shell, and you won't find that amount
of original code in any other implementation.  All shell variants are
pushing around ancient code from very few lineages.

Does anyone have evidence to the contrary on this point?

Moreover, I don't think there are any other commercial code lineages
besides the *original* Bourne shell, but I'm less confident about that
claim.

Andy


