[Toybox] awk (Re: ps down, top to go)

Andy Chu andychup at gmail.com
Mon May 30 20:02:47 PDT 2016


> If you're restricting it to Bourne compatible, you're cutting out things
> like Bill Joy's csh and plan 9's "rc"...

Yes, POSIX defines a Bourne-compatible shell.  csh isn't
Bourne-compatible; I don't think it's ever been standardized.

>> And only 2 of them were started with unpaid labor.  They are:
>>
>> - Almquist shell (dash, NetBSD ash, busybox default sh)
>> - zsh (arguably not posix compatible, but it can be made posix
>> compatible by setting lots of flags)
>> - bash
>
> The Bourne Again shell was not the original Bourne shell, so I'm already
> confused by your selection criteria...

I think the original Bourne shell was only available under a paid
license, so it wasn't FOSS (even though neither of those terms had
been invented).  Also, the original Bourne shell wasn't POSIX
compliant.  POSIX added a bunch of things like ! (pipeline negation).

AFAIK ksh was the dominant implementation at the time of
standardization, and a lot of its choices made it into the standard,
with bash closely following POSIX/ksh.

>> - Korn shell and derivatives (pdksh, mksh)  Solaris uses a derivative
>> of Korn shell.
>
> Um, ksh was based on the unix 7 bourne shell? So this is where bourne
> slots into your taxonomy...?

Yeah that's probably true.  My claim is that there are only 4 open
source code lineages for a POSIX compliant shell.

This claim should be easy to disprove: show me some source code which
implements a POSIX shell, and doesn't share a common ancestor with one
of the four I mentioned.


> For IP reasons, Minix and Coherent had their own shells written from
> scratch. In busybox lash and hush were fresh rewrites (and hush is
> reasonably usable on nommu, ash doesn't build for that). I believe david
> bell wrote sash from scratch. There are _several_ different craptacular
> shells in uclinux (nwsh is 775 lines of C and msh "minimal shell" is
> _53_ lines of C. Through use of horrible macros. But it has pipe and
> redirect!)

I poked at hush last night, and actually I think it counts as
#5.  It's the newest one, started in 2001 by Larry Doolittle.  It has
some trivial problems like ~ expansion being nonexistent, so it's
technically not POSIX compliant, but I think the spirit is there.  A
lot of things work, and it's 10K LOC.  I think it's the closest in
architecture to my implementation that I've seen.
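
For what it's worth, the ~ behavior is easy to probe.  This is only a
sketch, not part of any test suite mentioned here, and tilde_works is
a made-up name.  It relies on the POSIX rule that an unquoted ~ by
itself expands to the value of HOME.

```shell
# tilde_works is a hypothetical helper for this example.
# It succeeds if an unquoted ~ expands to $HOME, as POSIX specifies.
tilde_works() {
  [ ~ = "$HOME" ]
}

if tilde_works; then
  echo "tilde expansion: ok"
else
  echo "tilde expansion: missing"
fi
```

Running this under the shell being examined shows whether a bare ~ is
expanded; it says nothing about the ~user form.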

Most toy shells are more like a Thompson shell -- they implement
commands and redirections and pipes, but not the Pascal-like
procedural language that is what Bourne added, and what POSIX
requires.
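
A rough illustration of the distinction, not taken from any shell
mentioned here: the first command uses only pipes and a redirection,
which a Thompson-like shell can handle, while the function, loop, and
case statement below it need the Bourne/POSIX language.  The name
count_matches and the sample words are invented for the example.

```shell
# Pipes and a redirection only: a Thompson-like shell can run this.
printf 'a\nb\na\n' | sort | uniq -c > /dev/null

# Functions and control flow: this needs the Bourne/POSIX language.
# count_matches is a hypothetical helper, named only for this example.
# It prints how many of the remaining arguments equal the first one.
count_matches() {
  want=$1
  shift
  n=0
  for word in "$@"; do
    case $word in
      "$want") n=$((n + 1)) ;;
    esac
  done
  echo "$n"
}

count_matches a a b a
```

The $((...)) arithmetic is itself one of the POSIX additions, so a
shell needs more than the original Bourne feature set to run it.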

My research led me to believe that Minix uses an Almquist Shell
derivative like NetBSD.  Which Minix shell are you talking about?  If
it fits my criteria, it should be available to inspect.

If something is 775 lines of C, there is no way it can be POSIX
compliant, so it's not really worth talking about here.


> Plus other oddballs like
> https://en.wikipedia.org/wiki/Friendly_interactive_shell and
> https://en.wikipedia.org/wiki/BeanShell and
> https://en.wikipedia.org/wiki/Scsh and so on. A friend of mine's website
> used to give you a shell prompt, implemented in javascript...

None of those are remotely close to POSIX shells...


>> It takes 5,000-10,000 LOC to make a POSIX compatible shell,
>
> Nah.

OK there's only one way to settle that argument :)  I don't think you
can do it with 5K LOC, even without counting the common toybox code.  And
certainly not if you count all the lines the shell depends on.

I have an extensive test suite.  I think hush will qualify but I
haven't tested it yet.  hush might be fairly compliant at the language
level, but it's not as usable interactively (which is why it's not the
default in busybox?)

hush is 10 KLOC (+libs), which is the smallest.  busybox ash is 13
KLOC (+libs).  dash is about 19 KLOC, and mksh 31 KLOC.

I think a POSIX shell will be 5K-10K LOC, but a *usable* POSIX shell
will be 10K - 15K.  There are a lot of things people expect these
days.

> The first commercial Unix clone shipped in 1980:
>
>   https://en.wikipedia.org/wiki/Coherent_(operating_system)
>
> Dennis Ritchie was sent by AT&T to audit it, and confirmed that there
> was no AT&T code in there.

Sure, but the question is if it had a POSIX shell.  I doubt it.  Like
I said, most people at that time probably shipped a Thompson-like
shell, without functions and control flow.

> There were lots of other lineages too, many lost to history in AT&T's
> great System V push, which also gutted a lot of the BSD use. (That's why
> IBM's AOS (https://en.wikipedia.org/wiki/IBM_Academic_Operating_System)
> got rebooted as AIX, and SunOS got rebooted as Solaris: because AT&T was
> pushing people to use its System V instead of BSD, and using vague legal
> threats to do it, which Berkeley's Computer Science Research Group
> eventually successfully fought off in
> https://www.bell-labs.com/usr/dmr/www/bsdi/bsdisuit.html but that took
> until 1993 to wrap up, by which point Linux was 2 years old.)
>
> This of course predated posix by many years, but it shows that there's a
> lot of different taxonomies out there.

OK, but again my claim is easy to disprove if it's incorrect.  I'm
talking about open source POSIX compliant shells.  I claim there are 4
distinct code lineages (with hush being a probable 5th).  You need a
significant chunk of code and a lot of testing to make a POSIX
compliant shell; it won't just happen by accident.  It's not a weekend
project.  :)

Andy


