[Toybox] awk seen in the wild

Andy Chu andychup at gmail.com
Wed Jul 20 21:41:16 PDT 2016


On Sun, Jul 17, 2016 at 12:58 PM, dmccunney <dennis.mccunney at gmail.com> wrote:
> On Sun, Jul 17, 2016 at 3:08 PM, Andy Chu <andychup at gmail.com> wrote:
>> However, I did a bunch of research and hacking on Kernighan's Awk.  I
>> was trying to morph it into a "proper" modern language.  For example,
>> you could imagine writing "ls" or "xargs" or even a shell in Awk, sort
>> of like the idea to write tools in Lua.
>
> Er, why?
>
> I attended a talk decades ago by Peter Weinberger, the W in awk.  He
> stated that awk was originally designed for writing "one liners" to be
> run from the command line in a terminal, and described his shock the
> first time he saw a multi-page awk script.
>
> There are lots of domain specific little languages, intended to
> address particular problems, and attempts to expand them rapidly grow
> out of scope.  You reach a point where what you are trying to do is
> something else's job, and you should simply use the something else.

I'm well aware of that point of view, and it's totally valid if you're
a language *user*... I'm taking the point of view of a language
implementer.  Kernighan's Awk is only 6K lines of code and not very
difficult to modify.

The motivation is exactly the same as writing Unix tools in Lua --
shorter code, smaller binary size, and eliminating certain classes of
bugs a priori.  Awk is not that far semantically from Lua or
JavaScript -- it's an interpreted language with hash tables.  The main
difference is that it doesn't have an object system, but I viewed that
as an advantage, because I don't like either Lua's or JavaScript's OOP
support.  But as mentioned, after a little hacking, I didn't think it
was feasible.

> The original design of Unix was one tool for one job, with shell
> scripts the glue to tie together disparate tools to perform complex
> operations, and text files as the common medium to pass things between
> them.  That ideal has been increasingly lost, and perl might be
> considered an example.  Its "there's more than one way to do it"
> paradigm is likely a greater weakness than a strength.  There is more
> than one way, but which is best?  How do you know?  And the nature of
> the perl language leads to the old joke in classifying languages by
> how you shoot yourself in the foot:
>
> Perl: You shoot yourself in the foot, but no one else can understand
> how you did it.  Six weeks later, neither can you.
>
> Awk was created to perform a particular job and fill a particular
> niche.  Attempting to expand it beyond that is likely an error.

It's *already* been expanded, and so have make and shell.  That ship
sailed decades ago.  As of a couple years ago, Android used 200K+
lines of Make.  It seems to include the "GNU Make Standard Library",
which is basically a Lisp threaded through a tiny hole in Make syntax.

That misfeature in the way Unix evolved was the point of the first
part of my message, with the function call examples.  The idea I have
floating around is to turn my shell into a tool that also contains
make and awk.  75% of the lines in a typical Makefile ARE shell.  And
Awk has function calls, loops, boolean expressions, arithmetic,
builtins, regexes, etc. just like shell.  If you squint, it's almost a
different syntax for the same language.

After hacking on Kernighan's awk and Mozilla's pymake, I figured out a
lot of the semantic differences... and abandoned the idea of doing
anything highly compatible with awk and make.

Shell actually has good semantics (it's a thin wrapper around classic
Unix syscalls), but terrible syntax.  Awk has a good syntax (it was
written with an actual parser, unlike shell and make) but bad
semantics.  Make has both bad syntax and bad semantics.  Happy to go
into detail if anyone is interested :)

Andy


