[Toybox] I hate the GNU design aesthetic.

Fri Sep 30 20:42:14 PDT 2022

On 9/30/22 10:10, enh wrote:
>     Can't vi do search and replace without explicitly being sed syntax?
> 
> not that i know of? 1,$s///g is the vi syntax, no? call it "ed" rather than
> "sed" if you prefer, but it's basically the same, no?

Is that the syntax? I've honestly never used it. (Taught it to a class of "intro
to unix" students once out of a textbook, and graded their tests, but it never
made it into my own personal working set. Too many US Harry Potter
Americanizations where Ron said something apartmently...)

>     > 2. reuse sed entirely. now you have the problems you've been dealing with.
>     > (plus a lot of curious "what does <thing> even mean?" questions, because sed
>     > is too general for this use case.)
> 
>     If you're regexing so your extract tries to overwrite /etc/passwd or something
>     then allowing arbitrary inputs into the regex pattern is already verboten-ish.
> 
>     The gnu/dammit sed has a --sandbox mode that disables the r/w/e commands. What's
>     "e"? Execute the pattern space as a shell command! Every invocation of sed can
>     rm -rf your home directory! Because gnu!
> 
>     (I did not implement 'e' in toybox's sed, or in busybox's sed. I checked to see
>     if they'd added it and as of August the answer is no... and I'm still listed as
>     the maintainer at the top of that file. The last commit attributed to me was
>     2009. Backing away slowly...)
> 
>     > luckily i doubt we'll ever have to answer those
>     > questions, because i doubt anything esoteric is likely to be used.
> 
>     Famous. Last. Words.
> 
> well, "you're doing this to yourself".

I'm doing (2) from your list.

I'm comfortable not doing things, and I'm comfortable doing them right. Drawing
a line between the two means having a map of the territory so I understand where
I'm drawing the line.

> GNU tar doesn't support any of this,
> busybox tar even less.

The devuan busybox in my $PATH is from 2019, but it doesn't have --xform or
--transform at all?

> so it's only toybox tar where someone _could_ get
> themselves into a mess with this in the first place. (and tbh, i still haven't
> seen an actual motivation beyond "orthogonality". which is a fine goal all other
> things being equal, but "massive added complexity",

It's a fairly tiny amount of code either way. "Users can then do horrible things
with it" is a unixversal constant.

That said, I could have the sed --tarxform flag lock out all sed commands except
's'. But that still wouldn't have settled the question of whether to IMPLEMENT
flags= as a parse time or runtime modifier...
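
To make the parse time option concrete, here is a rough Python sketch (not the
toybox code, which is C). It assumes the GNU tar scope letters r/R, s/S and h/H,
assumes a flags= statement modifies the current default scope rather than
resetting it, splits naively on ';', and uses Python regex syntax rather than
POSIX. The names parse_xform, apply_scope and transform are made up for
illustration. Each 's' command gets a snapshot of the scope in force when it is
parsed, and anything that is not 's' or flags= is rejected:

```python
import re

# Assumed default: transforms apply to regular names, symlink and hardlink targets.
DEFAULT_SCOPE = {"r": True, "s": True, "h": True}

def apply_scope(scope, letters):
    """Return a copy of scope with r/R, s/S, h/H letters applied; other letters ignored."""
    out = dict(scope)
    for ch in letters:
        if ch.lower() in out:
            out[ch.lower()] = ch.islower()  # lowercase enables, uppercase disables
    return out

def parse_xform(expr):
    """Parse 'flags=...;s/pat/repl/flags;...' into a list of rules."""
    rules, scope = [], dict(DEFAULT_SCOPE)
    for part in expr.split(";"):  # naive: assumes ';' never appears inside a pattern
        if not part:
            continue
        if part.startswith("flags="):
            scope = apply_scope(scope, part[len("flags="):])
        elif part[0] == "s" and len(part) > 1:
            fields = part[2:].split(part[1])  # naive: no escaped delimiters
            if len(fields) != 3:
                raise ValueError("malformed s command: %r" % part)
            pat, repl, flags = fields
            rules.append({"pat": pat, "repl": repl,
                          "count": 0 if "g" in flags else 1,
                          "scope": apply_scope(scope, flags)})  # frozen at parse time
        else:
            raise ValueError("only 's' is allowed here: %r" % part)
    return rules

def transform(rules, name, kind):
    """Apply rules to name; kind is 'r' (regular), 's' (symlink) or 'h' (hardlink)."""
    for rule in rules:
        if rule["scope"][kind]:
            name = re.sub(rule["pat"], rule["repl"], name, count=rule["count"])
    return name
```

With this, parse_xform("flags=S;s|^|/usr/local/|") yields one rule that rewrites
member names but leaves symlink targets alone. A runtime version would instead
keep the scope as mutable state consulted on every name.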

> "ability to construct tar
> commands that no-one can read [because hardly anyone knows sed beyond s/// any
> more]",

I need to do a video...

> "interoperability issues with gnu/busybox", and "possible unintended
> consequences" all sound like reasons to believe all other things are _not_ equal
> here :-) )

My busybox git repo was last pulled in august, but searching for "form" in git
log archival/tar.c I didn't see anything about this feature? (They have
"transformer" but that's what they call their selected compression method.)

>     > 3. rewrite that one command from sed. now you have duplication, and potential
>     > skew between "actual s command" and "fake s command".
>     >
>     > option 3 is not obviously a bad choice given the issues with the alternatives,
>     > but it's a bad fit for anyone trying to do option 2 instead.
> 
>     Implementing behavior is easy. Figuring out what the behavior should BE is hard.
> 
> tbh, this is where i like the (usual) rob landley toybox philosophy of "i'll
> implement it when we have a motivating example of someone trying to get
> something done with it, not just because it's mentioned in the docs".

Failure mode: there isn't a spec, so I have to reverse engineer a spec to figure
out what subset of that spec I want to implement. (Implicit specs tend to look
kind of ugly when articulated, because they've never been held up for that kind
of review, unless they were made by somebody cultivating bonsai...)

> i think
> that's a great pragmatic philosophy (in a world dominated by dogmatic
> philosophies, to which group "orthogonality" -- for all its merits at times --
> tends to belong). it also has the nice side-effect of letting reality guide
> where to spend your time, because it means you're focusing on things that users
> demonstrably need rather than stuff that someone might want someday.

I am not the one who opened this can of worms. :)

> with this tar sed stuff i feel like i'm watching a man drill holes in his own
> head, all the time crying that it hurts :-)

Nah, just bashing my head against gnu stuff. The gnu stuff generally gives
first. Me being angry at code is a normal part of my working style:

  "For as long as she'd known him, Sam Vimes had been vibrating with the
  internal anger of a man who wants to arrest the gods for not doing it right."
  - Terry Pratchett

  "Writing is easy. You only need to stare at a blank piece of paper until drops
   of blood form on your forehead."
  - Gene Fowler

(I normally just blog about this stuff, but I'm a bit behind on editing my blog
so if I put it there nobody'd see it in time to affect the result, and I wanted
to see if anybody could point out obvious stuff I'd missed.)

> (though you are, i think, collecting a hell of a lot of circumstantial evidence
> that the original implementors didn't think this through. but to me that says
> "so neither should you" --- just do the minimum,

Yes, but my minimum was "reuse sed as a pipe filter like gzip rather than
reimplementing a big chunk of it", and once I'd gone there it had implications
to work through.
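
For anyone who hasn't seen the pipe filter approach, a minimal Python sketch of
the idea, assuming a sed binary in the $PATH and newline-separated names (so a
filename containing a newline would break it). The helper name xform_names is
hypothetical, and this batches all the names through one child process rather
than keeping a long-lived coprocess the way a real tar would:

```python
import subprocess

def xform_names(names, sed_expr):
    """Run each name through an external sed, one name per line, and return the results."""
    proc = subprocess.run(
        ["sed", "-e", sed_expr],
        input="\n".join(names) + "\n",  # one name per line on sed's stdin
        capture_output=True,
        text=True,
        check=True,                     # raise if sed rejects the expression
    )
    return proc.stdout.splitlines()
```

So xform_names(["one/two", "one/three"], "s/one/uno/") hands the renaming to
sed instead of reimplementing any of its parsing.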

> assume the weird shit is as
> useless as it appears, move on with your life until/unless someone comes along
> who actually does need more.

I would happily ignore the flags, except the dozen or so --xform examples you
dug up from AOSP a while back had multiple uses of the flags, and even a couple
of flags= instances. And sed will error out on unknown search syntax if I don't
teach it about them.

And if I'm GOING to teach it a thing, it should thing properly.

> a motivating example often makes things clearer.
> the lack of one is often a sign you were right to ignore the whole mess :-) )

I ignored --xform until it was brought to me as a thing tar should do. :)

>     And turning it into proper test cases... (If I don't say "v" then I just get
>     one/three without "link to uno/quatro", but if I add the v it gives me irrelevant
>     user and timestamp info, although I guess I've already got test invocations that
>     regularize all that....)
> 
> /me wonders how much of this gnu behavior is even deliberate versus accidental,
> and thus likely to be the kind of test that suffers from debian version skew
> if/when anyone actually tries to use the gnu version.

Chet keeps assuring me that bringing him weird corner cases to adjudicate, which
sometimes make him change the bash code and introduce version skew, makes bash a
better program and is thus the right thing to do.

I see his point. I totally do. And yet...

>     Rob

Still Rob