[Toybox] I hate the GNU design aesthetic.

enh enh at google.com
Mon Oct 3 09:14:39 PDT 2022


On Fri, Sep 30, 2022 at 8:33 PM Rob Landley <rob at landley.net> wrote:

> On 9/30/22 10:10, enh wrote:
> >     Can't vi do search and replace without explicitly being sed syntax?
> >
> > not that i know of? 1,$s///g is the vi syntax, no? call it "ed" rather than
> > "sed" if you prefer, but it's basically the same, no?
>
> Is that the syntax? I've honestly never used it. (Taught it to a class of "intro
> to unix" students once out of a textbook, and graded their tests, but it never
> made it into my own personal working set. Too many us harry potter
> americanizations where Ron said something apartmently...)
>

yeah, that's the syntax, and i strongly suspect that the only reason most
people know _that much_ sed is that they've used vi. which is ironic given
that vi has that syntax because it was a transferrable skill from ed/sed in
the 1970s...


> >     > 2. reuse sed entirely. now you have the problems you've been dealing
> >     > with. (plus a lot of curious "what does <thing> even mean?" questions,
> >     > because sed is too general for this use case.
> >
> >     If you're regexing so your extract tries to overwrite /etc/passwd or something
> >     then allowing arbitrary inputs into the regex pattern is already verboten-ish.
> >
> >     The gnu/dammit sed has a --sandbox mode that disables the r/w/e commands. What's
> >     "e"? Execute the pattern space as a shell command! Every invocation of sed can
> >     rm -rf your home directory! Because gnu!
> >
> >     (I did not implement 'e' in toybox's sed, or in busybox's sed. I checked to see
> >     if they'd added it and as of August the answer is no... and I'm still listed as
> >     the maintainer at the top of that file. The last commit attributed to me was
> >     2009. Backing away slowly...)
> >
> >     > luckily i doubt we'll ever have to answer those
> >     > questions, because i doubt anything esoteric is likely to be used.
> >
> >     Famous. Last. Words.
> >
> > well, "you're doing this to yourself".
>
> I'm doing (2) from your list.
>

yeah, but i felt like that was the "now you have two problems" choice of
the "only bad choices" choices :-)


> I'm comfortable not doing things, and I'm comfortable doing them right. Drawing
> a line between the two means having a map of the territory so I understand where
> I'm drawing the line.
>
> > GNU tar doesn't support any of this,
> > busybox tar even less.
>
> The devuan busybox in my $PATH is from 2019, but it doesn't have --xform or
> --transform at all?
>

i assume you're arguing "...so they might add more sed commands"? maybe,
but given how few people seem to know _any_ sed any more, and how many of
those who do, don't know anything beyond s/// ... i'm skeptical.


> > so it's only toybox tar where someone _could_ get
> > themselves into a mess with this in the first place. (and tbh, i still haven't
> > seen an actual motivation beyond "orthogonality". which is a fine goal all other
> > things being equal, but "massive added complexity",
>
> It's a fairly tiny amount of code either way. "Users can then do horrible things
> with it" is a unixversal constant.
>
> That said, I could have the sed --tarxform flag lock out all sed commands except
> 's'. But that still wouldn't have settled the question of whether to IMPLEMENT
> flags= as a parse time or runtime modifier...
>

ooh, yeah, that seems to be a way to cut the gordian knot here? you get
"my" option 2 but without the "now you have two problems" side --- you can
just stop thinking about all the other commands (that the "reference
implementation" doesn't support anyway, and no-one has yet asked for). and
the sneaking suspicion both of us share that you can do something really
awful with arbitrary sed commands goes away :-)
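
For illustration only, here is a minimal Python sketch of the "lock out everything but s" idea. It is not toybox code: `parse_s_only` and `ALLOWED_S_FLAGS` are hypothetical names, and the tiny flag whitelist is an assumption chosen so that write or execute style flags on the s command itself get rejected along with the other commands.

```python
# Hypothetical sketch: accept a transform expression only if every statement
# is an 's' command. Names and the flag whitelist are assumptions.

ALLOWED_S_FLAGS = set("gi0123456789")  # assumption: deliberately tiny whitelist


def parse_s_only(expr):
    """Return [(pattern, replacement, flags), ...] or raise ValueError."""
    cmds = []
    i, n = 0, len(expr)
    while i < n:
        if expr[i] in " \t;":          # statement separators and padding
            i += 1
            continue
        if expr[i] != "s":
            raise ValueError(f"only 's' is allowed, got {expr[i]!r} at offset {i}")
        if i + 1 >= n or expr[i + 1] in "\\\n":
            raise ValueError("missing or invalid delimiter")
        delim = expr[i + 1]
        i += 2
        fields = []
        for _ in range(2):             # pattern first, then replacement
            buf = []
            while i < n and expr[i] != delim:
                if expr[i] == "\\" and i + 1 < n:
                    buf.append(expr[i])  # keep the backslash with its character
                    i += 1
                buf.append(expr[i])
                i += 1
            if i >= n:
                raise ValueError("unterminated 's' command")
            i += 1                     # step over the delimiter
            fields.append("".join(buf))
        start = i
        while i < n and expr[i] != ";":
            i += 1
        flags = expr[start:i].strip()
        bad = set(flags) - ALLOWED_S_FLAGS
        if bad:
            raise ValueError(f"unsupported flag(s): {''.join(sorted(bad))}")
        cmds.append((fields[0], fields[1], flags))
    return cmds
```

With a check like this in front, anything that is not an s command fails before it is interpreted, so the question of what the other commands would mean in a tar context never comes up.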


> > "ability to construct tar
> > commands that no-one can read [because hardly anyone knows sed beyond s/// any
> > more]",
>
> I need to do a video...
>

(maybe. _i'd_ watch it, but i think everyone else is happier with python
anyway, and tbh i'd rather read their python scripts than try to wrap my
head around someone's sed one-liner. and even if i watch your video, i'm
not likely to remember it. the old "i only change the VCR clock twice a
year, so i'll never really learn how to do it" problem.)


> > "interoperability issues with gnu/busybox", and "possible unintended
> > consequences" all sound like reasons to believe all other things are _not_ equal
> > here :-) )
>
> My busybox git repo was last pulled in august, but searching for "form" in git
> log archival/tar.c I didn't see anything about this feature? (They have
> "transformer" but that's what they call their selected compression method.)
>

no, that was my point --- if someone _does_ write a --transform script that
uses more than just s///, now they _need_ toybox tar because other tars
don't have it. (unless you assume that busybox would definitely implement
the extended toybox syntax rather than the restricted gnu syntax, i don't
think "busybox doesn't even have --transform at all" weakens the point much
:-) )


> >     > 3. rewrite that one command from sed. now you have duplication, and potential
> >     > skew between "actual s command" and "fake s command".
> >     >
> >     > option 3 is not obviously a bad choice given the issues with the alternatives,
> >     > but it's a bad fit for anyone trying to do option 2 instead.
> >
> >     Implementing behavior is easy. Figuring out what the behavior should BE is hard.
> >
> > tbh, this is where i like the (usual) rob landley toybox philosophy of "i'll
> > implement it when we have a motivating example of someone trying to get
> > something done with it, not just because it's mentioned in the docs".
>
> Failure mode: there isn't a spec, so I have to reverse engineer a spec to figure
> out what subset of that spec I want to implement. (Implicit specs tend to look
> kind of ugly when articulated, because they've never been held up for that kind
> of review, unless they were made by somebody cultivating bonsai...)
>
> > i think
> > that's a great pragmatic philosophy (in a world dominated by dogmatic
> > philosophies, to which group "orthogonality" -- for all its merits at times --
> > tends to belong). it also has the nice side-effect of letting reality guide
> > where to spend your time, because it means you're focusing on things that users
> > demonstrably need rather than stuff that someone might want someday.
>
> I am not the one who opened this can of worms. :)
>

no, but you're the one who probed this deeply into said can :-) (though at
this point i don't actually remember whether i found users of the flags=
stuff in a debian package search. now you've made me think of it, i think i
_did_, so, yeah, it's probably used in the wild even if none of _our_
[known] users have needed it yet. in particular, the kernel build isn't
using flags=.)


> > with this tar sed stuff i feel like i'm watching a man drill holes in his own
> > head, all the time crying that it hurts :-)
>
> Nah, just bashing my head against gnu stuff. The gnu stuff generally gives
> first. Me being angry at code is a normal part of my working style:
>
>   "For as long as she'd known him, Sam Vimes had been vibrating with the
>   internal anger of a man who wants to arrest the gods for not doing it right."
>   - Terry Pratchett
>
>   "Writing is easy. You only need to stare at a blank piece of paper until drops
>    of blood form on your forehead."
>   - Gene Fowler
>
> I normally just blog about this stuff, but I'm a bit behind on editing my blog
> so if I put it there nobody'd see it in time to affect the result, and I wanted
> to see if anybody could point out obvious stuff I'd missed.
>
> > (though you are, i think, collecting a hell of a lot of circumstantial evidence
> > that the original implementors didn't think this through. but to me that says
> > "so neither should you" --- just do the minimum,
>
> Yes, but my minimum was "reuse sed as a pipe filter like gzip rather than
> reimplementing a big chunk of it", and once I'd gone there it had implications
> to work through.
>
> > assume the weird shit is as
> > useless as it appears, move on with your life until/unless someone comes along
> > who actually does need more.
>
> I would happily ignore the flags, except the dozen or so --xform examples you
> dug up from AOSP a while back had multiple uses of the flags, and even a couple
> of flags= instances. And sed will error out on unknown search syntax if I don't
> teach it about them.
>

(they were debian, not aosp. now you've made me look, i can report that of
_all_ the code indexed on google's internal code search system -- not just
android -- i see only three uses of --transform with flags=, and all three
are `flags=r;s///`.)
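
As a rough illustration of what a parse-time reading of `flags=` could look like (the open question raised earlier in the thread), here is a small Python sketch. It assumes a `flags=` statement sets the scope for every s command after it until the next `flags=`, and that the default scope is `rsh`. Both are my reading of the GNU tar documentation, not verified behavior. The helper names are hypothetical and the scope string is stored verbatim, not interpreted.

```python
# Hypothetical sketch of a parse-time interpretation of "flags=".
# Each s command is tagged with whatever scope was in force when it was parsed.

DEFAULT_SCOPE = "rsh"  # assumption: default scope per the GNU tar manual


def skip_s_command(expr, i):
    """Return the index just past the s command that starts at expr[i]."""
    if i + 1 >= len(expr):
        raise ValueError("missing delimiter")
    delim = expr[i + 1]
    i += 2
    seen = 0                       # delimiters consumed after the first one
    while i < len(expr) and seen < 2:
        if expr[i] == "\\":
            i += 1                 # skip the escaped character as well
        elif expr[i] == delim:
            seen += 1
        i += 1
    if seen < 2:
        raise ValueError("unterminated 's' command")
    while i < len(expr) and expr[i] != ";":
        i += 1                     # trailing per-command flags such as 'g'
    return i


def attach_scopes(expr):
    """Return [(s_command_text, scope_in_force), ...] for the expression."""
    scope = DEFAULT_SCOPE
    out = []
    i = 0
    while i < len(expr):
        if expr[i] == ";":
            i += 1
        elif expr.startswith("flags=", i):
            end = expr.find(";", i)
            if end == -1:
                end = len(expr)
            scope = expr[i + len("flags="):end]
            i = end
        elif expr[i] == "s":
            end = skip_s_command(expr, i)
            out.append((expr[i:end], scope))
            i = end
        else:
            raise ValueError(f"unexpected {expr[i]!r} at offset {i}")
    return out
```

For the `flags=r;s///` shape seen in the wild, this would yield a single s command tagged with scope `r`. A runtime interpretation would instead carry the scope as mutable state while names are processed; the two only differ for expressions that nobody in this thread has found in real use.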


> And if I'm GOING to teach it a thing, it should thing properly.
>
> > a motivating example often makes things clearer.
> > the lack of one is often a sign you were right to ignore the whole mess :-) )
>
> I ignored --xform until it was brought to me as a thing tar should do. :)
>

i think the difference here is that to you "--transform" is an individual
whole, whereas to me "--transform s///" is one thing, and "flags=" another,
and "full sed support" something that only you have ever even thought of.
(even if i understand _why_ you thought that, that's definitely in "you did
this to yourself" territory to me, and my favorite of the ideas you've come
up with is the "passing the flag to sed disables everything but s///".)


> >     And turning it into proper test cases... (If I don't say "v" then I just get
> >     one/three without "link to uno/quatro", but if I add the v it gives me irrelevant
> >     user and timestamp info, although I guess I've already got test invocations that
> >     regularize all that....)
> >
> > /me wonders how much of this gnu behavior is even deliberate versus accidental,
> > and thus likely to be the kind of test that suffers from debian version skew
> > if/when anyone actually tries to use the gnu version.
>
> Chet keeps assuring me that bringing him weird corner cases to adjudicate, which
> sometimes make him change the bash code and introduce version skew, makes bash a
> better program and is thus the right thing to do.
>
> I see his point. I totally do. And yet...
>
> >     Rob
>
> Still Rob
>