[Toybox] [PATCH] Add the gzip/gunzip/zcat I wrote for toolbox.

Rob Landley rob at landley.net
Tue May 9 13:25:54 PDT 2017


On 05/08/2017 08:27 PM, enh wrote:
> On Mon, May 8, 2017 at 4:37 PM, Rob Landley <rob at landley.net> wrote:
>> On 04/26/2017 05:03 PM, enh wrote:
>>> if you're actually going to start to look, i'll attach my port of the
>>> current toolbox implementation.
>>
>> Ok, carving out a half hour to look at this... optargs should really
>> have [-123456789] at the end so "gzip -3 -9" is -9 but "gzip -9 -3" is
>> -3. While we're at it, OLDTOY() was there so these things could share
>> option strings, except why does zcat have -cd?
> 
> i just copied what the "real thing" does --- all three commands accept
> the same set of options, and just ignore the ones they can't use. i
> assumed _someone_ is relying on that. (i can't remember if i actually
> hit such an instance --- i don't remember how it was that i realized
> that they all accept all the options, but i do remember in the
> beginning i had a different getopt string for each one, but changed
> that even in the toolbox version.)

It's times like this where busybox is useful, since that's 10+ years of
an alternate implementation lots of people have had plenty of
opportunity to complain about. :)

(I vaguely recall that the IETF used to have a policy that standards
submissions had to have two different interoperable implementations.
This was a good policy.)

Checking busybox defconfig from February, zcat --help documents no
options...

  ./busybox zcat -c README.gz
  ./busybox zcat --fruitbasket README.gz

And is ignoring all options presented to it. (Blah. Proving nothing in
this instance.)

Hey, ubuntu's zcat will take non-gz filenames and look for a
corresponding gz file. (I.E. "zcat README" will display README.gz).
Should we do that? (Probably.) The man page says zcat is identical to
"gunzip -c". Is this true here?

  gzip README
  gunzip -c README

Yup, that also finds the .gz version when not given a .gz. (The busybox
version doesn't have this trick though...)
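
The lookup observed above can be sketched roughly like this in shell. It
is only a guess at the observable behavior, not how any of the
implementations actually do it: resolve_gz_name is a hypothetical helper,
and trying the name as given before the .gz variant is an assumption.

```shell
# Hypothetical helper, not part of toybox, busybox, or gzip.
# Prints the name to open: the argument itself if it exists,
# otherwise the argument with ".gz" appended if that exists.
# (Assumption: the plain name is tried first.)
resolve_gz_name() {
  if [ -e "$1" ]; then
    printf '%s\n' "$1"
  elif [ -e "$1.gz" ]; then
    printf '%s\n' "$1.gz"
  else
    echo "resolve_gz_name: $1: no such file" >&2
    return 1
  fi
}
```

With only README.gz on disk, "resolve_gz_name README" would print
README.gz, which is the name a zcat doing this trick would then open.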

Busybox gzip documents -dtcf but not the numbers. But it accepts the
numbers. And it's being specific about that:

  $ ./busybox gzip -7 --potato README
  gzip: unrecognized option '--potato'

Sigh. Ok, this time checking busybox was less useful than I'd hoped. :)

>> (And the three main
>> functions... if oldtoy isn't handling this right it needs fixing. Do you
>> care about the standalone builds, by the way? that was always the
>> complicating part, chording together shared infrastructure...)
> 
> i don't personally understand why someone would want one of these but
> not all three, no.

Running "make test_zcat" does a standalone build of zcat, which should
act like zcat and not like gzip or gunzip. So there's one use case. :)

>> I need to spend a 3 day weekend fixing the help infrastructure so it can
>> do reasonable includes, the duplication of -c and -f help text here
>> pains me.
> 
> the fact that the ps/top/pgrep help is just wrong pains me more :-)
> 
> note that -f isn't the same between all three, and the -c is slightly
> different for all three. (and they'd differ more if we behaved more
> like the "real thing" and documented accurately.)

What does zcat -c _do_? (Make it _not_ act like zcat? As far as I can
tell ubuntu's is ignoring it.) How would -f behavior differ? (Is this in
the tests you sent?)

I see "acts slightly different" and hear "special case" and wonder "is
there any way we could not?" Which is where I have to read the code and
the man page and come up with test cases and wrap my head around _why_
and then see if there are users out there depending on this weirdness
(although mostly I check package configure and builds because I can find
and automate a lot of those) and...

I miss free time. It was nice.

>> Should gzip do that? You have it exiting immediately... Huh, it looks
>> like the gnu/gnu/gnu/dammit version _does_ exit at the first error,
>> which seems awkward and wrong. So this matches the existing version, the
>> question remains what the behavior _should_ be.
> 
> in the absence of a strong reason to do anything different, i just did
> what i observed the "real thing" doing.

Oh sure. It's a great first pass, I'm just doing my normal gift horse
dentistry.

I try not to think in terms of "real thing", but instead the old IETF
model. The toybox implementation has the potential to become the "real
thing" for a lot of people in future. I'm trying to work out what a spec
would look like if there _was_ one (I.E. if posix was functional).

Warn-but-continue is the common behavior elsewhere in toybox, because
that was common behavior in the existing utilities toybox was
implementing new versions of. I should survey the existing toybox
commands and see if there are _any_ that stop on error when handling
[FILE...] arguments, and if so (I don't remember any) compare the ubuntu
implementation to make sure it wasn't our mistake.

If gzip is the only oddball, I'd prefer to correct it in toybox. If
there are others, then it's not a special case...
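
To make the two behaviors concrete, here is a minimal shell sketch of
warn-but-continue over [FILE...] arguments. The process_all helper and
its calling convention are made up for illustration; this is not how
toybox loops over files internally.

```shell
# Hypothetical helper for illustration only.
# Usage: process_all COMMAND FILE...
# Runs COMMAND on each FILE, warns on failure, keeps going, and
# returns nonzero at the end if any file failed.
process_all() {
  cmd=$1
  shift
  rc=0
  for f in "$@"; do
    if ! "$cmd" "$f"; then
      echo "warning: $cmd failed on $f" >&2
      rc=1   # remember the failure, but keep processing
    fi
  done
  return $rc
}
```

Exit-at-first-error would be the same loop with "return 1" in place of
"rc=1", so the files after the bad one never get touched.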

> afaik the only differences are omissions.

Yeah, but _should_ there be? Newbies learning the unixoid command line
are helped by consistency. (That's why in toybox everything accepts --
even "echo", regardless of what ubuntu did.) If future generations
learning this stuff _don't_ outnumber the current installed base, we're
doing it wrong.

And if we are going to clean up this sort of thing, the initial
introduction is the place to do it. If people need it, they'll complain.
(Later people would complain just because it changed.)

I can see you not wanting to field the bug reports, though. :)

Rob

P.S. Remember how I disabled --help output for "true" and "false"
because people complained? Bash has built-in "true" and "false"
implementations that behave like toybox does now, but ubuntu _also_ has
/bin/false and /bin/true, which behave like toybox _used_ to, and yes:

  $ /bin/true --help > /dev/full
  /bin/true: write error: No space left on device
  $ echo $?
  1

Moral of the story: the gnu/gnu/gnu/dammit stuff is not compatible with
_itself_. Second moral of the story:

  $ /bin/true --help | wc -l
  15
  $ man true | wc -l
  50

Yet that man page ends with:

  SEE ALSO
    The  full documentation for true is maintained as a Texinfo
    manual.  If the info and true programs are properly installed
    at your site, the command

       info coreutils 'true invocation'

    should give you access to the complete manual.

Because 15 lines of built-in help text plus 50 lines of man page is not
enough to fully describe the "true" command, for the _full_
documentation they need their bespoke proprietary documentation format
nobody else uses but they refuse to give up.

This sort of thing is why I don't see gnu versions of anything as "the
real thing", more what people used in the bad old days before we knew
better, uphill, both ways, in the snow.

Still Rob


