[Toybox] [PATCH] Clean up xz a good amount

Rob Landley rob at landley.net
Tue Feb 27 18:08:32 PST 2024


On 2/27/24 19:23, Oliver Webb wrote:
> The below patch does a bunch cleanup on xzcat.c.
> It resolves the ifdefs that were in the code (But everything they were checking was defined so they didn't _do_ anything)
> Adds Tests for errors (And while we are changing the test suite, converts testing -> testcmd)
> Rearange/Rewrite large comments to be smaller (and sometimes C99!).
> It puts main at the bottom and do_xzcat above it, the way code is usually arranged.. 
> Removes some function prototypes and the inline keyword (useless in modern C since it doesn't force anything)
> Removes "!= 0" (And turns "x == 0" to "!x")
> Some types replaced with their LP64 counterparts (uint8_t to char, etc)
> 
> Yes, I know that this is large for a patch, but a lot of that is shuffling code/comments. None of the logic has been
> changed. When all is said and done, this patch shrinks xzcat.c by over 10%

Applied (out of the list's spam filter, "message body too big"), but you've
introduced a glitch (missing flush maybe). When I tested it against the xz files
I had lying around via "xzcat $FILE | tar tv", one file failed with your patch
and succeeded before it. I copied it to my web server so you can take a look:

  wget https://landley.net/musl-cross-make.tar.xz

(Yeah, it's 200 megs. Sorry about that. That was the first one that went "boing".)

Also, I note that there is an "upstream" for this code, which may have changed
since I merged this (many moons ago), and any upstream changes are probably
worth a look:

  https://git.tukaani.org/xz-embedded.git

The one in pending is an old version of that glued together into one file with a
config header, NEWTOY() and command_main() wrapper. (Yes, the upstream is public
domain! Extractor only, not compressor. I haven't found a public domain xz
compressor yet.)

I'm also unclear on the difference between xz and 7zip, and the linux kernel's
"general setup menu" also offers lzma, lzo, lz4, and zstd as initramfs
compression options. I just know kernel.org has tar.xz files, and
https://linuxfromscratch.org/lfs/view/12.0/chapter03/packages.html has .gz,
.bz2, and .xz tarballs, hence those are the three formats toybox can decompress.

It should be able to _create_ gz and zip, which is also what
https://github.com/landley/toybox/tags offers to create. (Both of those are
deflate, I know how and wrote one in java once long ago. I just got sidetracked
on when to do dictionary resets to match the hashes of existing tarballs. The
answer is probably "nuts to your white mice" and just do it every 250k or
something, and oh well the hashes don't match but you get the data back and
that's the important thing. But first I wanted to grab a bunch of existing .gz
tarballs and CHECK when they do dictionary resets, because maybe it IS already
something simple like that.)

Rob

P.S. Alas, I don't make a habit of extensively reading incompatibly licensed
code before writing a 0BSD version, so this kind of question can linger a while
on the todo list when I could theoretically just go track it down in the source...


More information about the Toybox mailing list