[Toybox] [PATCH] Clean up xz a good amount

Rob Landley rob at landley.net
Wed Feb 28 11:49:31 PST 2024


On 2/28/24 13:03, enh wrote:
> On Wed, Feb 28, 2024 at 10:28 AM Rob Landley <rob at landley.net> wrote:
>>
>> On 2/28/24 11:13, enh wrote:
>> > On Tue, Feb 27, 2024 at 8:34 PM Rob Landley <rob at landley.net> wrote:
>> >> >  static size_t bcj_x86(struct xz_dec_bcj *s, char *buf, size_t size)
>> >> > @@ -639,6 +640,20 @@ enum xz_ret xz_dec_bcj_run(struct xz_dec_bcj *s, struct xz_dec_lzma2 *lzma2,
>> >> >   */
>> >> >  enum xz_ret xz_dec_bcj_reset(struct xz_dec_bcj *s, char id)
>> >> >  {
>> >> > +  switch (id) {
>> >> > +  case BCJ_X86:
>> >> > +  case BCJ_POWERPC:
>> >> > +  case BCJ_IA64:
>> >> > +  case BCJ_ARM:
>> >> > +  case BCJ_ARMTHUMB:
>> >> > +  case BCJ_SPARC:
>> >> > +    break;
>> >> > +
>> >> > +  default:
>> >> > +    /* Unsupported Filter ID */
>> >> > +    return XZ_OPTIONS_ERROR;
>> >> > +  }
>> >> > +
>> >> >    s->type = id;
>> >> >    s->ret = XZ_OK;
>> >> >    s->pos = 0;
>> >
>> > ah, crap, that's another thing to put on the riscv64 to-do list...
>> > (thanks for bringing that to light!)
>>
>> Which is _conceptually_ a bit difficult because it means a new archiver would be
>> generating binaries an old archiver couldn't extract.
> 
> judging by the code snippet above, that's already happened to you ---
> where's your arm64 case?

I haven't updated in forever. (It's on the todo list, but pending...)
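
For reference, a minimal standalone sketch of what the ID check from the quoted
patch could look like with the newer filters accepted. Assumptions: the names
BCJ_ARM64 and BCJ_RISCV and the numeric values are taken from the .xz filter ID
assignments rather than from the toybox tree, the xz_ret enum is trimmed to two
values, and bcj_id_check() is a hypothetical helper, not an existing function.

```c
/* Return codes, trimmed to the two this sketch needs (names follow the patch). */
enum xz_ret { XZ_OK, XZ_OPTIONS_ERROR };

/* BCJ filter IDs. The first six are the cases in the quoted patch;
 * BCJ_ARM64 and BCJ_RISCV are the assumed additions. */
enum {
  BCJ_X86 = 4,
  BCJ_POWERPC = 5,
  BCJ_IA64 = 6,
  BCJ_ARM = 7,
  BCJ_ARMTHUMB = 8,
  BCJ_SPARC = 9,
  BCJ_ARM64 = 10,  /* assumption: not in the pending toybox code */
  BCJ_RISCV = 11   /* assumption: not in the pending toybox code */
};

/* Hypothetical helper: the same check as the switch in xz_dec_bcj_reset(),
 * pulled out so it can be exercised on its own. */
static enum xz_ret bcj_id_check(int id)
{
  switch (id) {
  case BCJ_X86:
  case BCJ_POWERPC:
  case BCJ_IA64:
  case BCJ_ARM:
  case BCJ_ARMTHUMB:
  case BCJ_SPARC:
  case BCJ_ARM64:
  case BCJ_RISCV:
    return XZ_OK;

  default:
    /* Unsupported Filter ID */
    return XZ_OPTIONS_ERROR;
  }
}
```

Accepting the ID is only the gate. Each new case would also need its own
conversion function next to bcj_x86() before the decoder could actually
undo the filter.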

> (personally, since you have a public domain implementation of this,
> i'd suggest _not_ making any local changes specifically so it's easy
> to just take a new upstream drop whenever.)

Eh, yes and no. It's 3000 lines, and should probably be less than 1000. (My
bzcat implementation is 729 lines, and lib/deflate.c is 540.) If I can shrink it
by half, I should probably do so.

I need to track "git log" from the last known "upstream" version to see what
they _changed_ and apply interesting fixes (like arm64 support, apparently).
Promoting the command should add that step to the toybox release checklist. But
first I need to go back and see what git version got submitted in the first place...

http://lists.landley.net/pipermail/toybox-landley.net/2013-February/013755.html
http://lists.landley.net/pipermail/toybox-landley.net/2013-March/013760.html

commit 971d57ec4a9e14527e7582a5723d9634182d3fa7
Author: Rob Landley <rob at landley.net>
Date:   Fri Mar 15 20:16:25 2013 -0500

    Isaac Dunham took the public domain xz-embedded code and made an xzcat.
    I glued all his files together into one big one and threw it in pending.
    It needs something between cleanup and a complete rewrite.

Probably a bit of bisecting in the archive to identify the version with the
smallest diff...

>> No explicit arm64 or x86-64 (do they compress the same?), but it's still got itanic.
>>
>> Is the optimization worth the incompatibility?
> 
> this is the _decoder_ you're talking about, right? so you don't
> get a choice :-(

If it's been already added upstream, then no I don't.

> (but, yeah, personally i'd not bother with this in the _encoder_. i
> should probably point out that the Android OTA folks disagree --- in
> their world, every byte costs money but cpu time doesn't [because you
> already have to arrange to apply OTAs when the phone isn't in use, and
> avoid making it hot, etc --- they go out of their way to run late,
> slow, and on the charger anyway]. but Android uses xz-embedded and
> xz-java directly anyway, and you cleverly wrote toybox tar to call out
> to external binaries, so not even there.)

It's intentionally modular. It mixes and matches with other implementations so
you don't HAVE to use all of it if you don't want to.

(One glaring exception to this is toybox's tar --xform requiring toybox's sed. I
feel bad about that, but the alternatives I could think of were even uglier.)

>> Rob

Rob

