[Toybox] [PATCH] Clean up xz a good amount

Rob Landley rob at landley.net
Wed Feb 28 10:36:15 PST 2024


On 2/28/24 11:13, enh wrote:
> On Tue, Feb 27, 2024 at 8:34 PM Rob Landley <rob at landley.net> wrote:
>> >  static size_t bcj_x86(struct xz_dec_bcj *s, char *buf, size_t size)
>> > @@ -639,6 +640,20 @@ enum xz_ret xz_dec_bcj_run(struct xz_dec_bcj *s, struct xz_dec_lzma2 *lzma2,
>> >   */
>> >  enum xz_ret xz_dec_bcj_reset(struct xz_dec_bcj *s, char id)
>> >  {
>> > +  switch (id) {
>> > +  case BCJ_X86:
>> > +  case BCJ_POWERPC:
>> > +  case BCJ_IA64:
>> > +  case BCJ_ARM:
>> > +  case BCJ_ARMTHUMB:
>> > +  case BCJ_SPARC:
>> > +    break;
>> > +
>> > +  default:
>> > +    /* Unsupported Filter ID */
>> > +    return XZ_OPTIONS_ERROR;
>> > +  }
>> > +
>> >    s->type = id;
>> >    s->ret = XZ_OK;
>> >    s->pos = 0;
> 
> ah, crap, that's another thing to put on the riscv64 to-do list...
> (thanks for bringing that to light!)

Which is _conceptually_ a bit difficult because it means a new archiver would be
generating archives an old archiver couldn't extract.
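
To make that concrete, here is a minimal sketch of the check from the quoted
patch pulled out into a standalone helper. The name check_filter_id() is
hypothetical, and every enum value except BCJ_POWERPC = 5 (quoted further down)
is an assumption filled in for illustration, as is the "new" ID used in the
usage note.

```c
#include <stddef.h>

enum xz_ret { XZ_OK, XZ_OPTIONS_ERROR };

enum {
  BCJ_X86 = 4,      /* assumed value */
  BCJ_POWERPC = 5,  /* from the source: big endian only */
  BCJ_IA64 = 6,     /* assumed value */
  BCJ_ARM = 7,      /* assumed value */
  BCJ_ARMTHUMB = 8, /* assumed value */
  BCJ_SPARC = 9     /* assumed value */
};

/* Hypothetical helper mirroring the switch in the patch: a decoder built
 * with this list accepts only the filter IDs it was compiled to know. */
static enum xz_ret check_filter_id(int id)
{
  switch (id) {
  case BCJ_X86:
  case BCJ_POWERPC:
  case BCJ_IA64:
  case BCJ_ARM:
  case BCJ_ARMTHUMB:
  case BCJ_SPARC:
    return XZ_OK;

  default:
    /* Unsupported Filter ID: nothing to do but give up on the stream */
    return XZ_OPTIONS_ERROR;
  }
}
```

With this list, check_filter_id(BCJ_POWERPC) returns XZ_OK, while any ID added
after the decoder was built (say 11, a made-up number here) falls through to
XZ_OPTIONS_ERROR.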

Which is a breaking change even if all you want to do is extract a java runtime
that has optional accelerators for various architectures with prebuilt binary
sequences, or a "here's the same binary for all supported architectures" tarball
(such as an archive of the AOSP prebuilts directory). You can't skip past data
you don't understand, so an existing archiver could not extract the new archive,
and there's always the possibility of false positives in arbitrary binary data
going past.
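
For anyone who hasn't looked at what these filters do, here is a toy sketch of
the general idea, assuming a deliberately simplified rule: treat every 0xE8 byte
as an x86 "call rel32" and turn its 32-bit little endian operand from relative
into absolute. This is NOT the bcj_x86() in the patch (the real one is pickier
about what it rewrites), and toy_bcj() is a made-up name.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy branch filter, not the real xz algorithm. encode != 0 converts
 * relative operands to absolute, encode == 0 converts them back. */
static void toy_bcj(uint8_t *buf, size_t size, int encode)
{
  size_t i = 0;

  while (i + 5 <= size) {
    uint32_t v, pos;

    if (buf[i] != 0xE8) {
      i++;
      continue;
    }

    /* Read the 4 byte little endian operand after the opcode */
    v = (uint32_t)buf[i+1] | ((uint32_t)buf[i+2] << 8)
      | ((uint32_t)buf[i+3] << 16) | ((uint32_t)buf[i+4] << 24);

    /* Offset of the byte following this 5 byte instruction */
    pos = (uint32_t)(i + 5);
    v = encode ? v + pos : v - pos;

    buf[i+1] = v & 0xff;
    buf[i+2] = (v >> 8) & 0xff;
    buf[i+3] = (v >> 16) & 0xff;
    buf[i+4] = (v >> 24) & 0xff;

    /* Skip the operand so rewritten bytes are never treated as opcodes */
    i += 5;
  }
}
```

Feed it { 0xE8, 'd', 'a', 't', 'a' } and the "data" gets rewritten too, which is
the false positive. In this toy both directions visit the same offsets, so
decoding undoes it, but only for a decoder that knows the filter.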

I note there are no m68k, s390, hexagon, mips/loongson, or superh optimized
binary compressors in the above list. IBM made a big push to establish a little
endian powerpc64 ABI like 10 years ago, and yet:

>>     BCJ_POWERPC = 5,    /* Big endian only */

No explicit arm64 or x86-64 (do they compress the same?), but it's still got itanic.

Is the optimization worth the incompatibility?

Rob

