[Toybox] [PATCH] Clean up xz a good amount

enh enh at google.com
Wed Feb 28 11:03:27 PST 2024


On Wed, Feb 28, 2024 at 10:28 AM Rob Landley <rob at landley.net> wrote:
>
> On 2/28/24 11:13, enh wrote:
> > On Tue, Feb 27, 2024 at 8:34 PM Rob Landley <rob at landley.net> wrote:
> >> >  static size_t bcj_x86(struct xz_dec_bcj *s, char *buf, size_t size)
> >> > @@ -639,6 +640,20 @@ enum xz_ret xz_dec_bcj_run(struct xz_dec_bcj *s, struct xz_dec_lzma2 *lzma2,
> >> >   */
> >> >  enum xz_ret xz_dec_bcj_reset(struct xz_dec_bcj *s, char id)
> >> >  {
> >> > +  switch (id) {
> >> > +  case BCJ_X86:
> >> > +  case BCJ_POWERPC:
> >> > +  case BCJ_IA64:
> >> > +  case BCJ_ARM:
> >> > +  case BCJ_ARMTHUMB:
> >> > +  case BCJ_SPARC:
> >> > +    break;
> >> > +
> >> > +  default:
> >> > +    /* Unsupported Filter ID */
> >> > +    return XZ_OPTIONS_ERROR;
> >> > +  }
> >> > +
> >> >    s->type = id;
> >> >    s->ret = XZ_OK;
> >> >    s->pos = 0;
> >
> > ah, crap, that's another thing to put on the riscv64 to-do list...
> > (thanks for bringing that to light!)
>
> Which is _conceptually_ a bit difficult because it means a new archiver would be
> generating binaries an old archiver couldn't extract.

judging by the code snippet above, that's already happened to you ---
where's your arm64 case?

(personally, since you have a public domain implementation of this,
i'd suggest _not_ making any local changes specifically so it's easy
to just take a new upstream drop whenever.)

> Which is a breaking change even if all you want to do is extract a java runtime
> that has optional accelerators for various architectures with prebuilt binary
> sequences, or a "here's the same binary for all supported architectures" tarball
> (such as an archive of the AOSP prebuilts directory). Can't skip past data you
> don't understand, so an existing archiver could not extract the new archive, and
> there's always the possibility for false positives in arbitrary binary data
> going past.
>
> I note there's no m68k, s390, hexagon, mips/loongson, or superh optimized binary
> compressors in the above list. IBM made a big push to establish a little endian
> powerpc64 ABI like 10 years ago, and yet:
>
> >>     BCJ_POWERPC = 5,    /* Big endian only */
>
> No explicit arm64 or x86-64 (do they compress the same?), but it's still got itanic.
>
> Is the optimization worth the incompatibility?

this is the _decoder_ you're talking about right, right? so you don't
get a choice :-(

(but, yeah, personally i'd not bother with this in the _encoder_. i
should probably point out that the Android OTA folks disagree --- in
their world, every byte costs money but cpu time doesn't [because you
already have to arrange to apply OTAs when the phone isn't in use, and
avoid making it hot, etc --- they go out of their way to run late,
slow, and on the charger anyway]. but Android uses xz-embedded and
xz-java directly anyway, and you cleverly wrote toybox tar to call out
to external binaries, so not even there.)

> Rob


More information about the Toybox mailing list