[Toybox] [PATCH] Clean up xz a good amount

Oliver Webb aquahobbyist at proton.me
Wed Feb 28 15:03:22 PST 2024


On Wednesday, February 28th, 2024 at 13:51, Rob Landley <rob at landley.net> wrote:
> On 2/28/24 13:49, Rob Landley wrote:
>
> > I need to track "git log" from the last known "upstream" version to see what
> > they changed and apply interesting fixes (like arm64 support, apparently).
> > Promoting the command should add that step to the toybox release checklist. But
> > first I need to go back and see what git version got submitted in the first place...
> >
> > http://lists.landley.net/pipermail/toybox-landley.net/2013-February/013755.html
> > http://lists.landley.net/pipermail/toybox-landley.net/2013-March/013760.html
> >
> > commit 971d57ec4a9e14527e7582a5723d9634182d3fa7
> > Author: Rob Landley rob at landley.net
> > Date: Fri Mar 15 20:16:25 2013 -0500
> >
> > Isaac Dunham took the public domain xz-embedded code and made an xzcat.
> > I glued all his files together into one big one and threw it in pending.
> > It needs something between cleanup and a complete rewrite.
> >
> > Probably a bit of bisecting in the archive to identify the version with the
> > smallest diff...
>
>
> Checking the git log against the submission date it's almost certainly one of these:
>
> commit 94d107ea0ce2772359ee7d09041abd920ec8b8bb
> Author: Lasse Collin lasse.collin at tukaani.org
>
> Date: Mon Apr 15 19:42:17 2013 +0300
>
> Add support for MSVC in xz_config.h.
>
> Thanks to Luke Deller for the original patch.
>
> commit 25a0224510ba143251e6df122b649b3b3b0b0257
> Author: Lasse Collin lasse.collin at tukaani.org
>
> Date: Wed Feb 27 09:34:49 2013 +0200
>
> Document integrity check #defines in README.
>
> commit 0568cfabccc8a23b4d4a23266b39bf14134df434
> Author: Lasse Collin lasse.collin at tukaani.org
>
> Date: Wed Feb 27 09:28:55 2013 +0200
>
> Add optional support for CRC64.
>
> commit e111c275da8b88749dc9cc8d2adce9872a611b89
> Author: Lasse Collin lasse.collin at tukaani.org
>
> Date: Sat Mar 31 10:39:09 2012 +0300
>
> Mention xzminidec.c in README.
>
>
> Most likely it's the 2012 commit with the one year gap after it. It was
> submitted in march so the April commit at the top happened after the submission,
> and the integrity check #defines commit only touched the README, so the question
> is does the submission have CRC64 or not... and yes it does.
>
> So it came from https://git.tukaani.org/xz-embedded.git commit 0568cfabccc8.
> Which means:
>
> $ git log | grep '^commit' | sed '/0568cfabccc8/Q' | wc -l
> 31

$ git clone https://git.tukaani.org/xz-embedded.git/
[...]
$ cd xz-embedded && git log | grep '^commit' | sed '/0568cfabccc8/Q' | wc -l
36
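
Side note: the Q command in that sed is a GNU extension, so here is a minimal awk sketch of the same count for systems without GNU sed. The name count_until is made up for illustration; it only assumes "git log" output on stdin.

```shell
# count_until PREFIX (hypothetical helper): read "git log" output on stdin
# and count the "commit <hash>" header lines that come before the first
# commit whose hash starts with PREFIX.
count_until() {
  awk -v p="$1" '
    /^commit / && index($2, p) == 1 { exit }  # reached the target, stop reading
    /^commit /                      { n++ }   # an earlier (newer) commit header
    END                             { print n + 0 }
  '
}
```

Usage would be "git log | count_until 0568cfabccc8". Something like "git rev-list --count 0568cfabccc8..HEAD" is another option, though it counts commits not reachable from that hash rather than lines of log output, so the two can disagree if the history has merges.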

> 31 commits to examine for interesting updates. I'll throw it on the todo heap,
> unless Oliver wants to take that one. :)

$ git log | grep "^commit" | sed '/0568cfabccc8/Q'| cut -d' ' -f2 | xargs -I {} git show {}

commit 1b6defd544914bfb4065e343296e5db64ef400e6: Documentation Update; Irrelevant
commit f6d1f58f36cdcc55cbb3599048629f32a5ecb914: Update email address, might wanna change the command header
commit dc835be6a58a097eefc66dac34be7ba7cfbd8719: More documentation, irrelevant
commit 7968516901ab8e02c6d5c97574d2803c56c12489: .gitattributes; Irrelevant
commit f2090365d4cdc86020beb1ac8aea43444cf2456c: README change; Irrelevant
commit a5390fd368f8a58401c0ce0dfb9d05ef3046e4a3: First change of code:
 #      if defined(XZ_DEC_X86) || defined(XZ_DEC_POWERPC) \
                        || defined(XZ_DEC_IA64) \
                        || defined(XZ_DEC_ARM) || defined(XZ_DEC_ARMTHUMB) \
-                       || defined(XZ_DEC_SPARC)
+                       || defined(XZ_DEC_SPARC) || defined(XZ_DEC_ARM64)
To: "Fix the build when only the ARM64 filter is enabled."
It's for a removed ifdef though; Irrelevant
commit 55c8039c7ff3590671131158c0ef55aa65d826c9: Removes a duplicate in an ifdef that I removed; Irrelevant
commit d89ad8130128d71c773f5e50e356562a506f843e: Makefile CFLAGS Update for ARM64 BCJ
commit c66c890bc837a4f522c8961b18badafbd51e1f2e: README update on how to make the kernel module; Irrelevant
commit 961d094e9242b665ce2444bca5f6f3a2d07c46ae: Kconfig change, "mostly reverts 567636fb219937cec273ba15f92e635f5b84cd4e"; Irrelevant
commit 89094f05f02bcd381511f252b6b4d73db1d70f12: (!!!) Adds the ARM64 BCJ filter
commit 3f438e15109229bb14ab45f285f4bff5412a9542: Typos in comments; Irrelevant
commit c61e095215ead4506c7ce775110baf9854d481a3: More typo fixing; Irrelevant
commit 8f3ed8b1759abe53ff21f6d9eee1b341e8540e8e: (!!!) Adds a MicroLZMA decoder. From the commit log:
    MicroLZMA is a yet another header format variant where the first
    byte of a raw LZMA stream (without the end of stream marker) has
    been replaced with a bitwise-negation of the lc/lp/pb properties
    byte. MicroLZMA was created to be used in EROFS but can be used
    by other things too where wasting minimal amount of space for
    headers is important.

    This is implemented using most of the LZMA2 code as is so the
    amount of new code is small. The API has a few extra features
    compared to the XZ decoder. On the other hand, the API lacks
    XZ_BUF_ERROR support which is important to take into account
    when using this API.

    MicroLZMA doesn't support BCJ filters. In theory they could be
    added later as there are many unused/reserved values for the
    first byte of the compressed stream but in practice it is
    somewhat unlikely to happen due to a few implementation reasons.
commit 03d0415b7a4a3616e820e08f39f5309d6d32047b: Seems irrelevant for toybox, since we use unsigned chars:
    "This might matter, for example, if the underlying type of
    enum xz_check was a signed char. In such a case the validation
    wouldn't catch an unsupported header."
commit 8122033d26644f970ca192466487218c06a1011e: Moves a variable initialization; Not important
commit 41e657bfaa84cde5907020b7032d58f9245fc26b: Typo; Irrelevant
commit 6f0e0c41e3682254c2e0be245f275f77df821ffe: (!!!) "Add xz_dec_catrun() to support concatenated .xz files."
commit d8a12bc0c61282b38439ee76b05dbde0200002e1: Adding -Wno-long-long to CFLAGS; Irrelevant
commit ef038b9db55bba73e2574ae451d62e16ce9c0ef9: C89 Compat: (Probably) Irrelevant
commit 82078b6109122ede1f76b76e75e54dcea7fc8d25: Detect read errors from stdin. Nice but not too important
commit 090e6a054d6283b144d20f5783852b95eade90ee: Documentation (s/http/https/g); irrelevant
commit 49443879b6ae9e54aabfb3e89274969cd8d8a12e: Typo; Irrelevant
commit cfc1499e9fc23d8caa6dfdf1cc3ccf60d6fcd947: "Avoid overlapping memcpy() with invalid input with in-place decompression.":
    With valid files, the safety margin described in lib/decompress_unxz.c
    ensures that these buffers cannot overlap. But if the uncompressed size
    of the input is larger than the caller thought, which is possible when
    the input file is invalid/corrupt, the buffers can overlap. Obviously
    the result will then be garbage (and usually the decoder will return
    an error too) but no other harm will happen when such an over-run occurs.

    This change only affects uncompressed LZMA2 chunks and so this
    should have no effect on performance.
commit 40d291b609d0cc6344f3e26ed34b4fd755e403da: "Fix XZ_DYN_ALLOC to avoid useless memory reallocations."
commit 525549dce62c134ebe26255deab0eb795d92599d: .gitignore; Irrelevant 
commit e72590b57929619d29728f43d8e718665078ea66: Build Process change; Irrelevant
commit 79b68de5657beecfad575578a7181cf6fca869cb: Comments: Annotation of source code flow; Irrelevant
commit a2db597faff76c89d0ba419e2217a90c18d94d2b: Comments: Annotation of source code flow; Irrelevant
Also:
    xzembed: add fallthrough annotations to fix build with GCC7
Comments affect control flow? Isn't the point of comments that they do NOTHING?
commit e75f4eb79165213a02d567940d344f5c2ff1be03: Test suite change; Irrelevant 
commit df9d444ef33010ba63ee4a46bf480c3e5d4b6db5: "Add a new line after error message"; Nice but Not Important
commit ee2a443c79afbff48291f133acd42d95861cf721: Comments: Annotation of source code flow; Irrelevant
commit 6a8a2364434763a033781f6b2a605ace9a021013: Typo fix, Irrelevant
commit 567636fb219937cec273ba15f92e635f5b84cd4e: Kconfig change, Irrelevant 
commit 94d107ea0ce2772359ee7d09041abd920ec8b8bb: "Add support for MSVC in xz_config.h." Some ifdefs
commit 25a0224510ba143251e6df122b649b3b3b0b0257: README; Irrelevant
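
Most of the "Irrelevant" entries above only touch docs or build files, so a first pass could be mechanical. A rough sketch, assuming the input comes from "git log --name-only --format='commit %h %s'"; the function name code_commits is hypothetical:

```shell
# code_commits (hypothetical helper): read the output of
#   git log --name-only --format='commit %h %s'
# on stdin and print only the commit lines whose file list includes
# at least one .c or .h file.
code_commits() {
  awk '
    function flush() { if (hdr != "" && code) print hdr }
    /^commit / { flush(); hdr = $0; code = 0; next }  # start of a new commit
    /\.[ch]$/  { code = 1 }                           # this commit touches C source
    END        { flush() }                            # do not forget the last commit
  '
}
```

Usage would be "git log --name-only --format='commit %h %s' 0568cfabccc8..HEAD | code_commits". Comment-only typo fixes still touch .c and .h files, so this narrows the list rather than replacing a read through the diffs.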


Here are the important ones:
commit 89094f05f02bcd381511f252b6b4d73db1d70f12: Adds the ARM64 BCJ filter
commit 8f3ed8b1759abe53ff21f6d9eee1b341e8540e8e: Adds a MicroLZMA decoder
commit 6f0e0c41e3682254c2e0be245f275f77df821ffe: "Add xz_dec_catrun() to support concatenated .xz files."

And the ones that aren't so obviously important but notable anyways:
commit f6d1f58f36cdcc55cbb3599048629f32a5ecb914: Update email address, might wanna change the command header
commit 82078b6109122ede1f76b76e75e54dcea7fc8d25: Detect read errors from stdin
commit cfc1499e9fc23d8caa6dfdf1cc3ccf60d6fcd947: "Avoid overlapping memcpy() with invalid input with in-place decompression."
commit 40d291b609d0cc6344f3e26ed34b4fd755e403da: "Fix XZ_DYN_ALLOC to avoid useless memory reallocations."
commit df9d444ef33010ba63ee4a46bf480c3e5d4b6db5: "Add a new line after error message"
commit 94d107ea0ce2772359ee7d09041abd920ec8b8bb: "Add support for MSVC in xz_config.h." Some ifdefs

-   Oliver Webb <aquahobbyist at proton.me>

