[Toybox] Working on unxz-some questions
Rob Landley
rob at landley.net
Sun Mar 3 19:45:27 PST 2013
On 03/01/2013 01:10:36 AM, Isaac Dunham wrote:
> I'm looking into adding an unxz based on xz-embedded, which is public
> domain.
Cool!
I noticed this recently (due to the busybox thread about it) and was
pondering the same myself. I downloaded the git repo but am not going
to have time to look at it any time soon, so I'm happy somebody else is
taking a look at it. :)
> However, I'm wondering about some things.
> Basically, I get the impression that some (most? all?) of the
> compile-time options may not be reasonable.
Toybox's primary design goal is simplicity. Complexity is a limited
resource that we spend on implementing features, increasing speed, and
reducing size, but everything we do has to be worth the complexity cost.
> 1) xz allows several filters to improve compression of executables
> (BCJ filters).
> Should all of these be turned on unconditionally, or should it be
> user-selectable?
> The native BCJ filter for each arch is probably necessary for
> compatibility reasons, but I'm wondering about alternative ones
> (eg, should we enable sparc BCJ filters everywhere?)
On kernel.org there are tar.gz files, tar.bz2 files, and tar.xz files.
Our decompressor has to handle all of that.
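
For illustration only, here is a minimal sketch of how a decompressor front end could tell the three formats apart from their leading magic bytes. The enum and function names are hypothetical, not anything in toybox; the magic values are the standard ones (1f 8b for gzip, "BZh" for bzip2, fd 37 7a 58 5a 00 for xz).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names, for illustration only. */
enum comp_type { COMP_NONE, COMP_GZIP, COMP_BZIP2, COMP_XZ };

/* Guess the compression format from the first bytes of a file. */
enum comp_type detect_compression(const unsigned char *buf, size_t len)
{
  /* The literal is split so "\xfd" is not parsed as the hex escape \xfd7. */
  if (len >= 6 && !memcmp(buf, "\xfd" "7zXZ\0", 6)) return COMP_XZ;
  if (len >= 3 && !memcmp(buf, "BZh", 3)) return COMP_BZIP2;
  if (len >= 2 && buf[0] == 0x1f && buf[1] == 0x8b) return COMP_GZIP;

  return COMP_NONE;
}
```

Sniffing the header rather than trusting the file extension means a misnamed tarball still gets handed to the right decompressor.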
On the compression side, we've got a quick streaming compressor already
(gzip) which gets the low hanging fruit of compression and is going to
be faster than anything else (fits in L1 cache a lot of the time), so I
believe the main advantage of xz is _better_ compression? (Correct me
if I'm wrong here, I don't use it much...)
I agree that 8 gazillion knobs isn't really what toybox is good at.
> 2) I assume that CRC64 support should be unconditional. Upstream
> recently added crc64, but it's optional there.
Compatibility with existing and future data files is the important
thing.
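
To give a sense of how little code unconditional CRC64 support costs, here is a bit-at-a-time sketch of the CRC64 variant the .xz format uses (the ECMA-182 polynomial in reflected form, with initial value and final XOR of all ones). This is not the xz-embedded implementation, which as far as I know is table-driven; the function name is made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: slow but tiny CRC64 as used by .xz files. */
uint64_t crc64_xz(const void *buf, size_t len)
{
  const unsigned char *p = buf;
  uint64_t crc = ~(uint64_t)0;
  int i;

  while (len--) {
    crc ^= *p++;
    /* Reflected ECMA-182 polynomial, one bit per iteration. */
    for (i = 0; i < 8; i++)
      crc = (crc >> 1) ^ (0xC96C5795D7870F42ULL & -(crc & 1));
  }

  return ~crc;
}
```

A lookup table would trade about 2k of data for speed, which is the usual size versus speed call.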
> 3) Should unsupported integrity checks be ignored, cause an error, or
> should this be a compile-time option?
On the compressor side or on the decompressor side?
On the decompressor side I'd probably just ignore them. We're going to
have at least crc32, right? And then tar will internally have some
basic "this is not a valid tar file" check...
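
A rough sketch of what "just ignore them" could look like on the decompressor side: the .xz stream header carries a Check ID from 0 to 15, and the size of the check field follows from the ID, so an unknown check can be skipped without understanding it. The names below are hypothetical, and the supported set (none, CRC32, CRC64) is an assumption about what we would end up building in.

```c
/* Check IDs from the .xz format; these names are made up for the sketch. */
enum { CHECK_NONE = 0, CHECK_CRC32 = 1, CHECK_CRC64 = 4, CHECK_SHA256 = 10 };

/* Size in bytes of the check field for a Check ID, or -1 if invalid.
 * ID 0 has no field; IDs 1-3 use 4 bytes, 4-6 use 8, and so on up to 64. */
int xz_check_size(int id)
{
  if (id < 0 || id > 15) return -1;

  return id ? 4 << ((id - 1) / 3) : 0;
}

/* Assumption: only these checks are compiled in. Anything else gets its
 * check field skipped (using xz_check_size) instead of failing. */
int xz_check_supported(int id)
{
  return id == CHECK_NONE || id == CHECK_CRC32 || id == CHECK_CRC64;
}
```

With this shape, a file using SHA-256 still decompresses: the decoder reads past the 32 byte check field and reports nothing about it.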
> I'm assuming that even if we can't check, we should still decompress.
Doing the best we can to work with the input we're given, yes.
> Also, (assuming that at least one of the above should be
> configurable) should the xz library part be configurable separately
> from the unxz command? This is mainly relevant for if you plan to use
> it to decompress for tar et al.
Hmmm... that's the kind of thing we can clean up later (don't have to
decide right now). Just do the xz command(s) and I'll wire it up to tar
when I get around to doing tar. :)
(It's quite possible the right thing for tar is to just shell out to xz
from the $PATH and pipe stuff through an external command, and if that
command is internal then fork() and xexec() will do the right thing
anyway. The reason this is the right thing is both simplicity of
implementation and because SMP is pretty ubiquitous these days and two
processes are SMP-friendly. If somebody wants to wire this into a
u-boot with no scheduler, they can do it themselves.)
> Is there a way to conditionally compile code in lib/?
Not yet. In theory the gc-sections stuff is dropping out unused code,
so it gets built but not included into the final binary.
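
As a sketch of what that looks like in practice (the file contents and function names here are invented for the example, and the flags in the comment are the usual GCC/binutils ones rather than a copy of toybox's build script):

```c
/* Build sketch, assuming GCC and GNU ld:
 *   cc -ffunction-sections -fdata-sections -c lib.c main.c
 *   cc -Wl,--gc-sections -o demo lib.o main.o
 *
 * Each function lands in its own section, so the linker can throw away
 * unused_helper() if nothing in the final binary references it. */

/* Called from a command, so its section is kept. */
int used_helper(int x)
{
  return x * 2;
}

/* Compiled every time, but eligible to be dropped at link time. */
int unused_helper(int x)
{
  return x * 3;
}
```

The cost is compile time, not binary size: everything in lib/ still gets built even when no enabled command uses it.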
In practice, I probably need to redo the build system because the gcc
guys decided that their compiler was just too horrible to make
build-at-once mode actually work, so they save the intermediate parse
results into special ELF sections and then unload the actual code
generation onto the linker, which is called link time optimization and
is a horrible solution. So the "cc *.c" approach I've been doing
doesn't take advantage of SMP and won't because the gcc developers are
incompetent, and I need to work around them (or see if llvm is better).
So for right now, don't worry about it. Just add the file to lib and if
the build gets uncomfortably slow I'll improve it later.
Rob