[Toybox] [New toy] rudimentary cpio version...

Rob Landley rob at landley.net
Sun Oct 27 17:23:24 PDT 2013


On 10/14/2013 08:51:34 PM, ibid.ag at gmail.com wrote:
> On Mon, Oct 14, 2013 at 11:38:13AM -0500, Rob Landley wrote:
> > On 09/30/2013 10:50:13 PM, ibid.ag at gmail.com wrote:
> > >I've finally gotten 'cpio' into a shape where it could be useable.
> >
> > And I've finally gotten through my email to the point you submitted
> > this. (I see an update from the web archive, I can check that in as
> > a delta against this.)
> There should be 3 versions in the archive:
> -the first one I sent (this), which was a case of attaching the wrong file
> -the second one I sent, which was the right file:
> http://lists.landley.net/pipermail/toybox-landley.net/2013-September/001387.html
> http://lists.landley.net/pipermail/toybox-landley.net/attachments/20130930/a4c67169/attachment.c
> -the last update, which fixes an issue with symlinks and special files
> that we can stat but not read (thereby losing a couple lines of code):
> http://lists.landley.net/pipermail/toybox-landley.net/2013-October/001408.html
> http://lists.landley.net/pipermail/toybox-landley.net/attachments/20131012/768acc12/attachment.c

I merged that one. That's the one where the description said it reduced  
malloc usage and I couldn't spot how, but being able to handle  
unreadable dev nodes and such is nice.

> >
> > >This version can archive and extract directories, sockets, FIFOs,
> > >devices,
> > >symlinks, and regular files.
> > >Supported options are -iot, -H FMT (which is a dummy right now).
> >
> > I want to tweak the help text but will hold off until I catch up to
> > the newer message about this...
> >
> > >It only writes newc, and could read newc or newcrc.
> >
> > What's "newcrc"?
> 
> newcrc was supposed to be newc + crc (ie, the c_check field contains a
> checksum instead of "00000000").
> But they messed up and filled it with a count of nonzero bits, which is
> pretty weak.
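
For reference, a minimal sketch of reading the c_check field out of a header. It assumes the commonly documented newc layout (a 6-byte ASCII magic followed by thirteen 8-digit hex fields, 110 bytes in total, with "070701" for newc and "070702" for the crc variant). The helper names are made up for illustration and are not taken from the submitted cpio.c.

```c
#include <stdlib.h>
#include <string.h>

#define NEWC_HDR_LEN 110  /* 6-byte magic + 13 fields of 8 hex digits */

/* Hypothetical helper: parse the 8-digit hex field at the given index
 * (0 = c_ino ... 12 = c_check). */
static unsigned long newc_field(const char *hdr, int index)
{
  char buf[9];

  memcpy(buf, hdr + 6 + 8*index, 8);
  buf[8] = 0;
  return strtoul(buf, NULL, 16);
}

/* Returns 0 for plain newc ("070701"), 1 for the crc variant ("070702"),
 * -1 if the magic is neither. On success *check receives c_check. */
int newc_check_field(const char *hdr, unsigned long *check)
{
  int crc;

  if (!memcmp(hdr, "070701", 6)) crc = 0;
  else if (!memcmp(hdr, "070702", 6)) crc = 1;
  else return -1;

  *check = newc_field(hdr, 12);  /* c_check is the 13th (last) field */
  return crc;
}
```

A reader that only cares about newc can ignore the returned value for "070701" archives, since the field is all zeros there.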

That's... impressive.

(Generally the compression wrapper handles data integrity, but... dude.)

> >
> > >This does NOT implement -d, which essentially is equivalent to
> > >mkdir -p $(dirname $FILE)
> > >for every file that needs it.
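
As a rough illustration of what -d would have to do, here is a standalone sketch of the "mkdir -p $(dirname $FILE)" step in plain POSIX C. The function name make_parent_dirs is hypothetical, and this is not the toybox library code; it only shows the walk over the path components.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: create every missing parent directory of path,
 * like mkdir -p "$(dirname "$path")". The final component is never
 * created. Returns 0 on success, -1 on error. */
int make_parent_dirs(const char *path, mode_t mode)
{
  size_t len = strlen(path);
  char *buf, *s;
  int rc = 0;

  if (!len) return 0;
  buf = malloc(len + 1);
  if (!buf) return -1;
  memcpy(buf, path, len + 1);

  /* Terminate the copy at each '/' after the first character and create
   * that prefix; an already existing prefix is not an error. */
  for (s = buf + 1; *s; s++) {
    if (*s != '/') continue;
    *s = 0;
    if (mkdir(buf, mode) && errno != EEXIST) {
      rc = -1;
      break;
    }
    *s = '/';
  }
  free(buf);
  return rc;
}
```

An extractor would call this just before creating each entry, so a name with no '/' in it costs one pass over the string and no mkdir() calls.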
> >
> > Let's see, the examples I gave in ramfs-rootfs-initramfs.txt are:
> >
> >   cpio -i -d -H newc -F initramfs_data.cpio --no-absolute-filenames
> >
> >   find . | cpio -o -H newc | gzip
> >
> > It would be nice to get that example working, although...
> > --no-absolute-filenames? really? There's no short for that? (I
> > expect just NEVER doing absolute filenames is the right thing, and
> > if you _want_ that you cd to the root directory yourself and extract
> > it from there...)
> 
> The quick way to do that is to increment name until name[0] != '/'.
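
Spelled out, the "increment name until name[0] != '/'" idea is only a couple of lines. The function name here is hypothetical:

```c
/* Hypothetical helper: skip leading slashes so "/etc/passwd" extracts
 * as "etc/passwd" relative to the current directory. */
const char *strip_leading_slashes(const char *name)
{
  while (*name == '/') name++;
  return name;
}
```

Note this only handles absolute names; it does nothing about "../" components inside the archive, which would need a separate check.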
> 
> > >Hard links are not supported, though it would be easy to add them
> > >given
> > >a hash table or something like that.
> >
> > I've been thinking of doing that for tar and cp and such, and
> > somebody submitted code that was doing something like that
> > already... (Sigh, pending is getting overwhelming, I don't remember
> > where.)
> 
> That was tcpsvd.

Actually I was referring to:

http://lists.landley.net/pipermail/toybox-landley.net/2013-September/001315.html

But I'd rather try to use the builtin select timeout than use alarm  
unnecessarily and layer lots of signal handling on top to try to make  
it "generic" when all the callers are going to be our code anyway...

> > My first guess at this is to do a simple linked list (scales well to
> > a couple thousand entries) and then upgrade it to a balanced binary
> > tree of some kind later. (Hash tables tend to need manual tuning for
> > good performance, hash function and number of buckets and so on.)
> >
> > But for right now we can leave it as a todo item.
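
A minimal sketch of the simple linked list approach, assuming hard links are detected by matching (device, inode) pairs. The struct and function names are invented for illustration; nothing like this is in the merged code yet.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical record of a file already seen, keyed by (device, inode). */
struct seen_inode {
  struct seen_inode *next;
  dev_t dev;
  ino_t ino;
  char *name;
};

/* Look up (dev, ino). If it was seen before, return the name it was first
 * recorded under so the caller can link() to it. Otherwise remember name
 * and return NULL. Lookup is a linear scan, so O(n) per file. If memory
 * runs out the entry is simply not recorded and NULL is returned. */
const char *seen_before(struct seen_inode **list, dev_t dev, ino_t ino,
  const char *name)
{
  struct seen_inode *s;
  size_t len;

  for (s = *list; s; s = s->next)
    if (s->dev == dev && s->ino == ino) return s->name;

  len = strlen(name) + 1;
  s = malloc(sizeof(*s));
  if (!s) return NULL;
  s->name = malloc(len);
  if (!s->name) {
    free(s);
    return NULL;
  }
  memcpy(s->name, name, len);
  s->dev = dev;
  s->ino = ino;
  s->next = *list;
  *list = s;
  return NULL;
}
```

Only files with a link count above 1 need to go through this, which keeps the list short for typical trees. Swapping the list for a tree later would not change the caller.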
> >
> > >I also have not implemented the "<n> blocks" output on stderr.
> > >If desired, I can add it pretty simply.
> >
> > Sigh. Is there any sort of spec on this? SUSv4 utilities doesn't
> > list cpio or tar, and
> > http://pubs.opengroup.org/onlinepubs/9699919799/utilities/pax.html
> > is a "Jorg Schilling is on the posix committee" level Solaris
> > disaster.
> >
> > I'd guess it's desirable, though. (This might be a place where
> > looking at the busybox --help output is justified to see what subset
> > they chose. :)
> 
> OK, it's in LSB.
> LSB 4.1 says "See SUSv2"
> http://pubs.opengroup.org/onlinepubs/7908799/xcu/cpio.html
> Which lists:
> cpio -o[aBcv]
> cpio -i[Bcdmrtuvf] [pattern ...]
> cpio -p[adlmuv] directory

So there -was- a cpio standard, and in response to the kernel guys  
adding initramfs based on cpio in December 2001, posix removed the  
command from the 2003 standard.

*slow clap*

Bravo, posix. Bravo.

> cpio -p is a fancy way of doing cp -R, so let's ignore it.
> That leaves aBcdmrtuvf
> of those, -a changes atime (why?), -c is a nop or means adding
> old binary/old character format.
> -B is blocksize (we'd ignore it), so rf are the meaningful ones beyond
> what busybox has.

The meaningful use cases I'm aware of are:

1) initramfs
2) rpm

I

> Busybox has the "<n> blocks" and also implements -dmvuF.
> -F is
>   if (TT.fname) {
>     close(toys.optflags & (FLAG_i|FLAG_t) ? 0 : 1);
>     xopen(TT.fname, toys.optflags & FLAG_o ? O_CREAT|O_WRONLY|O_TRUNC
>       : O_RDONLY);
>   }
> 
> -d needs the logic from mkdir -p
> -m would be close to touch -m in logic

The -m stuff's reasonably generic in the library already.

> -v is pretty simple.
> -u is the current behavior; I'd need to check before creating if it
> isn't set...

I've added ignored dummy flags before. If you write a diff that only  
does -u, then accepting -u doesn't hurt anything. :)

> > >There is one assumption this makes: that the mode of a file, as
> > >mode_t,
> > >is bitwise equivalent to the mode as defined for the cpio format.
> > >This is true of Linux, but is not mandated by POSIX.
> >
> > Posix does actually mandate an octal representation for mode bits,
> > as a table in the chmod command line description:
> >
> > http://pubs.opengroup.org/onlinepubs/9699919799/utilities/chmod.html
> >
> > It doesn't say the OS has to use those _internally_, but everybody
> > does because unix version 7 circa 1979 did.
> 
> > Merged it in pending here, trying to catch up on email...
> 
> Thanks,
> Isaac Dunham

still catching up...

Rob