[Toybox] archivers

ibid.ag at gmail.com
Tue Dec 3 22:03:50 PST 2013


On Sun, Dec 01, 2013 at 01:50:40PM -0600, Rob Landley wrote:
> This is quite possibly stale, but just in case, lemme finish this
> half-reply from September:

Yes, it's stale, but some of it still pertains (loopfiles_stdin).
The email was about the first draft/WIP cpio code.

> On 09/13/2013 12:42:19 AM, Isaac wrote:
> >On Mon, Aug 12, 2013 at 12:57:48PM -0500, Rob Landley wrote:
> >> On 08/12/2013 12:30:05 AM, Isaac wrote:
> >> >On Sun, Aug 11, 2013 at 10:07:49PM -0500, Rob Landley wrote:
> >> >> On 08/11/2013 07:42:37 PM, Isaac wrote:
> >> >> >I also have some code here that *should* handle writing a member
> >> >> >of a cpio
> >> >> >newc archive, loopfiles_stdin(), and a general idea of how to
> >> >proceed.
> >> >> >Don't expect much soon.
> >
> >Right now I have code that will write a newc archive,
> >but nothing to read it.
 
> Will the kernel's initramfs stuff extract it? That's the biggest
> real-world user.

Yes. I admit I just tested that.

In the process, I've noticed/remembered that:
- toybox doesn't have openvt yet, so I may work on that.
- the init version we have is pretty much like busybox init in behavior,
  and even more poorly documented, so I wrote an example inittab, to be
  sent sometime soon.
- syslogd, dhcpd, and possibly a couple of other daemons/tools have
  undocumented configuration files.
- the "mount" version I've hacked out of Ashwini Sharma's code is
  688 lines, so possibly of a size to be interesting. (Yes, more than half
  was the NFS code.)
- some sort of editor is still lacking, but that can wait.




> There's also the stanza I put in ramfs-rootfs-initramfs.txt in the
> kernel documentation:
> 
>   cpio -i -d -H newc -F initramfs_data.cpio --no-absolute-filenames

Ah yes. 
I have some code that theoretically would handle
--no-absolute-filenames.
Lovely what that long option does to the preprocessor ;).

> >(I note that loopfiles_stdin() is currently not as robust as is
> >desirable;
> >it handles at most 4094-char lines because it fgets() into toybuf,
> >needs a null terminator and a newline, and removes the newline.
> >If there's no newline in a 4095-byte read, it ignores the line
> >and starts munching bytes until it hits a \n or the end of file.
> >But for regular systems where you rarely have a path
> >more than 80 chars long, it's...useable.)
> 
> Yeah, that needs fixing. Although PATH_MAX is 4096 on Linux (and
> despite the openat() stuff that's still in use in some places).

Still needs fixing. Having not run into paths of 4095+ bytes yet, I've
had neither much reason to fix it nor opportunity to test, and I'm not
certain what the best route is.
Some options I can think of:

A - skip the line loudly
(fprintf(stderr,"%s",toybuf) until we have the newline)

B - xmalloc() an 8k buffer we use in place of toybuf,
and figure that if they use a path that's twice PATH_MAX
we can skip it loudly.

C - xmalloc() a larger buffer when 4095 bytes isn't enough
(but if we guess low, still fail)

D - just loop and increase the buffer size for each iteration.
It might look something like:
while (no_newline) {
  bufsiz += 4096;
  newbuf = xmalloc(bufsiz);
  memcpy(newbuf, oldbuf, bufsiz - 4096);
  free(oldbuf);
  oldbuf = newbuf;
  // the read part of the loop goes here
}

I don't like option D at all.  
In theory, you can read arbitrarily long pathnames.
In practice, you're limited to half the free memory, and cpio dies on
one extra-long pathname, even if most of the names are short.
It has a proliferation of moving parts.
And it will be hard to debug.

I'm inclined to go with A for now, and switch to B if anyone runs into
problems. C has 2 special cases, and we have to decide whether or not to
free() the buffer.
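
For reference, option A could look roughly like the sketch below. This is
not the actual loopfiles_stdin() code: next_path() is a hypothetical name,
a caller-supplied buffer stands in for toybuf, and the wording of the
message is made up. It assumes that a full buffer with no newline means
the line is too long, and that a final line with no trailing newline
should still be used if it fits.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not toybox code: fetch the next path from "in".
 * Returns 1 with a newline-stripped line in buf, or 0 at end of input.
 * Lines that do not fit in buf are echoed to "err" and skipped. */
int next_path(FILE *in, FILE *err, char *buf, int size)
{
  while (fgets(buf, size, in)) {
    size_t len = strlen(buf);

    if (len && buf[len-1] == '\n') {
      buf[len-1] = 0;           /* normal case: strip the newline */
      return 1;
    }
    if (len < (size_t)size-1) return 1;  /* last line, no newline, fits */

    /* Buffer is full and there is no newline: skip the line loudly,
     * printing each fragment until we reach the newline or EOF. */
    fputs("skipping over-long line: ", err);
    do {
      fputs(buf, err);
      len = strlen(buf);
    } while (!(len && buf[len-1] == '\n') && fgets(buf, size, in));
    if (!(len && buf[len-1] == '\n')) fputc('\n', err);
  }

  return 0;
}
```

With a 4096-byte buffer this keeps the 4094-char limit described above;
the only change in behavior is that the dropped line gets reported
instead of vanishing silently.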


Thanks,
Isaac Dunham
