[Toybox] [PATCH] cpio: support reading concatenated cpio files.
Rob Landley
rob at landley.net
Sat Apr 17 02:38:18 PDT 2021
On 4/16/21 3:48 PM, enh wrote:
> On Fri, Apr 16, 2021 at 11:44 AM Yi-yo Chiang <yochiang at google.com
> <mailto:yochiang at google.com>> wrote:
>
> I'm not sure what Elliot's goal is? I assume he's trying to extract a
> concatenated ramdisk, and I still see a problem in the current solution.
>
>
> oops, i was going to say "i added you to the bug", but it looks like i added you
> to the *other* cpio bug (which appears to be resolving itself as "user error,
> we'll fix our script"). i've added you
> to https://issuetracker.google.com/184732694 now, but non-googlers won't be able
> to follow that link.
Yup, it's a login screen.
> all i want to do is get my rug back^W^W^W^Wkeep someone's existing workflow working.
>
> The buffer-format
> (https://www.kernel.org/doc/Documentation/early-userspace/buffer-format.txt)
> says:
>
> initramfs := ("\0" | cpio_archive | cpio_gzip_archive)*
>
> In other words, both `cat a.cpio b.cpio >merged.cpio` and `(cat a.cpio &&
> echo -n -e '\0\0\0' && cat b.cpio) >merged.cpio` are valid initramfs.
>
> weird. do we know _why_ that's supported?
Because gnu/cpio is crapping trailing nul bytes at the end of archives it creates:
$ echo toys.h | cpio -o | hd | tail -n 5
2 blocks
000002c0 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 |................|
000002d0 0b 00 00 00 00 00 54 52 41 49 4c 45 52 21 21 21 |......TRAILER!!!|
000002e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000400
So if you concatenate them, you get runs of nul bytes.
> btw gen_init_cpio.c also pads initramfs to 512-byte boundary
> (https://github.com/torvalds/linux/blob/6fbd6cf85a3be127454a1ad58525a3adcf8612ab/usr/gen_init_cpio.c#L97)
>
> If we're viewing buffer-format.txt as the "right" cpio spec, then I think we
> should implement this too. We should skip arbitrary extra NUL-bytes padded
> between cpio file frames
>
> not that it really matters (because i think we're all agreed that "do what the
> kernel does" makes more sense than "do what GNU/BSD think") but does GNU cpio
> cope with that? (see "strace" below for why i wonder.)
>
>
> On Fri, Apr 16, 2021 at 2:27 PM Rob Landley <rob at landley.net
> <mailto:rob at landley.net>> wrote:
>
> On 4/16/21 12:18 AM, Rob Landley wrote:
> > On 4/15/21 6:57 PM, Rob Landley wrote:
> >> On 4/15/21 12:06 PM, enh wrote:
> >>> do you already have the "do the right thing" patch ready, or should
> i send that
> >>> today?
> >>
> >> I'm good, I'll try to get it in tonight. The hard part is the test
> suite, not
> >> the implementation.
> >
> > Correction: the hard part is I don't seem to have actually implemented
> hardlink
> > support yet. It just records/extracts them as separate files.
> >
> > Right. I need to test this against the kernel initramfs plumbing, in both
> > directions...
>
> if (!strcmp("TRAILER!!!", name)) {
> + readall(afd, toybuf, 348);
>
> Where did you get 348 from? (What is this change doing?)
>
> I did what seems like a minimal change to continue past TRAILER!!! but
> error on
> an empty archive, but I apparently don't understand what your change
> did? (I'm
> assuming it fixed the use case you were seeing...?)
>
> yeah, from strace i think GNU cpio avoids this misalignment by always reading
> 512-byte blocks (rather than toybox's "let me just read the header and then
> worry about anything else later").
Because you can't unget into a nonseekable filehandle.
> all i was trying to do was "get back in sync after a TRAILER!!! entry", and 348
> (by measurement) got me back to 512 after what toybox cpio had read, for the
> examples i had. (i had expected "is this guaranteed to be right for all possible
> valid cpio files?" to be the discussion we'd be having rather than the
> "shouldn't we just do what the kernel does?" :-) )
>
> (if i haven't said so already, assume i know nothing about cpio and have never
> knowingly used it myself.)
>
> interestingly, the current patch seems to break the AOSP build:
For some reason the github build test failed, although it ran fine here? And I
couldn't view the actual logs on my phone because Microsoft Github gratuitously
wanted me to log in for no obvious reason. (I never give my phone unnecessary
credentials.)
> 2021-04-16 18:56:56 - common.py - WARNING : Unable to get boot image build
> props: Failed to run command '['toybox', 'cpio', '-F',
> '/mnt/disks/build-disk/src/googleplex-android/sc-dev/out/soong/.temp/boot_xwvxvq26.img/uncompressed_ramdisk',
> '-i']' (exit code 1):
> cpio: bad header
Hmmm... the 4 trailing NUL bytes on the "trailer" entry _should_ just put it
back into alignment (strings are 4 byte aligned)? The old 6-digit magic field
puts the header 2 bytes short, and then the 10 byte "TRAILER!?!" entry WOULD
line up on 4 bytes except it should have a NUL terminator which puts it one
over, thus needing 3 bytes of padding, so 4 trailing NUL bytes?
> i'll chase up the person using `toybox cpio` and move them over to the hermetic
> prebuilt build toybox later. for now i'll try to reproduce this locally and see
> what the trouble is...
Is this a _new_ failure, or is this a failure of the concatenated files it
didn't handle before? (I.E. did my patch just not fix your issue since I don't
have access to that test case to confirm I'd fixed it, or is it now more broken
than it was?)
Rob
More information about the Toybox
mailing list