[Toybox] [PATCH] cpio: support reading concatenated cpio files.

Yo Chiang big2632 at gmail.com
Fri Apr 16 23:00:23 PDT 2021


Yes I do know where the extra bytes are from 😥. They are from mkbootfs in
the android platform tree (
https://cs.android.com/android/platform/superproject/+/master:system/core/cpio/mkbootfs.c;drc=6ad4d0a601485475645ddd1b23181a4c31754977;l=148).
Which is the tool we use to generate boot ramdisks like gen_init_cpio in
the kernel tree. But unlike gen_init_cpio, mkbootfs pads to 256-byte
boundary.



On Sat, Apr 17, 2021, 06:20 enh via Toybox <toybox at lists.landley.net> wrote:

> funnily enough (or perhaps not, because yochiang knows a lot more about
> cpio than i do...) it looks like the AOSP build is broken because of
> trailing NULs...
>
> here's the previous version of toybox at the end of the ramdisk in
> question:
>
> utimensat(AT_FDCWD, "system/lib64/libstdc++.so", [{tv_sec=0, tv_nsec=0},
> {tv_sec=0, tv_nsec=0}], AT_SYMLINK_NOFOLLOW) = 0
> read(3, "070701000493fe000001ed0000000000"..., 110) = 110
> read(3, "TRAILER!!!\0\0\0\0", 14)       = 14
> close(3)                                = 0
> exit_group(1)                           = ?
>
> and here's the new toybox (ignore the lseek and the write from my
> debugging printf):
>
> utimensat(AT_FDCWD, "system/lib64/libstdc++.so", [{tv_sec=0, tv_nsec=0},
> {tv_sec=0, tv_nsec=0}], AT_SYMLINK_NOFOLLOW) = 0
> lseek(3, 0, SEEK_CUR)                   = 5551408
> read(3, "070701000493fe000001ed0000000000"..., 110) = 110
> read(3, "TRAILER!!!\0\0\0\0", 14)       = 14
> lseek(3, 0, SEEK_CUR)                   = 5551532
> read(3,
> "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 110)
> = 84
> read(3, "", 26)                         = 0
> write(2, "cpio: ", 6cpio: )                   = 6
> write(2, "unexpected EOF where=54b5ac size"..., 35unexpected EOF
> where=54b5ac size=84) = 35
> write(2, "\n", 1
> )                       = 1
> exit_group(1)                           = ?
>
> so we have < sizeof(header) NULs at the end, and the new toybox chokes on
> that.
>
> here's xxd for the end of the file:
>
> 0054b530: 3037 3037 3031 3030 3034 3933 6665 3030  070701000493fe00
> 0054b540: 3030 3031 6564 3030 3030 3030 3030 3030  0001ed0000000000
> 0054b550: 3030 3030 3030 3030 3030 3030 3031 3030  0000000000000100
> 0054b560: 3030 3030 3030 3030 3030 3030 3030 3030  0000000000000000
> 0054b570: 3030 3030 3030 3030 3030 3030 3030 3030  0000000000000000
> 0054b580: 3030 3030 3030 3030 3030 3030 3030 3030  0000000000000000
> 0054b590: 3030 3030 3062 3030 3030 3030 3030 5452  00000b00000000TR
> 0054b5a0: 4149 4c45 5221 2121 0000 0000 0000 0000  AILER!!!........
> 0054b5b0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0054b5c0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0054b5d0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0054b5e0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 0054b5f0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
>
> for some reason that's 192 bytes? i can dig further, but since it seems
> like yochiang predicted this, i assume that you two already know where
> these NULs are coming from?
>
> On Fri, Apr 16, 2021 at 1:48 PM enh <enh at google.com> wrote:
>
>>
>>
>> On Fri, Apr 16, 2021 at 11:44 AM Yi-yo Chiang <yochiang at google.com>
>> wrote:
>>
>>> I'm not sure what Elliot's goal is? I assume he's trying to extract a
>>> concatenated ramdisk, and I still see a problem in the current solution.
>>>
>>
>> oops, i was going to say "i added you to the bug", but it looks like i
>> added you to the *other* cpio bug (which appears to be resolving itself as
>> "user error, we'll fix our script"). i've added you to
>> https://issuetracker.google.com/184732694 now, but non-googlers won't be
>> able to follow that link.
>>
>> all i want to do is get my rug back^W^W^W^Wkeep someone's existing
>> workflow working.
>>
>>
>>> The buffer-format (
>>> https://www.kernel.org/doc/Documentation/early-userspace/buffer-format.txt)
>>> says:
>>>
>>>   initramfs  := ("\0" | cpio_archive | cpio_gzip_archive)*
>>>
>>> In other words, both `cat a.cpio b.cpio >merged.cpio` and `(cat a.cpio
>>> && echo -n -e '\0\0\0' && cat b.cpio) >merged.cpio` are valid initramfs.
>>>
>>
>> weird. do we know _why_ that's supported?
>>
>>
>>> btw gen_init_cpio.c also pads initramfs to 512-byte boundary (
>>> https://github.com/torvalds/linux/blob/6fbd6cf85a3be127454a1ad58525a3adcf8612ab/usr/gen_init_cpio.c#L97
>>> )
>>>
>>> If we're viewing buffer-format.txt as the "right" cpio spec, then I
>>> think we should implement this too. We should skip arbitrary
>>> extra NUL-bytes padded between cpio file frames
>>>
>>
>> not that it really matters (because i think we're all agreed that "do
>> what the kernel does" makes more sense than "do what GNU/BSD think") but
>> does GNU cpio cope with that? (see "strace" below for why i wonder.)
>>
>>
>>> On Fri, Apr 16, 2021 at 2:27 PM Rob Landley <rob at landley.net> wrote:
>>>
>>>> On 4/16/21 12:18 AM, Rob Landley wrote:
>>>> > On 4/15/21 6:57 PM, Rob Landley wrote:
>>>> >> On 4/15/21 12:06 PM, enh wrote:
>>>> >>> do you already have the "do the right thing" patch ready, or should
>>>> i send that
>>>> >>> today?
>>>> >>
>>>> >> I'm good, I'll try to get it in tonight. The hard part is the test
>>>> suite, not
>>>> >> the implementation.
>>>> >
>>>> > Correction: the hard part is I don't seem to have actually
>>>> implemented hardlink
>>>> > support yet. It just records/extracts them as separate files.
>>>> >
>>>> > Right. I need to test this against the kernel initramfs plumbing, in
>>>> both
>>>> > directions...
>>>>
>>>>      if (!strcmp("TRAILER!!!", name)) {
>>>> +      readall(afd, toybuf, 348);
>>>>
>>>> Where did you get 348 from? (What is this change doing?)
>>>>
>>>> I did what seems like a minimal change to continue past TRAILER!!! but
>>>> error on
>>>> an empty archive, but I apparently don't understand what your change
>>>> did? (I'm
>>>> assuming it fixed the use case you were seeing...?)
>>>>
>>>
>> yeah, from strace i think GNU cpio avoids this misalignment by always
>> reading 512-byte blocks (rather than toybox's "let me just read the header
>> and then worry about anything else later").
>>
>> all i was trying to do was "get back in sync after a TRAILER!!! entry",
>> and 348 (by measurement) got me back to 512 after what toybox cpio had
>> read, for the examples i had. (i had expected "is this guaranteed to be
>> right for all possible valid cpio files?" to be the discussion we'd be
>> having rather than the "shouldn't we just do what the kernel does?" :-) )
>>
>> (if i haven't said so already, assume i know nothing about cpio and have
>> never knowingly used it myself.)
>>
>> interestingly, the current patch seems to break the AOSP build:
>>
>> 2021-04-16 18:56:56 - common.py - WARNING : Unable to get boot image
>> build props: Failed to run command '['toybox', 'cpio', '-F',
>> '/mnt/disks/build-disk/src/googleplex-android/sc-dev/out/soong/.temp/boot_xwvxvq26.img/uncompressed_ramdisk',
>> '-i']' (exit code 1):
>> cpio: bad header
>>
>> i'll chase up the person using `toybox cpio` and move them over to the
>> hermetic prebuilt build toybox later. for now i'll try to reproduce this
>> locally and see what the trouble is...
>>
>>
>>> Rob
>>>> _______________________________________________
>>>> Toybox mailing list
>>>> Toybox at lists.landley.net
>>>> http://lists.landley.net/listinfo.cgi/toybox-landley.net
>>>>
>>>
>>>
>>> --
>>>
>>> Yi-yo Chiang
>>> Software Engineer
>>> yochiang at google.com
>>>
>> _______________________________________________
> Toybox mailing list
> Toybox at lists.landley.net
> http://lists.landley.net/listinfo.cgi/toybox-landley.net
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.landley.net/pipermail/toybox-landley.net/attachments/20210417/db13f42f/attachment-0001.htm>


More information about the Toybox mailing list