[Toybox] realpath/tar status

Rob Landley rob at landley.net
Mon Feb 6 19:48:00 PST 2023



On 2/6/23 12:07, enh wrote:
>> > -*-
>> >
>> > as for the new tar, i updated the prebuilts yesterday and we've seen
>> > enough OOM kills since that i've had to revert it.
>>
>> Grrr. I've got a couple of suspects for what I screwed up, but... Lemme see if I
>> can work the ASAN leak detector into my workflow at all here. (There's a lot of
>> commands that intentionally leak resources on the way out because the OS will
>> free them, but there's a category of commands that should NOT do that because
>> they process theoretically unbounded input and cannot be allowed to leak-per.)

I haven't wired up the ASAN leak detector yet.

>> > every case i saw
>> > was from https://cs.android.com/android/platform/superproject/+/master:system/update_engine/Android.bp;l=947?q=ue_unittest_disk_imgs

I wget that, and it was html.

>> > so basically:
>> > ```
>> > tar -jxf sample_images/sample_images.tar.bz2 -C gen disk_ext2_1k.img
>> > disk_ext2_4k.img disk_ext2_4k_empty.img disk_ext2_unittest.img
>> > ```
>> >
>> > (direct link to the .tar.bz2:
>> > https://android.googlesource.com/platform/system/update_engine/+/refs/heads/master/sample_images/)

I wget _that_ and it was html.

>> Wait, OOM on _extract_? Did I change the extract codepath? (That's sounding more
>> like I damaged dirtree in the non-breadth case...)
> 
> yeah, that didn't make much sense to me either, and i should have
> explicitly said "i'm seeing tar killed by SIGKILL and _assume_ that's
> the oom killer". but -- now i've had time to repro -- it looks like
> suicide instead:

I committed a fix in the --sort path that in theory removed a memory leak, and
removed an "always add null terminator to empty directories thus dereferencing
null" thinko that was disguised by the memory leak.

But when I stress tested "tar c /home | tar t" (over 1.5 terabytes of files
including an AOSP checkout) I don't like the memory behavior, it was up to 256
megs used when I killed it. (Possibly just glibc's horrible heap, but I wanted
to poke at it more to be sure.)

I still haven't fiddled with the extract path, that I know...

> ~/aosp-master-with-phones$ ./out/host/linux-x86/bin/toybox tar -jxf
> system/update_engine/sample_images/sample_images.tar.bz2 -C
> out/soong/.temp/sbox/c744f840647dbf476d005753d9a7965d5af2771b/out/gen
> disk_ext2_1k.img disk_ext2_4k.img disk_ext2_4k_empty.img
> disk_ext2_unittest.img
> tar: had errors

I didn't manage to grab your test image on friday because the THIRD link I tried
https://android.googlesource.com/platform/system/update_engine/+/refs/heads/master/sample_images/sample_images.tar.bz2
is still an html file with no obvious "download the darn file" link. There are
"txt" and "json" links at the bottom, but the txt was not a bzip2 file.

I just went and looked at the code instead, on the theory this worked before so
must have been part of the recent changes...

> read(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
> 512) = 512
> kill(3992265, SIGKILL)                  = 0
> wait4(3992265, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 3992265
> exit_group(0)                           = ?
> +++ exited with 0 +++

... OOM killer maybe?

> (yes, that pid is legit. now you see why i needed the column size fix
> for ps/top :-) )
> 
> (note that the command line there should give you convenient repro
> steps --- you don't need a whole AOSP tree, just the .tar.bz2 i linked
> to in the first post.)

Ok, I backed up to "update engine" link which told me the git clone invocation
for this repo, and the resulting 485 files totaling 33 megabytes did include the
relevant file, although at 6k you probably could have just attached it.

Let's see...

$ mkdir sub && cd sub
$ ~/toybox/toybox/tar xvf ../sample_images.tar.bz2
disk_ext2_1k.img
disk_ext2_4k.img
disk_ext2_4k_empty.img
disk_ext2_unittest.img
disk_sqfs_empty.img
disk_sqfs_default.img
disk_sqfs_unittest.img
$ echo $?
0

Seems to have worked with the test binary I made before that last checkin?
Unless the -C or explicit -j were doing something weird, or I need more to the
reproduction sequence. (Built with NDK or musl...?)

Rob


More information about the Toybox mailing list