[Toybox] [PATCH 2/2] Detect all errors with xreadall/xwrite.

Rob Landley rob at landley.net
Mon Aug 9 02:35:37 PDT 2021


Still sick, sorry this is slow and unfocused.

On 8/7/21 7:12 AM, Samanta Navarro wrote:
> If xreadall is called with SIZE_MAX as length, then no error is
> detected.

Sounds like xreadall should be using int len.

You're right that it's inconsistent, and that's a problem, but the fix is probably
to make the data granularity manageable.

> The problem is that readall return value is not explicitly tested for
> -1. If len is SIZE_MAX and readall returns -1, then the return value is
> promoted to size_t and both values are considered to be equal.

Define "invalid"?

$ truncate -s 1e boing
$ tac boing

An exabyte's gonna take a bit to churn through. Well...

  $ toybox truncate -s 1e boing
  truncate: 'boing' to '1152921504606846976': File too large

ext4 can't even handle an exabyte file (even sparse), but if I look up the
maximum size it can handle and then tell tac to deal with that many NUL bytes
in a contiguous allocation on any piece of hardware I have EVER personally used
(even the Cray systems), bad things are likely to happen.

You're just arguing about where the bad things kick in, not eliminating them.

> It is unsafe to call xreadall with SIZE_MAX because it ultimately means
> that no bound checking for buf is in place.

There should be an error_exit() if the size is too big, yes. Where to add those
checks is what I'm currently too sick to reason through. :(

There are a lot of apis other than read: mmap() and readv() and splice() and so
on. If you're dealing with data bigger than a couple gigabytes, a single
contiguous read() and write() probably aren't how you want to fling it around.

> 64 bit systems return EFAULT
> when SIZE_MAX reads are performed but 32 bit systems actually start
> reading!
> 
> I have added an if statement to protect 32 bit systems from such a read
> by following style of 64 bit error handling.

Neither musl libc nor bionic support non-largefile opens.

> The same logic applies to xwrite as well.
> ---
> Proof of Concept (compile Toybox with ASAN=1):
> 
> cat > poc.tar.xz.b64 << EOF
> /Td6WFoAAATm1rRGAgAhARwAAAAQz1jM4Cf/AGRdADedyhr9nOMBzEAy3Ecv/5ryFigBxe/ACXcH
> zFn6/mBaXBzyT3AoV2CbUJCO0Th3OivX+tc868G+6vrnM9H+RXoklVlEq3qM8k/nFHmeMuvp4jlP
> 9rIYPbgGUyM+SMXXI1gbhQAAihVHNwEYT0EAAYABgFAAAD5xYGGxxGf7AgAAAAAEWVo=
> EOF
> base64 -d poc.tar.xz.b64 | xz -d > poc.tar

$ tar tvf poc.tar
tar: Archive value -1 is out of off_t range 0..9223372036854775807
tar: This does not look like a tar archive
tar: Skipping to next header
tar: Exiting with failure status due to previous errors

That's with gnu tar on the host. Note that it doesn't actually _try_ to read:
this check is in the tar header field decode logic, so fiddling with lib/ in
toybox isn't the right place to fix that for toybox tar either.

Meanwhile, toys/*/tar.c is using xsendfile_len() to copy the data from one fd to
another, and that's explicitly "long long" data type. It only uses read and
write to copy header data into and out of toybuf. That said, I need to think
through the users of alloread() when I'm feeling more coherent. (All three users
are reading link target, extended filename, and posix extended record. All of
these should probably be smaller than PATH_MAX to be honest. But allowing a long
long field to be silently truncated to int could potentially cause a corrupted
archive to be misread so the next header would have a checksum failure, and we
want a more graceful error on pathological input.)

But again, none of that is the lib/ code.

Rob

