[Toybox] [PATCH] cpio: fix misaligned header if readlink() returns unexpected value

Yi-yo Chiang yochiang at google.com
Fri Feb 12 23:35:35 PST 2021


Just found out this block of text from fscrypt manual (
https://elixir.bootlin.com/linux/v5.11-rc7/source/Documentation/filesystems/fscrypt.rst#L1066
):

>  The st_size of an encrypted symlink will not necessarily give the
>  length of the symlink target as required by POSIX.  It will actually
>  give the length of the ciphertext, which will be slightly longer
>  than the plaintext due to NUL-padding and an extra 2-byte overhead.

Looks like this is intentional :(

On Tue, Feb 9, 2021 at 2:44 AM enh <enh at google.com> wrote:

>
>
> On Mon, Feb 8, 2021 at 2:06 AM Yi-yo Chiang via Toybox <
> toybox at lists.landley.net> wrote:
>
>>
>>
>> On Mon, Feb 8, 2021 at 5:19 PM Rob Landley <rob at landley.net> wrote:
>>
>>> On 2/7/21 10:10 AM, Yi-Yo Chiang via Toybox wrote:
>>> > If file type is symlink and readlink() fails or returns unexpected link
>>> > size, then the file body wouldn't be written, resulting in a misaligned
>>> > archive.
>>>
>>> Yup, misaligned archive is definitely a bug. I wonder if I can reproduce
>>> it...
>>>
>>> > As explained in the comments,
>>>
>>> That's a bigger code comment than I'm checking in, FYI. (Large comments
>>> go in
>>> commit messages. In code, comments aren't executable and thus don't get
>>> regression tested, so having big ones inline is dubious unless they have
>>> a TODO
>>> in them.)
>>>
>>> > there are some cases where the link size
>>> > returned by lstat() and readlink() may be different.
>>> > To accommodate this, we readlink() and update the value of st_size
>>> > before writing the file header.
>>>
>>> I.E. the lstat() link size is not authoritative, the readlink() size is.
>>>
>>> > This may happen on Android if the userdata filesystem is encrypted.
>>> >
>>> > ---
>>> >  tests/cpio.test   |  2 +-
>>> >  toys/posix/cpio.c | 24 ++++++++++++++++++++----
>>> >  2 files changed, 21 insertions(+), 5 deletions(-)
>>> >
>>> > diff --git a/tests/cpio.test b/tests/cpio.test
>>> > index 11b3a5bf..a12edd09 100755
>>> > --- a/tests/cpio.test
>>> > +++ b/tests/cpio.test
>>> > @@ -39,7 +39,7 @@ rm a bb ccc dddd
>>> >
>>> >  # archive dangling symlinks and empty files even if we cannot open
>>> them
>>> >  touch a; chmod a-rwx a; ln -s a/cant b
>>> > -toyonly testing "archives unreadable empty files" "cpio -o -H
>>> newc|cpio -it" "a\nb\n" "" "a\nb\n"
>>> > +toyonly testing "archives unreadable empty files" "cpio -o -H
>>> newc|cpio -it" "a\nb\na\n" "" "a\nb\na\n"
>>> >  chmod u+rw a; rm -f a b
>>>
>>> What does this change accomplish? The changed test still passes without
>>> your
>>> code change, so what did you actually demonstrate?
>>>
>>> Ah, this fails on your broken filesystem that's returning inconsistent
>>> data. So
>>> really you just need to swap the order of a and b so the symlink has an
>>> entry
>>> after it...
>>>
>>> > diff --git a/toys/posix/cpio.c b/toys/posix/cpio.c
>>> > index 09c99ae1..200fb365 100644
>>> > --- a/toys/posix/cpio.c
>>> > +++ b/toys/posix/cpio.c
>>> > @@ -230,6 +230,7 @@ void cpio_main(void)
>>> >        unsigned nlen, error = 0, zero = 0;
>>> >        int len, fd = -1;
>>> >        ssize_t llen;
>>> > +      char *lnkbuf = NULL;
>>> >
>>> >        len = getline(&name, &size, stdin);
>>> >        if (len<1) break;
>>> > @@ -242,6 +243,22 @@ void cpio_main(void)
>>> >          continue;
>>> >        }
>>> >
>>> > +      // It is possible that readlink() may fail or the actual link
>>> size is
>>> > +      // different from that indicated by lstat(). This may be due to
>>> a race
>>> > +      // condition (the link is removed/changed between the call to
>>> lstat() and
>>> > +      // readlink()),
>>>
>>> In which case it should get caught with the lstat failure above and
>>> reported as
>>> an error before we write out a header for it. (Better to avoid writing a
>>> bad
>>> entry at all than emit nonsense for padding reasons...)
>>>
>>> > or may happen on encrypted filesystem where lstat()
>>> > +      // returns the size of encrypted link and readlink() returns
>>> the size of
>>> > +      // decrypted link.
>>>
>>> Still really seems like a filesystem bug to me, but we must work around
>>> the bugs
>>> that exist. (And it's not as broken as some of the crap reiserfs pulled
>>> back in
>>> the day...)
>>
>>
>> Yeah I think this is a filesystem or kernel bug (fscrypt?) and the real
>> good fix is for the kernel to report consistent lstat() results, but we
>> have to remedy for existing devices out there that's running with this
>> buggy lstat() :/
>> I've raised a question regarding this issue to our kernel team.
>>
>
> i'll be curious to hear whether it's actually a bug, or a
> deliberate obfuscation: "no, i'm not even going to tell you the _length_ of
> this encrypted information because i worry you might be able to make
> valid inferences from that".
> https://en.wikipedia.org/wiki/Padding_(cryptography)#Traffic_analysis_and_protection_via_padding
> kind of thing.
>
>
>> > Thus we first readlink() and update the st_size, bail
>>> > +      // out early if readlink() fails, then write the file header
>>> and the link.
>>> > +      if (S_ISLNK(st.st_mode)) {
>>> > +        lnkbuf = xreadlink(name);
>>> > +        if (!lnkbuf) {
>>> > +          perror_msg("readlink '%s'", name);
>>> > +          continue;
>>> > +        }
>>> > +        st.st_size = strlen(lnkbuf);
>>> > +      }
>>> > +
>>> >        if (FLAG(no_preserve_owner)) st.st_uid = st.st_gid = 0;
>>> >        if (!S_ISREG(st.st_mode) && !S_ISLNK(st.st_mode)) st.st_size =
>>> 0;
>>> >        if (st.st_size >> 32) perror_msg("skipping >2G file '%s'",
>>> name);
>>> > @@ -262,9 +279,7 @@ void cpio_main(void)
>>> >          // Write out body for symlink or regular file
>>> >          llen = st.st_size;
>>> >          if (S_ISLNK(st.st_mode)) {
>>> > -          if (readlink(name, toybuf, sizeof(toybuf)-1) == llen)
>>> > -            xwrite(afd, toybuf, llen);
>>> > -          else perror_msg("readlink '%s'", name);
>>>
>>> My original bug was that should have been perror_exit() because the file
>>> was
>>> corrupted otherwise. Or the xwrite() should have been unconditional so
>>> it didn't
>>> get misaligned. Ala I _should_ have written:
>>>
>>>   if (readlink(name, toybuf, sizeof(toybuf)-1) != llen)
>>>     perror_msg("readlink '%s'", name);
>>>   xwrite(afd, toybuf, llen);
>>>
>>> But doing the readlink with the open() and noticing failure before
>>> writing the
>>> header is the better fix, and then using the link string length to
>>> override what
>>> stat() says becomes a one line bug workaround for the new issue you
>>> hit...
>>>
>>> > +          xwrite(afd, lnkbuf, llen);
>>> >          } else while (llen) {
>>> >            nlen = llen > sizeof(toybuf) ? sizeof(toybuf) : llen;
>>> >            llen -= nlen;
>>> > @@ -276,7 +291,8 @@ void cpio_main(void)
>>> >          llen = st.st_size & 3;
>>> >          if (llen) xwrite(afd, &zero, 4-llen);
>>> >        }
>>> > -      close(fd);
>>> > +      if (lnkbuf) free(lnkbuf);
>>> > +      if (fd != -1) close(fd);
>>>
>>> free(0) and close(-1) are both defined as NOPs by posix.
>>
>>
>> Good to know this!.
>>
>>
>>> I usually test for the
>>> close one because it's an exra syscall and noise in strace, but free(0)
>>> is just
>>> a libc call. And in fact, I can just modify xclose() to do the -1 test
>>> for me...
>>>
>>> (I've gotten REALLY lax about using xclose because as far as I know the
>>> only
>>> time close() can EVER fail is on NFS, but I admit "not having to test
>>> for -1" is
>>> a reason to use it...)
>>>
>>> >      }
>>> >      free(name);
>>>
>>> I checked in a slightly different fix. Does that solve your problem?
>>>
>>
>> Yup, verified on my android test bench.
>>
>>
>>>
>>> Thanks,
>>>
>>> Rob
>>>
>>
>>
>> --
>>
>> Yi-yo Chiang
>> Software Engineer
>> yochiang at google.com
>> _______________________________________________
>> Toybox mailing list
>> Toybox at lists.landley.net
>> http://lists.landley.net/listinfo.cgi/toybox-landley.net
>>
>

-- 

Yi-yo Chiang
Software Engineer
yochiang at google.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.landley.net/pipermail/toybox-landley.net/attachments/20210213/54cbf2b6/attachment.html>


More information about the Toybox mailing list