[Toybox] [PATCH] cpio: fix misaligned header if readlink() returns unexpected value

Rob Landley rob at landley.net
Mon Feb 8 01:32:38 PST 2021


On 2/7/21 10:10 AM, Yi-Yo Chiang via Toybox wrote:
> If file type is symlink and readlink() fails or returns unexpected link
> size, then the file body wouldn't be written, resulting in a misaligned
> archive.

Yup, misaligned archive is definitely a bug. I wonder if I can reproduce it...

> As explained in the comments,

That's a bigger code comment than I'm checking in, FYI. (Large comments go in
commit messages. In code, comments aren't executable and thus don't get
regression tested, so having big ones inline is dubious unless they have a TODO
in them.)

> there are some cases where the link size
> returned by lstat() and readlink() may be different.
> To accommodate this, we readlink() and update the value of st_size
> before writing the file header.

I.E. the lstat() link size is not authoritative, the readlink() size is.

> This may happen on Android if the userdata filesystem is encrypted.
> 
> ---
>  tests/cpio.test   |  2 +-
>  toys/posix/cpio.c | 24 ++++++++++++++++++++----
>  2 files changed, 21 insertions(+), 5 deletions(-)
> 
> diff --git a/tests/cpio.test b/tests/cpio.test
> index 11b3a5bf..a12edd09 100755
> --- a/tests/cpio.test
> +++ b/tests/cpio.test
> @@ -39,7 +39,7 @@ rm a bb ccc dddd
>  
>  # archive dangling symlinks and empty files even if we cannot open them
>  touch a; chmod a-rwx a; ln -s a/cant b
> -toyonly testing "archives unreadable empty files" "cpio -o -H newc|cpio -it" "a\nb\n" "" "a\nb\n"
> +toyonly testing "archives unreadable empty files" "cpio -o -H newc|cpio -it" "a\nb\na\n" "" "a\nb\na\n"
>  chmod u+rw a; rm -f a b

What does this change accomplish? The changed test still passes without your
code change, so what did you actually demonstrate?

Ah, this fails on your broken filesystem that's returning inconsistent data. So
really you just need to swap the order of a and b so the symlink has an entry
after it...

> diff --git a/toys/posix/cpio.c b/toys/posix/cpio.c
> index 09c99ae1..200fb365 100644
> --- a/toys/posix/cpio.c
> +++ b/toys/posix/cpio.c
> @@ -230,6 +230,7 @@ void cpio_main(void)
>        unsigned nlen, error = 0, zero = 0;
>        int len, fd = -1;
>        ssize_t llen;
> +      char *lnkbuf = NULL;
>  
>        len = getline(&name, &size, stdin);
>        if (len<1) break;
> @@ -242,6 +243,22 @@ void cpio_main(void)
>          continue;
>        }
>  
> +      // It is possible that readlink() may fail or the actual link size is
> +      // different from that indicated by lstat(). This may be due to a race
> +      // condition (the link is removed/changed between the call to lstat() and
> +      // readlink()), 

In which case it should get caught with the lstat failure above and reported as
an error before we write out a header for it. (Better to avoid writing a bad
entry at all than emit nonsense for padding reasons...)

> or may happen on encrypted filesystem where lstat()
> +      // returns the size of encrypted link and readlink() returns the size of
> +      // decrypted link.

Still really seems like a filesystem bug to me, but we must work around the bugs
that exist. (And it's not as broken as some of the crap reiserfs pulled back in
the day...)

> Thus we first readlink() and update the st_size, bail
> +      // out early if readlink() fails, then write the file header and the link.
> +      if (S_ISLNK(st.st_mode)) {
> +        lnkbuf = xreadlink(name);
> +        if (!lnkbuf) {
> +          perror_msg("readlink '%s'", name);
> +          continue;
> +        }
> +        st.st_size = strlen(lnkbuf);
> +      }
> +
>        if (FLAG(no_preserve_owner)) st.st_uid = st.st_gid = 0;
>        if (!S_ISREG(st.st_mode) && !S_ISLNK(st.st_mode)) st.st_size = 0;
>        if (st.st_size >> 32) perror_msg("skipping >2G file '%s'", name);
> @@ -262,9 +279,7 @@ void cpio_main(void)
>          // Write out body for symlink or regular file
>          llen = st.st_size;
>          if (S_ISLNK(st.st_mode)) {
> -          if (readlink(name, toybuf, sizeof(toybuf)-1) == llen)
> -            xwrite(afd, toybuf, llen);
> -          else perror_msg("readlink '%s'", name);

My original bug was that should have been perror_exit() because the file was
corrupted otherwise. Or the xwrite() should have been unconditional so it didn't
get misaligned. Ala I _should_ have written:

  if (readlink(name, toybuf, sizeof(toybuf)-1) != llen)
    perror_msg("readlink '%s'", name);
  xwrite(afd, toybuf, llen);

But doing the readlink with the open() and noticing failure before writing the
header is the better fix, and then using the link string length to override what
stat() says becomes a one line bug workaround for the new issue you hit...

> +          xwrite(afd, lnkbuf, llen);
>          } else while (llen) {
>            nlen = llen > sizeof(toybuf) ? sizeof(toybuf) : llen;
>            llen -= nlen;
> @@ -276,7 +291,8 @@ void cpio_main(void)
>          llen = st.st_size & 3;
>          if (llen) xwrite(afd, &zero, 4-llen);
>        }
> -      close(fd);
> +      if (lnkbuf) free(lnkbuf);
> +      if (fd != -1) close(fd);

free(0) and close(-1) are both defined as NOPs by posix. I usually test for the
close one because it's an exra syscall and noise in strace, but free(0) is just
a libc call. And in fact, I can just modify xclose() to do the -1 test for me...

(I've gotten REALLY lax about using xclose because as far as I know the only
time close() can EVER fail is on NFS, but I admit "not having to test for -1" is
a reason to use it...)

>      }
>      free(name);

I checked in a slightly different fix. Does that solve your problem?

Thanks,

Rob


More information about the Toybox mailing list