[Toybox] The cut -C test is failing because bionic's wcwidth() doesn't match glibc or musl.

Rob Landley rob at landley.net
Mon Oct 25 20:35:01 PDT 2021


On 10/25/21 4:53 PM, enh wrote:
> On Thu, Oct 21, 2021 at 12:18 AM Rob Landley <rob at landley.net
>     On 10/20/21 4:35 PM, enh wrote:
>     > if you don't mind creating a /system directory (or symlink) in /, you can just
>     > run like that. there's a script in AOSP bionic that sets up all the links you
>     > need to run the bionic tests on the host.
> 
>     [Removed flailing trying to find the script, you said where it is later.]
> 
>     Does it fix the OTHER warnings about being unable to "__bionic_open_tzdata:
>     couldn't find any tzdata when looking for utc!" and such? I still need to fix
>     the "FAIL: find -type f -user -exec" because bionic can't read my $USER out of
>     /etc/passwd...
> 
>     Ahem:
> 
>     env: exec env: No such file or directory
>     PASS: env -i
> 
>     Right, you disabled builtins, 
> 
> yeah, to try to reduce mksh/toybox confusion. if/when we switch /bin/sh to
> toybox it should reduce compat issues too.

Doing so uncovered a legitimate bug (which I've been too buried to address but
it's on the todo list) and made me think I might need to run the test suite with
every combination of CFG_TOYBOX_* toggled, which is exponentially increasing on
the number of options. :(

>     and there's no $PATH. I should $(which env) there.
>     And maybe add an || echo fail
> 
>     (And now having done the $(which) annd checked it in... "env -i env" on the host
>     does NOT have this failure. The -i does not apply to env's own exec. Sequencing
>     I need to fiddle with...)
> 
>     All the other failures seem to be the unicode plumbing.
> 
>     >     (And when I _tried_ to create a chroot as a normal user, bionic segfaulted
>     >     before calling main() because /dev/null wasn't there. Did that ever
>     get fixed?
>     >     Hard to run an init linked against bionic when that's the case.
>     >
>     > have you tried? i'm pretty sure there's a special case for pid 1, for exactly
>     > this reason :-)
> 
>     I'm going to give you a moment to think about what you did.
> 
> yeah, it's one of those things that's hard to reason about ---

I was wincing that you'd added a special case to work around a bug, and in the
non-special case used a segfault as a security mechanism. (Although rereading
bionic/libc_init_common.cpp it's more explicit/intentional about it all than I
remembered.)

> "does my
> tiger-repellant bracelet actually work, or are there just no tigers where i
> live?".

As Alice asked the White Knight in "Through the Looking Glass".

> specifically "has Android never had a stdio confusion CVE because of
> this, or because the world had already moved on from those security bugs [which
> were all the rage when Android started], or just because we don't have anything
> like inetd?".

Or because you're didn't have an NDK for the first several years and third party
apps are almost entirely GUI programs that don't really interact with a text
console?

What stops code from losing track of filehandles and calling close(1) and then
opening a new stdout a moment later by accident? I remember segfaulting one of
the gnu tools feeding it "-" as an argument more than once (it was a "what
happens if" test implementing the toybox version and the answer, as is so often
the case, is "nobody in gnu-land ever thought of that or tried it").

This is why toybox has notstdio() and xopen_stdio() and such in lib/xwrap.c to
handle this sort of thing itself on _every_ open. (xopen_stdio() being the
version that DOESN'T dup() any filehandle <3 and backfill with /dev/null until
it's returning fd 3 or greater.)

> hard to untangle those, and since security -- like law -- is an
> ass-covering business, no-one wants to be the one to _remove_ anything, even if
> it's not obvious that it's really helping. (and in this case it's not obvious
> that it's _not_ helping, so...)

The specific problem I had was:

  $ x86_64-linux-android-cc --static hello2.c
  $ mkdir ch
  $ mv a.out ch
  $ sudo chroot ch ./a.out
  Segmentation fault

The new bionic process _had_ an existing stdin/stdout/stderr, and yet segfaulted
because the startup code tried to open("/dev/null") before it even looked at the
three filehandles. The sanitizer had nothing to do and did it wrong. (Failed
doing something it would then immediately undo without using.)

If instead the whole fixup function was just:

  for (i = 0; i<3; i++)
    if (-1 == fcntl(i, F_GETFL) && errno == EBADF) open("/dev/null", O_RDWR));

Then:

A) Posix says open() returns lowest fd available so it has to backfill 0/1/2, no
dup() required.

2) Defers the open() until there's an actual problem. Yes it calls open more
than once if it has more than one thing to do (instead of dup) but it's one call
made in a loop, and doesn't happen at all in the common case.

III) I didn't bother checking for error because inability to open "/dev/null" is
a "should never happen" in a function already attempting to correct for a
"should never happen", which then may or may not cause an actual problem later
on if it hadn't been corrected. Six sigma meets a billion unit volume means you
hit 3400 failures so I understand the paranoia, but this is two levels of pilot
error to reach a maybe. However if you really want to error check the deferred
open, doing so wouldn't breaking a chroot that DOES have stdin/out/err.

*) If you DO error you'd still need to special case PID 1 to support the
initramfs with no /dev/console case, since that runs with no filehandles. (But
personally that leans me towards not erroring; Your <strike>Mileage</strike>
Scar Tissue May Vary.)

> /me wonders whether i could at least replace the special case with a fallback
> that uses open() with O_TMPFILE?

Problem: open("/", O_RDWR|O_TMPFILE|O_EXCL) needs write access to "/" even if it
isn't creating a dentry, because the new inode uses that filesystem as backing
store to flush pages to on memory pressure. (And if you do write 100 megabytes
to it it'll try to store it instead of discarding it. (O_TMPFILE is mmapable,
even with O_EXCL I think?)) Expecting a writeable directory like /dev or /tmp to
exist is the same problem as expecting /dev/null to exist: chroot doesn't
guarantee any specific filesystem layout.

When I need a dummy filehandle without a filesystem I call pipe() and close the
other end. (Writes would error but that happens with shell pipelines all the
time...)

That said, the /dev/null approach would probably be fine if the open was
deferred until it was actually shown to be needed.

>     > because the last thing the world needs is _multiple_ copies of all the i18n
>     > data. icu is the source of truth for that on Android. (we've had bugs in the
>     > past in this area that made us clean up our act.)
> 
>     It's the dlopen() that hurts. I'd ask why there's no libicu.a, but the r21d ndk
>     doesn't seem to have a libicu.so either. I should grab a newer version...
> 
> icu4c in apps is ... complicated. but you're never getting a *static* icu4c for
> the same "the icu data format and schema aren't stable" reasons i've mentioned
> before.

So utf8 won't ever work in static bionic the same way DNS queries won't work in
static glibc (because that needs to dlopen libnss).

>     Static linking is a legitimate use case. Android does not particularly treat it
>     as such, but I'm going to keep periodically tapping the sign.
> 
> it's a legitimate use case _if you're prepared to carry everything around
> yourself_. and that is supported --- feel free to build your own icu4c and ship
> your own 30MiB data file :-)

*shrug* Musl seems to do it well enough for me. I admit I'm not a polyglot.

> (we did finally get a stable supported subset of the icu4c API for apps into the
> NDK for S aka 12, but that doesn't help your use case.)

I need to make the /system symlink you suggested work. Possibly in a mkroot
image under qemu. Unfortunately it was fiddly enough I didn't get through it
over the weekend and it went back on the todo list.

>     I was hoping that after the aosp build there'd be a system directory that I
>     could ln -s /system to and it would just work. Instead:
> 
>       # source this script in bash
> 
>       source ${ANDROID_BUILD_TOP}/build/envsetup.sh
> 
>     So we have to do the envsetup in order for $ANDROID_BUILD_TOP to be set, and
>     then this script does it _again_. Right.
> 
> (fwiw, i think that's for the buildbots.)
>  
>     Before I run something that makes this many assumptions as root so it can make
>     presumably persistent changes to my host system, I think I'm going to try to
>     read it a little more closely. Thanks for the pointer.
> 
> oh, yeah, to be clear: i was suggesting you boil this down to the three or four
> lines you actually need (since aiui part of your goal is to _not_ have to have a
> complete AOSP checkout that's up to date and built).

I like having an up to date AOSP checkout, but updating and rebuilding is
nontrivial for me.

>     > that should set you up with everything you need (and both 32- and 64-bit
>     too, if
>     > you're chasing something 32-bit-only).
> 
>     It declares a shell function, and then doesn't call it. What it DOES do is eval
>     a soong invocation, and says to read envsetup.sh (which is just under 2000 lines
>     long) to figure out what --dumpvars-mode does (that env and declare -p
>     presumably don't).
>
> yeah, you just need the `ln -s`s.

The ones under:

    if [ ${TARGET_ARCH} = x86 -o ${TARGET_ARCH} = x86_64 ]; then

Which apparently don't have a corresponding arm version?

> if you follow the history far back enough, it
> used to basically just be that, but things got a lot more complicated to support
> all the different kinds of builds, and the multiple supported build environments
> (bots versus humans in particular).

I know the feeling.

I spent a chunk of the weekend staring at scripts/mkroot.sh trying to figure out
how to simplify it. I'm unhappy with all the setup plumbing before "Create new
root filesystem's directory layout", it's too complicated. (And I'm trying to
add record-commands support to it, which doesn't help.)

> if you just manually do the `ln -s` for /system/bin and /system/lib64 (and
> /system/lib if you care about 32-bit), you're most of the way there, and i think
> the error messages you get when you run will point you at the other bits (tz data?).

Might take a stab this weekend, but right now I'm hip deep in yocto plumbing.
(Which is the opposite of fun but pays the bills.)

Rob



More information about the Toybox mailing list