[Toybox] Fixing find segfault: questions

Rob Landley rob at landley.net
Sun Apr 24 23:27:09 PDT 2016


On 04/11/2016 04:02 PM, Tom Marshall wrote:
> Currently toybox find (toys/posix/find.c) will segfault when the -iname
> predicate is used on both sides of an -o predicate, the LHS evaluates to
> true, and the expression followed with the -exec action.  This is
> because the initial parsing pass saves both the iname patterns in
> lowercase form and the exec arguments on TT.arglist, but subsequent
> passes do not pop the iname pattern for the RHS.  The exec then attempts
> to use the RHS pattern as the exec arguments.
> 
> Example, run on Debian 8.x amd64:
> 
> $ ./toybox find d \( -iname Exists.txt -o -iname foo \) -exec echo {} \;
> Segmentation fault

$ ./find d \( -iname Exists.txt -o -iname foo \) -exec echo {} \;
find: d: No such file or directory
$ ./find toys \( -iname Exists.txt -o -iname foo \) -exec echo {} \;
$ ./find toys \( -iname "*.c" -o -iname foo \) -exec echo {} \;

Ah, third one finally reproduced it. Ok, so you mean the first -iname
actually matches and file, so the -o second test isn't run, and that
unbalances the array because the corresponding data isn't popped.

And fixed, with test added to test suite. (And two other tests fixed;
don't test for specific error messages where toybox and other
implementations vary, I want TEST_HOST to work. And it should work with
the gnu/dammit implementation or with busybox as the host
implementation, so the tests need to be reasonably generic.)

> There are several possible strategies to fix this issue.  The simplest
> fix in terms of code change may be simply to always pop the iname
> pattern regardless of whether it will be checked.  However, it would be
> more efficient to avoid putting the iname pattern in arglist
> altogether.

No, it wouldn't?

The arglist is set up once when parsing the command line, and then
reused for each file we test. Testing each new directory entry traverses
the same list of command lien data; llist_pop() is not a destructive
operation, if the caller doesn't free the struct the linked list remains
intact. (llist_pop() returns a pointer to the next element of the list
and advances our position pointer, but it doesn't modify the list. The
position pointer we're feeding it does not live inside the list, and we
can reset that position pointer to the start of list for the next file
we need to evaluate.)

> If we are to avoid putting the iname pattern in arglist, I
> see the two options below.  Any preferences which to go with?
> 
> 1. We could directly convert the argument to lowercase in the initial
> parsing pass instead of storing it in arglist.

No, I don't want to write to the environment space, it screws up what ps
and top sees. (I'm really annoyed that chromium-browser started
truncating its argument list so I can't reliably kill individual tabs
anymore because grep for --renderer doesn't have hits and looking for
the truncated "chro" doesn't distinguish between the plugin handlers and
tabs, and if I kill a plugin handler the exception takes down the whole
of chromium. Grrr. Toybox does NOT do that. toys.optargs[] is a new char
** array containing a subset of the argv[] tabs, but toys.argv[] is
still there unmodified.)

> This would require
> underlying support for modifying argv strings.  This works on Debian 8.x
> amd64 and on Android 6.x, and I don't see any documentation in POSIX
> that indicates it is disallowed.  Are there any known supported platform
> for which this would not work?

I think it's in the ELF spec where it describes what gets loaded into
which segment, but mostly I just care that it works on Linux (which it
does).

However: dowanna do that because screwing up what ps sees is bad. (Not
doing that was a design goal of lib/args.c: never permute the inherited
environment space. Well, don't permute argv[]. envp[] is another can of
worms, which has apparently gotten a proper fix in libc since
http://landley.net/notes-2006.html#24-10-2006 or at least musl-libc does
it right. But that isn't displayed by ps because /proc/self/environ is
chmod 400 so other UIDs can't read it. :)

Speaking of which, I have a todo item to have toybox mount read an
OPTIONS environment variable and add it to -o so you can do -o
password=blah for nfs mounts and such without it showing up in ps. (Do
"OPTIONS=password=blah mount -t nfs -o blahblahblah...")

> 2. We could leverage the FNM_CASEFOLD flag to fnmatch(3).  This is a GNU
> extension but it is supported on both mainline Linux (glibc) and Android
> (bionic).  It also seems supported on FreeBSD, and probably other
> *BSDs.  Are there any known supported platforms that do not support this
> flag?

It was an easy fix. The -iname code block was doing the wrong thing. It
just took me a while to work this far down my todo list. (Sorry about that.)

Rob



More information about the Toybox mailing list