[Toybox] Fixing find segfault: questions
Tom Marshall
tom at cyngn.com
Tue Apr 26 09:17:40 PDT 2016
On 04/24/2016 11:27 PM, Rob Landley wrote:
> On 04/11/2016 04:02 PM, Tom Marshall wrote:
>> Currently toybox find (toys/posix/find.c) will segfault when the -iname
>> predicate is used on both sides of an -o predicate, the LHS evaluates to
>> true, and the expression followed with the -exec action. This is
>> because the initial parsing pass saves both the iname patterns in
>> lowercase form and the exec arguments on TT.arglist, but subsequent
>> passes do not pop the iname pattern for the RHS. The exec then attempts
>> to use the RHS pattern as the exec arguments.
>>
>> Example, run on Debian 8.x amd64:
>>
>> $ ./toybox find d \( -iname Exists.txt -o -iname foo \) -exec echo {} \;
>> Segmentation fault
> $ ./find d \( -iname Exists.txt -o -iname foo \) -exec echo {} \;
> find: d: No such file or directory
> $ ./find toys \( -iname Exists.txt -o -iname foo \) -exec echo {} \;
> $ ./find toys \( -iname "*.c" -o -iname foo \) -exec echo {} \;
>
> Ah, third one finally reproduced it. Ok, so you mean the first -iname
> actually matches and file, so the -o second test isn't run, and that
> unbalances the array because the corresponding data isn't popped.
>
> And fixed, with test added to test suite. (And two other tests fixed;
> don't test for specific error messages where toybox and other
> implementations vary, I want TEST_HOST to work. And it should work with
> the gnu/dammit implementation or with busybox as the host
> implementation, so the tests need to be reasonably generic.)
>
>> There are several possible strategies to fix this issue. The simplest
>> fix in terms of code change may be simply to always pop the iname
>> pattern regardless of whether it will be checked. However, it would be
>> more efficient to avoid putting the iname pattern in arglist
>> altogether.
> No, it wouldn't?
>
> The arglist is set up once when parsing the command line, and then
> reused for each file we test. Testing each new directory entry traverses
> the same list of command lien data; llist_pop() is not a destructive
> operation, if the caller doesn't free the struct the linked list remains
> intact. (llist_pop() returns a pointer to the next element of the list
> and advances our position pointer, but it doesn't modify the list. The
> position pointer we're feeding it does not live inside the list, and we
> can reset that position pointer to the start of list for the next file
> we need to evaluate.)
>
>> If we are to avoid putting the iname pattern in arglist, I
>> see the two options below. Any preferences which to go with?
>>
>> 1. We could directly convert the argument to lowercase in the initial
>> parsing pass instead of storing it in arglist.
> No, I don't want to write to the environment space, it screws up what ps
> and top sees. (I'm really annoyed that chromium-browser started
> truncating its argument list so I can't reliably kill individual tabs
> anymore because grep for --renderer doesn't have hits and looking for
> the truncated "chro" doesn't distinguish between the plugin handlers and
> tabs, and if I kill a plugin handler the exception takes down the whole
> of chromium. Grrr. Toybox does NOT do that. toys.optargs[] is a new char
> ** array containing a subset of the argv[] tabs, but toys.argv[] is
> still there unmodified.)
>
>> This would require
>> underlying support for modifying argv strings. This works on Debian 8.x
>> amd64 and on Android 6.x, and I don't see any documentation in POSIX
>> that indicates it is disallowed. Are there any known supported platform
>> for which this would not work?
> I think it's in the ELF spec where it describes what gets loaded into
> which segment, but mostly I just care that it works on Linux (which it
> does).
>
> However: dowanna do that because screwing up what ps sees is bad. (Not
> doing that was a design goal of lib/args.c: never permute the inherited
> environment space. Well, don't permute argv[]. envp[] is another can of
> worms, which has apparently gotten a proper fix in libc since
> http://landley.net/notes-2006.html#24-10-2006 or at least musl-libc does
> it right. But that isn't displayed by ps because /proc/self/environ is
> chmod 400 so other UIDs can't read it. :)
>
> Speaking of which, I have a todo item to have toybox mount read an
> OPTIONS environment variable and add it to -o so you can do -o
> password=blah for nfs mounts and such without it showing up in ps. (Do
> "OPTIONS=password=blah mount -t nfs -o blahblahblah...")
>
>> 2. We could leverage the FNM_CASEFOLD flag to fnmatch(3). This is a GNU
>> extension but it is supported on both mainline Linux (glibc) and Android
>> (bionic). It also seems supported on FreeBSD, and probably other
>> *BSDs. Are there any known supported platforms that do not support this
>> flag?
> It was an easy fix. The -iname code block was doing the wrong thing. It
> just took me a while to work this far down my todo list. (Sorry about that.)
>
>
Great, thanks for doing this.
More information about the Toybox
mailing list