[Toybox] ps crashes

enh enh at google.com
Sat Mar 11 11:22:17 PST 2017


On Sat, Mar 11, 2017 at 3:48 AM, Rob Landley <rob at landley.net> wrote:
> Back in Austin, recovering from jetlag...
>
> On 03/09/2017 12:31 PM, enh wrote:
>> i could have sworn i mentioned this already, but i can't find any
>> proof that i did.
>>
>> i haven't seen this crash personally, but automated testing has seen
>> it a few times now:
>>
>>
>> pid: 25863, tid: 25863, name: ps  >>> ps <<<
>> signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x11
>
> Huh, that's not good.
>
>> x0   0000000000000001  x1   000000557cc90468  x2   0000007fd2a2b1c8
>
> Guessing from register size/count this is arm64?
>
>> Stack Trace:
>>   RELADDR           FUNCTION                     FILE:LINE
>>   00000000000353bc  get_threads+164
>
> So +164 would be about... No idea, but presumably not right at the start
> of the function? Which rules out new being funky...
>
>> external/toybox/toys/posix/ps.c:905 (discriminator 1)
>>   000000000000dabc  dirtree_handle_callback+36
>> external/toybox/lib/dirtree.c:112
>>   000000000000dbc8  dirtree_recurse+128
>> external/toybox/lib/dirtree.c:156
>>   000000000000db14  dirtree_handle_callback+124
>> external/toybox/lib/dirtree.c:115
>>   00000000000347a0  ps_main+952
>> external/toybox/toys/posix/ps.c:1235
>>   0000000000013814  toy_exec+92
>> external/toybox/main.c:166 (discriminator 1)
>>   00000000000133d8  toybox_main+48
>> external/toybox/main.c:179 (discriminator 1)
>>   00000000000138dc  main+124                     external/toybox/main.c:237
>>
>> given the 0x11, i'm assuming this is actually 0x10 (aka 16 aka
>> sizeof(void*)/2 aka struct dirtree::child) off DIRTREE_ABORTVAL.
>
> Ok, I'm wrong, sounds like arm64 takes 164 bytes to clear its throat,
> because that sounds like:
>
>   static int get_threads(struct dirtree *new)
>   {
>     struct dirtree *dt;
>     struct carveup *tb;
>     unsigned pid, kcount;
>
>     if (!new->parent) return get_ps(new);
>     pid = atol(new->name);
>
> should be faulting on the very first line (where we have a read
> dereferencing a null-ish pointer).

from the line number addr2line gave above, it's the line marked with !
below. i've included the block between too, since that modifies
new->child.

  // Recurse down into tasks, retaining thread groups.
  // Disable show_process at least until we can calculate tcount
  kcount = TT.kcount;
  sprintf(toybuf, "/proc/%u/task", pid);
  new->child = dirtree_flagread(toybuf, DIRTREE_SHUTUP|DIRTREE_PROC, get_ps);
  TT.threadparent = 0;
  kcount = TT.kcount-kcount+1;
  tb = (void *)new->extra;
  tb->slot[SLOT_tcount] = kcount;

  // Fill out tid and thread count for each entry in group
!  if (new->child) for (dt = new->child->child; dt; dt = dt->next) {

so i assume it's that dirtree_flagread that's failing, returning
DIRTREE_ABORTVAL, and since the code that follows just assumes it's
either valid or null, we die on the line marked ! (as claimed by the
stack trace).

i don't know is why why dirtree_flagread is failing, though, and can't
reproduce it.

>> but i still don't see how we end up with DIRTREE_ABORTVAL here, so i'm
>> not sure what the right fix is.
>
> ABORTVAL is what you return when you want the recursion to stop, it
> means you found what you're looking for. Why it would be passed _into_
> the callback is a question: that shouldn't happen.
>
>> any ideas? (i'm assuming it's fallout from DIRTREE_SHUTUP, but haven't
>> worked out how so yet.)
>
> It sounds like /proc isn't mounted? Comment in lib/ditree.c right above
> dirtree_flagread() says:
>
>   // Returns DIRTREE_ABORTVAL if path didn't exist (use DIRTREE_SHUTUP
>   // to handle error message yourself).
>
> Which is because handle_callback() returns DIRTREE_ABORTVAL if it's
> passed a null pointer for new. And our various calls to
> dirtree_flagread() in ps.c seem to assume NULL as the error return,
> which isn't right.
>
> Except... the only use of get_threads() in ps is line 1237 where we feed
> it as an argument to dirtree_flagread() and that _should_ be getting
> this right. The callback shouldn't get called back with ABORTVAL in the
> first place? It's filtered out in dirtree.c line 157, which is the only
> caller of handle_callback in the file other than dirtree_flagread()...
>
> I need to look at this when I'm more awake. And yeah, that got fiddled
> with in DIRTREE_SHUTUP. Hmmm, how would I reproduce this... I just tried
> a chroot with /proc not mounted (and again with no /proc directory) and
> ps just gave me the header line with no contents, no segfault...
>
> Rob



-- 
Elliott Hughes - http://who/enh - http://jessies.org/~enh/
Android native code/tools questions? Mail me/drop by/add me as a reviewer.


More information about the Toybox mailing list