[Toybox] top -H segfault

Sat Jun 4 14:46:11 PDT 2016

On 06/04/2016 11:08 AM, enh wrote:
> On Fri, May 27, 2016 at 12:56 PM, Rob Landley <rob at landley.net> wrote:
>> A thread won't (unless there's a / in a package name, which should never
>> happen), but non-threaded commands might, and could easily overflow the
>> display field before prividing useful info. So I'll stick with the /path
>> trimming I guess.
>>
>> Sorry for the delay, I found another display bug: the recent fiddling
>> with the width logic to add overflow capability meant that when the last
>> field was trimmed to screen width size it was always right justified.
>> The "trim to width" logic didn't propogate the sign of the padding
>> value, negative means left justified. Fixed now.
> 
> sorry, been busy/out but finally got chance to look at this.

I've been a bit distracted recently too, $DAYJOB's is having a Cash Flow
Thing and I was a contractor, so I'm job hunting again.

(When a startup's largest customer applies for patents on the product
the startup is selling them, then when the startup complains stops
paying the startup's invoices for what it assures everyone are
completely unrelated reasons... apparently 9 months of this can add up
to a Cash Flow Thing.)

> TNAME seems broken in a couple of ways. it's empty for kernel threads

Ok, so:

  For normal processes, show argv[] skipping initial path.
  For threads, show thread parent's argv[] -path.
  For _kernel_ threads, show stat[2].

I have no idea how to write the help entry for that. Um, let's see...
The [stat2] behavior should happen for _any_ blank name field in
-o ARGS,CMD,CMDLINE,COMM,COMMAND,NAME,TNAME because leaving it blank
isn't useful. But we should ONLY do that if it _is_ blank otherwise.

Right.

I checked in a thing, but maybe I should do a follow-up changing the
truncation behavior to the "leftside+" thing instead of showing the
right-side data. And maybe take another stab at rephrasing the help text
to explain all this...

> which is a problem because TNAME is now the appropriate replacement for NAME
> here (otherwise on a booted system, everything shows up as
> app_process):
> 
>   else if (CFG_TOYBOX_ON_ANDROID)
>     not_o = "USER,%sPPID,VSIZE,RSS,WCHAN:10,ADDR:10=PC,S,NAME";
> 
> TNAME should also have width -27 like the other "long" name fields,
> rather than -15 like the "short" ones.

I got the -27 thing this pass, I can change the android default later
today along with the truncation and help text changes.

> speaking of which, given that it's 2016 it's time the VSIZE and RSS
> fields were wider too.

Yeah, point. I just bumped 'em up two each. I also note that as numeric
fields they don't get truncated anymore.

> but getting back to TNAME... if we use TNAME rather than NAME by
> default on Android, with the current implementation that implies -H.
> (so you get 84 lines all saying "system_server" instead of 1.)

I didn't understand that part.

> which brings me to: TNAME isn't really "thread name". the way
> everyone's thread library works, COMM is actually the "thread name"
> (that is, the name that [can be] different for each thread in a
> process). TNAME is really "argv[0] of the containing process", which
> is definitely a useful thing, but not something i can easily think of
> a snappy name for.

Me neither.

Right now we have ARGS, CMD, CMDLINE, COMM, COMMAND, NAME, and TNAME,
all showing slightly different info from three sources (stat[2], args[],
and /proc/self/exe)

Posix requires the following default output field:

  CMD The command name; the full command name and its arguments are
      written under the -f option.

Again, 3 possible sources for said command name in modern Linux, and
that's without doing a readlink -f on /proc/self/exe to see if it's a
symlink to a different actual file. :)

Posix also says these -o values should be recognized:

 comm The name of the command being executed ( argv[0] value) as a
      string.

 args The command with all its arguments as a string. The
      implementation may truncate this value to the field width; it is
      implementation-defined whether any further truncation occurs. It
      is unspecified whether the string represented is a version of the
      argument list as it was passed to the command when it started, or
      is a version of the arguments as they may have been modified by
      the application. Applications cannot depend on being able to
      modify their argument list and having that modification be
      reflected in the output of ps.

It also says that -o args and -o comm should have as their default
display header "COMMAND". And yes comm and args are lower case while CMD
and COMMAND are upper case.

So, what does posix tell us:

1) CMD should show just the name, unless we have -f in which case it
should show arguments too. Whether or not we truncate the path isn't
explicitly specified, although "name" vs  "full name" could be without
path vs with path? Maybe? A guess?

2) comm should be just argv[0]. Whether or not we truncate a path isn't
specified.

3) args is all arguments. My initial reading went "CMD said command name
and its arguments so this DOESN'T include command name", but I'm now
assuming this includes the command name itself (since that's argv[0])
and this is just badly phrased.

4) COMMAND is used as a display header. In toybox, if it has a field
label then you can pass that field in as an -o argument to get that data
because users are going to try that anyway, but along the way i
repurposed COMMAND for one of the other meanings because each -o should
have a unique meaning.

So, in toybox, the fields posix mentioned become:

  CMD is the strange jekyl and hyde behavior posix wants with -f

  ARGS is the full command line (argv[] -path)

  COMMAND is /proc/$PID/exe

  COMM is stat[2] (posix says argv[0] but that wasn't reliably there)

I've added CMDLINE, NAME, and TNAME.

  CMDLINE is /proc/$PID/cmdline (I.E. argv[] without stripping path
          from argv[0])

  NAME is COMMAND - path

  TNAME is the new one we've been fiddling with that now does 3 things.

I note that LABEL is used elsewhere (SELinux security label).

Logically, what we SHOULD have is:

  1) /proc/$PID/exe
  2) argv[0]
  3) stat[2]
  4) argv[all]
  5) a way of requesting path truncation on any of the above.
  6) a way to show thread parent's version of any of the above.

I am open to suggestions here. :)

(Cache invalidation, naming things, off by one errors, and a fanatical
devotion to the pope. I'll come in again.)

(Does anyone under 30 actually _get_ monty python references anymore?)

> so the good news -- there is good news! -- is that i think i now have
> all the pieces to replace Android's old top. TNAME gives me what we've
> historically called the "process name"

PNAME?

> and COMM gives me what we've
> called the "thread name". something like this looks right to me
> (modulo the question of whether we should rename TNAME and the fact
> i'll have to manually size the VSZ/RSS fields or they'll always be
> unaligned):

I can expand 'em.

> $ adb shell toybox top -b -n 1 -o
> pid,tid,user,pr,ni,%cpu,s,vsz,rss,pcy,tname,comm -s 6
> Tasks: 556 total,   1 running, 555 sleeping,   0 stopped,   0 zombie
> Mem:   1879752k total,  1250856k used,   628896k free,     2284k buffers
> Swap:   520908k total,        0k used,   520908k free,   538276k cached
> 200%cpu  14%user   0%nice  34%sys 152%idle   0%iow   0%irq   0%sirq   0%host
>   PID   TID USER     PR  NI[%CPU]S    VSZ   RSS PCY TNAME
>          COMM
>  1942  1942 root     20   0 48.0 R   9396  2432  fg toybox
>          toybox

48% cpu eaten by top does not look right. :(

(Although the initial delay period for first pass data collection is 1/4
second so that's actually tying up the processor for 1/8 of a second to
produce that output, which isn't as bad.)

>     8     8 root     20   0  2.0 S      0     0  fg
>          rcu_preempt
>   190  1943 root     20   0  0.0 S  14372  1096  fg adbd
>          shell srvc 40
>  1939  1939 root     20   0  0.0 S      0     0  fg
>          kworker/0:3

You're showing me what it _is_ doing, not what you want. Let me try to
guess what you want.

In this case pid==tid, so... you want TNAME to show argv[0] of the
current process in that case? Maybe?

Something like the attached patch, maybe?

>  1923  1923 root     20   0  0.0 S      0     0  fg
>          kworker/0:1
>  1905  1905 root     20   0  0.0 S      0     0  fg
>          kworker/0:0
>  1893  1893 root     20   0  0.0 S      0     0  fg
>          kworker/u4:1
>  1869  1869 root     20   0  0.0 S      0     0  fg
>          kworker/1:0
>  1838  1838 root     20   0  0.0 S      0     0  fg
>          kworker/u4:2
>  1806  1806 root     20   0  0.0 S      0     0  fg
>          kworker/0:2
>  1674  1674 root     20   0  0.0 S      0     0  fg
>          kworker/1:1
>   606  1665 system   30  10  0.0 S 2239144 128196 bg system_server
>          AsyncTask #3
>   606  1652 system   20   0  0.0 S 2239144 128196 fg system_server
>          SyncHandler-0
>  1127  1648 nfc      30  10  0.0 S 1522608 62988 bg com.android.nfc
>          AsyncTask #3
>  1593  1643 u0_a30   20   0  0.0 S 1516520 64672 fg com.android.email
>          Binder_3
>  1628  1640 u0_a10   20   0  0.0 S 1501988 48620 fg
> com.android.musicfx         Binder_1
>  1628  1641 u0_a10   20   0  0.0 S 1501988 48620 fg
> com.android.musicfx         Binder_2
>  1628  1637 u0_a10   20   0  0.0 S 1501988 48620 bg
> com.android.musicfx         FinalizerDaemon
>  1628  1639 u0_a10   20   0  0.0 S 1501988 48620 bg
> com.android.musicfx         HeapTaskDaemon
>  1628  1638 u0_a10   20   0  0.0 S 1501988 48620 bg
> com.android.musicfx         FinalizerWatchd
>  1628  1636 u0_a10   20   0  0.0 S 1501988 48620 bg
> com.android.musicfx         ReferenceQueueD
>  1628  1633 u0_a10   29   9  0.0 S 1501988 48620 bg
> com.android.musicfx         Jit thread pool
>  1628  1635 u0_a10   20   0  0.0 S 1501988 48620 bg
> com.android.musicfx         JDWP
> 
> but ps has regressed as described above.

Not in enough detail.

> i tried reverting the obvious changes to NAME in
> https://github.com/landley/toybox/commit/bc8139a3a011568727219d3f1375061612e901ed
> (the slot changed from 6 to 5 and the corresponding file changed from
> cmdline to exe), but i must have missed something else because that
> leaves me with empty names for kernel threads.

The problem is you can't reference your parent's "struct carveup" when
displaying a child. (Saving a pointer to it would make the object
lifetime rules NUTS for top.) So every time I need to display a parent
string, I copy it into the child's string data. That's why it grew a new
slot[6], it's a snapshot of a parent field.

When we don't _have_ a parent, I don't want to copy that data (which
slows stuff down and wastes memory). I was leaving it blank, but I can
instead add logic to show_ps() to display an alternate field when _we_
are the parent. (I think I asked what the correct behavior was in a
previous message, but there so much to wade through here...)

The reason the attached patch is so large is there's no current field
displaying a trimmed argv[0]. The get_ps() logic that's harvesting the
parent field (within the context that we still have access to the
parent, which we don't at display time) is trimming the field for us (so
we don't snapshot more data than we have to), but I had to add extra
trimming logic on the display side to display the fallback field in the
same way.

(As always, the hard part is figuring out what the right behavior _is_.
Implementing alternates that MIGHT be what you want usually doesn't take
much.)

> anyway, here's as far as i got...
> 
> diff --git a/toys/posix/ps.c b/toys/posix/ps.c
> index 8aaeb93..755a679 100644
> --- a/toys/posix/ps.c
> +++ b/toys/posix/ps.c
> @@ -347,7 +347,7 @@ struct typography {
> 
>    // String fields
>    {"COMM", -15, -1}, {"TTY", -8, -2}, {"WCHAN", -6, -3}, {"LABEL", -30, -4},
> -  {"NAME", -15, -5}, {"TNAME", -15, -7}, {"COMMAND", -27, -5},
> +  {"NAME", -15, -6}, {"TNAME", -27, -7}, {"COMMAND", -27, -5},

slot[6] doesn't mean the same anymore. I'm not snapshotting the parent's
/proc/self/exe, I'm snapshotting their argv[0] -path.

>    {"CMDLINE", -27, -6}, {"ARGS", -27, -6}, {"CMD", -27, -1},
> 
>    // user/group
> @@ -603,7 +603,7 @@ static int get_ps(struct dirtree *new)
>    } fetch[] = {
>      // sources for carveup->offset[] data
>      {"fd/", _PS_TTY}, {"wchan", _PS_WCHAN}, {"attr/current", _PS_LABEL},
> -    {"exe", _PS_COMMAND|_PS_NAME}, {"cmdline", _PS_CMDLINE|_PS_ARGS|_PS_TNAME},
> +    {"exe", _PS_COMMAND}, {"cmdline", _PS_CMDLINE|_PS_ARGS|_PS_NAME|_PS_TNAME},

The help text says:

 COMMAND  Command name (/proc/$PID/exe)
 NAME     Command name (COMMAND -path)

I.E. PS_NAME is displaying the /proc/self/exe data. That's why the bit
was triggering loading /proc/$$/exe and not/proc/$$/cmdline.

(A lot of the complexity here is trying to keep the memory usage and CPU
time down. Spending complexity budget on optimizations, because reading
/proc and cacheing its contents is expensive in both areas.)

>      {"", _PS_TNAME}
>    };
>    struct carveup *tb = (void *)toybuf;
> @@ -1177,7 +1177,7 @@ void ps_main(void)
>    else if (toys.optflags&FLAG_l)
>      not_o = "F,S,UID,%sPPID,C,PRI,NI,ADDR,SZ,WCHAN,TTY,TIME,CMD";
>    else if (CFG_TOYBOX_ON_ANDROID)
> -    not_o = "USER,%sPPID,VSIZE,RSS,WCHAN:10,ADDR:10=PC,S,NAME";
> +    not_o = "USER,%sPPID,VSIZE:7,RSS:6,WCHAN:10,ADDR:10=PC,S,NAME";

Let's just change the defaults globally and not android-only. Memory
sizes have gotten bigger for everybody...

>    sprintf(toybuf, not_o, (toys.optflags & FLAG_T) ? "PID,TID," : "PID,");
> 
>    // Init TT.fields. This only uses toybuf if TT.ps.o is NULL
> 
> it's not clear to me why the meaning of NAME changed in the TNAME
> patch, so i'm assuming that reverting that (fully, not the
> half-working revert here) makes more sense than moving to (the
> questionably named) TNAME for ps and trying to fix the implied -H?

I wanted to only snapshot _one_ field out of the parent (and then only
when it's used). Before I was snapshotting /proc/self/exe, now I'm
snapshotting /proc/self/cmdline. (Except I'm using the argv0len measured
by get_ps() to snapshot just argv0, not the whole command line.)

In a previous message you said a process can write data to argv[0] after
forking but before spawning threads, and that this was information you
needed to know. So I switched to snapshotting that instead.

If I need to snapshot _both_ fields out of the parent, I can do that.
But that's adding a slot[7], not redefining slot[6] again.

To reiterate: there are 6 sources of "name" data here:

my stat[2]
my argv[]
my /proc/self/exe
parent stat[2]
parent argv[]
parent /proc/self/exe

And that's without getting into argv[0] vs argv[all] and path trimming.
(Which are just different ways of displaying that data.)

Everything always has stat[2], although it may be modified for threads.
argv[] can be overwritten by the process at will. /proc/self/exe is
immutable but only seems to show up for processes (not threads).

What do you _want_ this to display?

>> Also, the ARGS space trimming logic was stopping at the first space
>> (paths can have spaces in them) which is the whole reason I saved
>> argv0len in a slot before replacing low ascii stuff (including NUL
>> bytes) with spaces, and I went back through all the rationale of that
>> confirming I still need it. Reading /proc/$$/exe means I don't need it
>> to reliably display just the command name, but I _do_ need it for
>> reliably displaying all of argv0 (which TNAME does) and for stripping
>> the path from argv0 (which ARGS does).
>>
>> (Yes, along the way I've tested that you can have a command name with
>> ')' in it and the stat[2] parsing won't get confused, that you can have
>> spaces and newlines in command names and the display won't get
>> confused... Really want to get this into the regression testing but as
>> previously mentioned: these commands measure brownian motion and it's
>> easy to get reliably human readable output but really hard to get
>> reproducible machine readable output on different machines. Upgrade the
>> kernel version in your VM and it's butterfly effects all over the place...)
> 
> (this is why unit testing is superior: you'd test the display code
> independently of the data collection code, which in turn would be
> independent of the parsing code.

No, in this case I moved the trimming from one to the other. If the
testing expected the display to trim when I moved the trimming to the
data gathering part, then the test would be what was broken. If code to
trim still exists in one place but the code expects it to happen in
another, the test would be actively misleading.

I know how to do multiple types of testing, but adding layers in my code
for the sole purpose of adding layers just makes everything bigger.

> no input is hard to get at that
> point, and all input is stable and predictable.

I.E. not testing against real data, just confirming that the code
matches the assumptions I made when writing it.

The problem is those assumptions keep changing in something like this.
See the -o NAME nonsense above, I'm picking a semi-random subset of what
I can display trying to minimize my deviation from an obsolete standard
using a command line syntax that doesn't have a way to specify what it
needs to say, and even if it did the parent vs child independent object
lifetime problem (collecting two sets of data and culling them based on
sort criteria that may not retain the parent after the sort) would make
providing all the options simultaneously kind of expensive.

The hard part here is figuring out what the correct behavior _is_.
Implementing it, much less so. It becomes a bit messy when the only way
to figure out what the correct behavior is to implement a guess and
iterate based on "no, I actually need _this_" multiple times. We're on
what, revision 3 of this field so far? We've already changed the design
assumption that processes are independent, and changed the earlier
assumption that blank name fields can always be papered over with
stat[2] because it's always there. The extra tests you want would have
tested those old assumptions, which would not currently be correct, so
the _tests_ would constantly be changing. and how does that help?

> plus you can have
> realistic test data from other OSes at no extra cost.

You think Solaris is going to provide the ~50 /proc/$$/stat fields Linux
does in the same order reliably, and have exe and cmdline and the tty
harvesting stuff the same way this ps implementation expects it?

I can do a container where /proc is extracted from a tarball rather than
a mount point. I can have a series of such directories with the /proc
symlink updated at known intervals.

I can have top write output to a pty the controlling process can read
its data, swap the symlink, and read the resulting data again. (I could
have the pty block but that screws up the timing which is used to
calculate _rates_ that get displayed. Might not if it trusts the timing
data from /proc and just uses the timeout for when to next read, though.
I could give it a timeout of 9999999 and then have it feed in a key
through the pty to force update, I think space currently does that?)

I do have testing plans. They're just mostly post-1.0.

> sadly although
> gtest can be used to test C code, it's C++ itself so the tests
> themselves would have to be C++.)

The code is not the hard part. I rewrote busybox mount more or less from
scratch 3 times while I was there. The sed I did for toybox didn't use
any of the code from my old busybox version.

The hard part is figuring out what the correct behavior should be.
(Which can include correct behavior from the infrastructure so code can
be shared, I.E. the same code performs multiple functions, or more
accurately the same function in multiple properly factored contexts.)

I could write a test suite to do what you describe, but that method of
_testing_ assumes the code is the hard part.

The problems we're having with ps right now is I don't know what you
want it to display in what circumstances. Changing the ps behavior
around has been a question of testing, these are all design questions.

Rob
-------------- next part --------------

diff --git a/toys/posix/ps.c b/toys/posix/ps.c
index 82a8443..879ebfd 100644
--- a/toys/posix/ps.c
+++ b/toys/posix/ps.c
@@ -443,13 +443,24 @@ static char *string_field(struct carveup *tb, struct strawberry *field)
     sl *= -1;
     // First string slot has offset 0, others are offset[-slot-2]
     if (--sl) out += tb->offset[--sl];
+    // If TNAME is blank, show ARGS instead
+    if (which==PS_ARGS && !*out)
+      out = tb->str+tb->offset[-2-typos[which = PS_ARGS].slot];
     if (which==PS_ARGS || which==PS_NAME) {
-      int i;
+      int i, j;
 
       s = out;
       for (i = 0; (which==PS_ARGS) ? i < slot[SLOT_argv0len] : out[i]; i++)
         if (out[i] == '/') s = out+i+1;
+      i = s - out;
       out = s;
+      if (which != field->which) {
+        j = slot[SLOT_argv0len]-i;
+        if (j > 259) j = 259;
+        memcpy(buf, out+i, j);
+        buf[j] = 0;
+        out = buf;
+      }
     }
     if (which>=PS_COMM && !*out) sprintf(out = buf, "[%s]", tb->str);