[Toybox] ps down, top to go

Rob Landley rob at landley.net
Fri May 20 14:30:58 PDT 2016


Now that I've dealt with the "ps -A not working after building top" bug,
being sick for a week due to travel, and catching up on the $DAYJOB
backlog thereof...

On 05/09/2016 07:46 PM, enh wrote:
>>>> my real problem is that i don't currently have a field that gives me
>>>> the process name in -T/-H mode.
>>>
>>> Define "process name"?
>>>
>>> There are 6 right now: args, cmd, cmdline, comm, command, and name.
>>>
>>> COMM is stat[2], NAME is argv[0] minus the path, COMMAND is argv[0] with
>>> the path.
>>>
>>> Those are the three variants of "process name", the rest show command
>>> line arguments too: CMDLINE is the full unmodified command line. ARGS is
>>> the full command line using NAME for argv[0] (I.E. minus the path to the
>>> binary you're running, if any). And then CMD is this crazy posix thing
>>> that's one of the others depending on your command line options.
>>
>> compare
>>
>> ./toybox ps -A -T -o pid,tid,comm,command
>>
>> with
>>
>> ps -A -T -o pid,tid,comm,command
> 
> (since it's taken this long and you still don't see what i'm saying, i
> guess i shouldn't assume you're seeing what i'm seeing...
> 
> here's what i see for some random chrome processes with ps:
> 
>  86993  86993 chrome          /opt/google/chrome/chrome --type=renderer --lang=e
>  86993  86997 Chrome_ChildIOT /opt/google/chrome/chrome --type=renderer --lang=e
>  86993  86999 Compositor      /opt/google/chrome/chrome --type=renderer --lang=e
>  86993  87000 CompositorTileW /opt/google/chrome/chrome --type=renderer --lang=e
>  86993  87001 CompositorTileW /opt/google/chrome/chrome --type=renderer --lang=e
>  86993  87002 CompositorTileW /opt/google/chrome/chrome --type=renderer --lang=e
>  86993  87003 CompositorTileW /opt/google/chrome/chrome --type=renderer --lang=e
>  86993  87004 handle-watcher- /opt/google/chrome/chrome --type=renderer --lang=e
>  86993  87005 HTMLParserThrea /opt/google/chrome/chrome --type=renderer --lang=e
>  86993  87008 ScriptStreamerT /opt/google/chrome/chrome --type=renderer --lang=e
>  86993 128020 WorkerPool/5655 /opt/google/chrome/chrome --type=renderer --lang=e
> 
> and then toybox (ignoring that toybox mangled the large tid for the
> last thread):
> 
> 86993 86993 chrome          /opt/google/chrome/chrome (deleted)
> 86993 86997 Chrome_ChildIOT [Chrome_ChildIOT]
> 86993 86999 Compositor      [Compositor]
> 86993 87000 CompositorTileW [CompositorTileW]
> 86993 87001 CompositorTileW [CompositorTileW]
> 86993 87002 CompositorTileW [CompositorTileW]
> 86993 87003 CompositorTileW [CompositorTileW]
> 86993 87004 handle-watcher- [handle-watcher-]
> 86993 87005 HTMLParserThrea [HTMLParserThrea]
> 86993 87008 ScriptStreamerT [ScriptStreamerT]
> 86993 12802 WorkerPool/5655 [WorkerPool/5655]
> 
> )

Threads have nothing in /proc/$$/cmdline, so command and friends have
nothing to show and fall back to the kernel thread display (the name in
square brackets), yes.
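
A minimal sketch of that fallback, to make the toybox output quoted above
easier to read. show_command() is a hypothetical helper written for this
illustration, not the code in toybox's ps: it takes whatever was read from
the cmdline file plus the short comm name, and brackets the comm name when
the command line is empty.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not toybox's actual code: pick what a COMMAND-style
 * column shows. cmdline is what was read from the task's cmdline file
 * (empty when there was nothing to read), comm is the short name. */
static void show_command(const char *cmdline, const char *comm,
                         char *out, size_t outlen)
{
  assert(comm && out && outlen);

  if (cmdline && *cmdline) snprintf(out, outlen, "%s", cmdline);
  else snprintf(out, outlen, "[%s]", comm); /* kernel-thread style */
}
```

With an empty command line and a comm of "Compositor" this produces
"[Compositor]", which is the shape of the thread lines in the toybox output.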

It sounds like the behavior you _want_ is for one of them to show the
$PID command line for this $TID? I.E. show some OTHER process's command
line because threads have magic relationships that ps needs to learn
about. Most likely this should be CMDLINE doing it.

Which is a _bit_ of a problem because the display code only has access
to a single process, it can't reach out and grab another process. While
I can stick a pointer in a slot[], the way the top and iotop logic
shuffle stacks of processes together could screw up the lifetime rules
there and traverse a stale pointer if I did that. (Can of worms, dowanna
go there.)

Hmmm. It's a layering violation: code that looks at an array of
processes calls code that looks at single processes, and the code that
looks at single processes hasn't got any way to get back to that array.

In fact it's worse than that, get_ps() populates toybuf and then _if_
we're doing fancy sorting things will memcpy() the data out into a
malloc. But if we're not, it just displays the toybuf data and frees it.
So the parent node data no longer _exists_ by the time we're displaying
the threads.

So, what we gotta do is snapshot the data into toybuf. I can add another
entry to the fetch[] array at the start of get_ps() and have that be a
zero length string for non-threads but a copy of the parent process's
command line for threads, and then have CMDLINE print that if it's
non-null, otherwise fall back to previous behavior. Actually I can be
slimy about initializing struct carveup offset[6] so it only points to
the new entry if there is one, and points to

The question is, which -o fields should show slot -7 and which should
show slot -1? I.E. when do I do the current [thread] behavior, and when
do I lie and show the parent's command line instead? (I could add a ps
--lie-about-threads option pretty easily that just replaces it for
everybody, ala set slot -1 to the cached parent data. I just dunno what
the correct behavior should be here. I got threads out of my system back
under OS/2, I really haven't dealt with them much in a posix-ish
context. Mostly because pthreads were abominable and inexplicably tied
event semaphores to variable tests for no obvious reason. "Go wake up
this thread" is a _primitive_, darn it...)
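
The snapshot idea can be sketched roughly as follows. Everything here is an
assumption made for illustration: struct task_rec, snapshot_thread() and
show_cmdline() are invented names, and the real code uses toybuf, the
fetch[] array and numbered slots rather than fixed-size members. The point
is only that each thread record carries its own copy of the parent's command
line, so the display code never has to reach back to a parent that may
already have been freed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record layout; toybox's real structures differ. Each record
 * owns its strings, so nothing points into a buffer that gets reused. */
struct task_rec {
  int pid, tid;
  char comm[16];
  char cmdline[64];        /* this task's own command line, may be empty */
  char parent_cmdline[64]; /* copy of the process's command line,
                              empty string for non-threads */
};

/* Copy the parent's command line into the thread's record while the
 * parent's data is still around. */
static void snapshot_thread(struct task_rec *t, const struct task_rec *parent)
{
  assert(t && parent);
  snprintf(t->parent_cmdline, sizeof(t->parent_cmdline), "%s",
           parent->cmdline);
}

/* CMDLINE-style display: use the snapshot if there is one, otherwise keep
 * the previous behavior (own command line, or [comm] when that is empty). */
static void show_cmdline(const struct task_rec *t, char *out, size_t outlen)
{
  assert(t && out && outlen);

  if (*t->parent_cmdline) snprintf(out, outlen, "%s", t->parent_cmdline);
  else if (*t->cmdline) snprintf(out, outlen, "%s", t->cmdline);
  else snprintf(out, outlen, "[%s]", t->comm);
}
```

Which -o fields should go through something like show_cmdline() and which
should keep the bracketed name is exactly the open question above; the
sketch does not answer it.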

>>>>> (Did you know "top -O" in ubuntu lists all the available field names? I
>>>>
>>>> i only found that out while experimenting recently. i'd assumed it
>>>> worked like ps' much more useful -O.

I implemented ps style -O for top a few days ago. If there isn't a spec,
then "what ps does" is as valid as anything...

This one swaps out PR,NI,VIRT,RES,SHR,S for what you supply in -O.

>>> The problem is ps's default output has buckets of free space and top's
>>> doesn't, so if -O inserts fields it pushes stuff off the right edge
>>> pretty quickly.
>>
>> (remember i only care about this for batch mode, for inclusion in a
>> bug report. so columns are basically unlimited. i think it's
>> reasonable to argue that using top -O and expecting to fit in 80
>> columns is clearly unreasonable, and you should use -o to choose for
>> yourself how to divide up the space.)

No longer a problem either way. :)

(Oh, the other thing the new top -O does is move the default sort to the
first -O field, instead of the CPU field. Because that's what seemed
useful.)
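
A rough sketch of the two -O behaviors just described: the
PR,NI,VIRT,RES,SHR,S group is replaced by the -O list, and the default sort
key becomes the first -O field. The columns before and after that group
(HEAD and TAIL below) and the top_fields() function are assumptions for
illustration only; the message names just the group being swapped out and
the CPU field.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed layout: only MIDDLE and the CPU field come from the message. */
#define HEAD   "PID,USER,"
#define MIDDLE "PR,NI,VIRT,RES,SHR,S"
#define TAIL   ",%CPU,%MEM,TIME+,ARGS"

/* Build the field list and default sort key for a given -O argument
 * (NULL or empty when -O was not supplied). */
static void top_fields(const char *dash_O, char *fields, size_t flen,
                       char *sortkey, size_t slen)
{
  assert(fields && flen && sortkey && slen);

  if (dash_O && *dash_O) {
    /* Length of the first comma-separated field name in the -O list. */
    int first = (int)strcspn(dash_O, ",");

    snprintf(fields, flen, "%s%s%s", HEAD, dash_O, TAIL);
    snprintf(sortkey, slen, "%.*s", first, dash_O);
  } else {
    snprintf(fields, flen, "%s%s%s", HEAD, MIDDLE, TAIL);
    snprintf(sortkey, slen, "%s", "%CPU");
  }
}
```

Under these assumptions, -O TID,NAME gives the field list
PID,USER,TID,NAME,%CPU,%MEM,TIME+,ARGS sorted by TID.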

>>>> even though you hate them, this is one of the nice things about long
>>>> options. they're easier to remember, and no one cares that you've
>>>> already taken --list-fields because they're not likely to want
>>>> --list-fields to mean anything else.
>>>
>>> I just like there to be a short option corresponding to each long option.
>>
>> many long options just aren't worth a short option.

If they aren't worth a short option, are they worth _having_?

>> there are 26*2
>> available short options, and everyone's better off if they're at least
>> somewhat mnemonic but the most important thing is that you can type
>> them quickly because you use them all the time.

Short options also group in a way long options don't.

I dunno about mnemonics, but the "lord of the rings" option to ls:

  ls -lotr

Is pretty nifty for finding the most recently modified file(s) in a
directory.

>> whereas for
>> rarely-used long options it's better if you don't waste a precious
>> short option,

Agreed.

>> and you're more likely to remember the descriptive
>> option. (long options work really well for --something/--no-something
>> pairs too.)

"The hardest part of design is keeping features out." - Don Norman.

I lean towards "if it's not worth a short option, why is it worth _doing_?"

There are exceptions for things like ls --color which only ever get used
via "alias" set in the shell profile, but they _are_ exceptions.

>>>>> (And I gotta finish ioctl...)
>>>>
>>>> that seems too broken for me to believe anyone's actually been using
>>>> it. but then one might equally well say that about the kernel's ioctl
>>>> interface and it's sadly not dead yet.
>>>
>>> I did half a replacement once and I should finish it. Alas, there's a
>>> dozen things I could say that about and the past few days I've been
>>> wrestling with j-core repository conversion.
>>>
>>> (And if there's going to be a "sysctl" command, there might as well be
>>> an ioctl command...)
>>>
>>> Sigh, I had this message half-finished for a few days and looking back
>>> I'm going "Oh right, I forgot I was in the middle of that" about 3
>>> different things. I suspect I should pull up the mailing list threads
>>> for the month to re-read on the plane...

And if I hadn't flown United, that might have been a useful thing to do.

(They're cheap for a reason.)

Rob

