[Toybox] How toybox PS works.

Wed Jan 30 15:58:32 PST 2019

On 1/30/19 4:19 PM, Ed Maste wrote:
> FWIW I asked on twitter what args FreeBSD people use with ps:
> https://twitter.com/ed_maste/status/1090698703109804034

Thanks. The gnu/dammit ps implementation documents all sorts of "H is show
threads, -H is hiearchy/treeview" differences in the two option behaviors. I
haven't gone through the man page and made a list yet. Posix only mentions the
dash ones, and I implemented that first...

The bigger problem for BSD/macos is ps.c parses /proc/$PID/stat to get its
process information, and I don't think BSD even has that concept? No idea where
BSD or MacOS versions of ps get this sort of data. (Bell Labs unix read /dev/mem
and parsed the in-kernel process table structures. No really! The kernel tracked
processes using a fixed size array of process structures that the pid was just
an index into, so it wasn't _that_ hard to do if you know where in memory it
started and the races were manageable, but... not the modern method anywhere, I
expect.)

--- How toybox ps works

Toybox ps supplements the stat data from /proc with other sources (mostly also
from /proc) to populate a struct procpid instance for each process. That
structure starts with a long long slot[] array, more than half of those values
are just the stat values in order. What goes at each index position in the array
is briefly briefly described with the SLOT_ enums, see ps.c line 229 on) and
most of the rest of the space is up to 7 strings stored one after the other null
terminated. (The "state" field is a latter, it's the "-o S" field.) This data is
assembled into a 4k buffer, so all the strings together can add up to at most
4k, and values will be truncated otherwise. (Each string has 256 bytes reserved
for it, but can overflow into the remaining space on a first come first served
basis.)

The slot[] values are indexed by SLOT_ macros (the big enum) and then there's a
big array "struct typography typos[]" (line 313 on) which provides default
display information for each one (which header name goes with which slot which
is also the -o name used to select it, whether it's left or right justified and
how much to pad it to by default, what help text to show for this field...) but
the actual display is done by a function show_ps() which calls string_field() to
get a string representation of a field's data, and string_field() has buckets of
if/else special case behavior for maybe 20% of the fields (scaling and units and
so on).

The display is controlled by a linked list of "struct ofield" which is the list
of fields to display, the lib/ function comma_args() breaks down comma separated
strings and passes each member to a callback (such as parse_ko()) to handle it,
and that assembles the list from either he command line options like -k and -o
or from default strings (line 1302 on).

So main() parses the command line arguments to populate the struct ofield *
list, then dirtree("/proc") calls get_ps() on each directory entry, which reads
system data to fill out struct procpid *tb which is passed through filters (like
ps_match_process()) to work out what to display, and the survivors are sent to
show_ps() which uses struct ofield * list to figure out what fields to examine
and calls string_field() on each to write a string into a buffer, then trims and
pads and outputs them.

If you're not sorting the output each struct procpid gets handled right after
it's read, and stays in toybuf. If you _are_ sorting the 4k buffer copied to a
fresh  malloc() and appended to a linked list, which gets converted to an array
and sorted after all the processes are read, and then iterated over and
show_ps() called on each one.

Top reuses about 90% of this, except it reads _two_ arrays of struct procpid
with a known time gap between them, and diffs them. It stores the diff results
in the "old" procpid slot[] array and sorts/displays "old", and then next cycle
frees the "old" copy and makes the "new" one the "old", then reads another
"new", etc.

--- changing this or BSD

Most of this shouldn't need to change for BSD, the process data is read from he
system by get_ps() and in _theory_ BSD would just need an alternate
implementation of get_ps().

In practice, the dirtree infrastructure that finds the processes (lib/dirtree.c)
is recursive directory descent plumbing populating a tree of struct dirtree *
structures for each entry and calling a callback function to filter and/or act
upon them. Which means A) it assumes it's reading a filesystem to populate, B)
our get_ps() is a callback function that knows about the /proc layout. There's
also a get_threads() shim that descends an extra level for Linux's weird thread
layout (and then calls get_ps() to do the reading once it's identified the next
directory with something interesting).

The OTHER fun thing is that get_ps() uses the struct ofields list to know what
data to bother collecting. It's an optimization that avoids reading data we'd
just throw out. (String parsing is expensive!) Entire categories of things like
the iotop data don't get read in unless we're in the right mode. What happens is
the comma_args() callbacks that populate struct ofields also set a bit array in
TT.bits indicating which fields are in use, see line 302 for a comment on that.
The TAGGED_ARRAY plumbing is compile-time header generation, scripts/make.sh
builds and runs scripts/mktags.c to create generated/tags.h).

Rob