[Toybox] [CLEANUP]vmstat

Fri Dec 27 20:12:07 PST 2013

So I started picking at vmstat a while back, and of course it evolved 
into a big rewrite.

First off, the terminal handling is funky. But it's funky upstream too: 
If you run "vmstat 1" in ubuntu it detects the terminal height and 
reprints the headers once per screen, but if you "vmstat 1 | cat" it 
prints them every 25 chars. This is odd: if you "ls | cat" it doesn't 
give you -c output for an 80 character screen, it drops down to "ls -1" 
behavior. Periodically reprinting the header when you cat the output is 
nuts, and there's no standard for what the correct behavior _is_, so the 
question is what will I break if I make it stop doing the crazy thing?

Meanwhile, general cleanup:

Remove _num postfix from variable names. (Storing a number is what 
integer variables do.)

A bunch of different local variable[2] arrays implies we should have a 
structure to group them, so make a struct vmstat_proc. (This information 
can probably be shared with "top", so thinking about reusability...) 
Every element in said structure is a uint64_t so we can refer to them by 
name but _also_ use it as a giant array to do table driven logic. (It's 
a bit subtle, but of the kind that deserves a comment.)

Speaking of table driven, still in main():

replace big "%2d %3d %4d" string with an array of lengths. This is used 
both for padding the header and for padding the values. (The header 
padding doesn't directly buy us much over a spaced string, but it means 
if any of the spacings change it only needs to change in one place. I 
left the ---string--- one alone because the repetition vs code tradeoff 
isn't worth it.)

Up top: collate the three separate /proc reading functions with one big 
get_vmstat_proc() that fills out the function working off of a table. 
The table is an array of strings, entries that start with / mean open a 
new file, the rest mean search for this and expect a decimal long right 
after.

I could have tried to eliminate the trailing space and : that's repeated 
a lot in the table, but it complicated the code enough it wasn't worth 
it. Specifically, searching for "name" and then sscanf(" %d") with the 
space in the sscanf opens the possibility of a search for "name " 
incorrectly triggering on "names " or similar, I want to match the whole 
string, or at least the end of it. Pondered making sure it followed a 
newline to ensure whole word match, but the first entry isn't guaranteed 
to have a newline and I'm not dragging regexes into it.

Similarly, I could have done the \0 thing instead of the {"one", "two"} 
thing, but this table is long and complicated enough I want to be able 
to _read_ it. Size vs simple tradeoff, not an obvious answer there. (I 
did it both ways in this command actually.)

Grrr. That broken "may be used uninitialized" warning is triggering 
twice in get_vmstat_proc() because gcc doesn't understand that the table 
entries are parsed in order, and processing the first entry will 
initialize "name" (and nothing else uses it before then) and processing 
the second will initialize "p" (and nothing else uses it before then), 
and it's completely deterministic and the condition it's warning about 
CAN'T HAPPEN. But it won't shut up unless we "char *p = p, *name = 
name;" just to humor it. (Unlike initializing them to 0, initializing 
them to themselves generates no code but lets the compiler know that we 
consider them initialized so it should SHUT UP about the stupid warning 
it can't comptently generate.)

(I mean yes, the table parser could be fed a bad table. If you look for 
a key before specifying a filename, or look for another entry on the 
same line as the last key without having had a key yet, it would be 
using an uninitialized variable. But the table we feed it, which is a 
constant in the same function, doesn't do that. Generating false 
positives is not the compiler's job, it's coverity's job. External 
"harser than the compiler" code analyzers are as old as "lint", this is 
not gcc's problem.)

On to runtime problems! The output spacing is funky due to modern memory 
sizes: they allocated 6 digits each (kilobyte units) for swapped, free, 
buffered, and cached. Once you have a gigabyte value in any of them, it 
uses an extra space and then all the remaining columns are offset a 
space, even if they have plenty of blank spaces in front of them. So 
track overshoots and deduct from future columns' whitespace budget. 
(Yeah, fancy thing upstream isn't doing. But upstream is stale.)

Next up... all these unit conversions are funky and complicated and I 
don't understand what's going on. What are the units on any of this?

According to the man page, swap I/O is in kilobytes per second, but disk 
I/O is in blocks/second. How big is a block these days, 512 bytes or 4k? 
No wait, it doesn't say, that, the man page doesn't _say_ kilobytes for 
swap I/O, it says "memory/second" as if _memory_ is a _unit_. So maybe 
that's blocks too! No way to tell! Grrrrr...

Very strongly tempted to just normalize them both to kilobytes and add a 
command line switch to let you say megabytes. Also, why the heck is disk 
I/O special? (What about network filesystems? Is that not I/O?)

Wandered off on a longish tangent trying to figure out the units on the 
CPU time usage (it's not
jiffies, it's user_hz, and user_hz/sec is returned by 
sysconf(_SC_CLK_TCK)), but we don't need that
because it's just percentage used in each.

Fluff out the help message a lot to explain what the fields are, 
including units.

Problem: the middle fields (si so bi bo in cs) don't remotely match what 
the other vmstat is producing:

landley at driftwood:~/toybox/clean$ ./vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- 
----cpu----
  r  b   swpd   free   buff  cache   si   so	bi	bo   in   cs us sy id wa
  1  0 268044 434940 367296 2079916    4    4    30    36  318  750 37 
7 60  1
landley at driftwood:~/toybox/clean$ vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- 
----cpu----
  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy 
id wa
  0  0 268044 433932 367304 2079912    9   10    68    81    6    8 35 
7 57  1

That's before I started this current round of cleanups: swap out is 4 in 
toybox and 10 in ubuntu, that's not even a clean doubling that's just 
way different. Hmmm...

$ vmstat && cat /proc/vmstat | grep pswpout && cat /proc/uptime
procs -----------memory---------- ---swap-- -----io---- -system-- 
----cpu----
  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy 
id wa
  0  0 268040 416984 367356 2080192    9   10    68    81    6    9 35 
7 57  1
pswpout 1608723
1494710.37 700695.57
landley at driftwood:~/toybox/clean$ python
Python 2.7.3 (default, Sep 26 2013, 20:03:06)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
 >>> print (1608723*4)/700695.0
9.18358486931

Nope, even using the second uptime value (which isn't documented, but 
might be uptime subtracting suspend time? The kernel code implies it's 
"idle" time but I dunno what it means by that. I asked on linux-kernel, 
http://lkml.indiana.edu/hypermail/linux/kernel/1312.3/00295.html but 
nobody's replied yet. It seems like adding up all processors on an SMP 
system, which the kernel is doing, would potentially give you a value 
GREATER than uptime, but... dunno.)

Anyway, even using that I can't _quite_ match the other uptime because 
9.18 doesn't even round up to 10. Except... Hang on, doing the si value:

   >>> print (1561842*4)/701092.0
   8.91091040833

Which displayed 9 when I did that. So maybe it's _aggressively_ rounding 
up, anything greater than an exact match is the next integer? Hmmm... 
No, I implemented that and I'm getting "9 9" where the other
is giving "9 10". Based on:

   pgpgin  45355001
   pgpgout  54524516
   /proc/uptime 1645947.71 773585.92

I have no flipping CLUE what the upstream version thinks it's doing.

Right, check in what I've got and research more later.

Rob