[Toybox] [PATCH] top: don't report GiB sizes in KiB.

Thu Sep 10 11:42:50 PDT 2020

On Thu, Sep 10, 2020 at 1:42 AM Rob Landley <rob at landley.net> wrote:

> This is sort of moot now, but I composed it and hadn't hit send and it did
> explain what I was thinking at the time, so...
>
> On 9/8/20 1:55 PM, enh wrote:
> >
> >
> > On Mon, Sep 7, 2020 at 1:29 PM Rob Landley <rob at landley.net> wrote:
> >
> >     On 9/6/20 6:45 AM, Jarno Mäkipää wrote:
> >     > On Sun, Sep 6, 2020 at 12:34 PM Rob Landley <rob at landley.net> wrote:
> >     >> Elliott says there's a maximum limit on the number of digits users
> >     >> are willing to parse, and you're saying it's better to just have
> >     >> large blank gaps between the numbers than to use that space for
> >     >> anything, AND that the cap on the maximum number of digits is
> >     >> insurmountable rather than using separators like people have been
> >     >> doing for hundreds of years to cope with long numbers in "human
> >     >> readable" output?
> >     >>
> >     >> It's certainly a point of view.
> >     >
> >     > Groups of 3 are indeed easier on the eye. I would suggest using
> >     > something more sensible like spaces.
> >
> >     Which is not what any country uses by default and thus makes everyone
> >     equally do a double take? Egalitarian badness?
> >
> >     Hardwiring it to the esperanto of formats is certainly a suggestion.
> >
> >     > SI system uses spaces as thousands separator, comma and period both
> >     > being valid decimal separator.
> >     > 123 456.789 or 123 456,789
> >
> >     Ok, I'll bite: which countries teach SI to their kids in primary school?
> >
> >     >> Bravo. And bionic's libc/bionic/locale.gratuitouslycppbutactuallyc says:
> >     >>
> >     >>   // We only support two locales, the "C" locale (also known as "POSIX"),
> >     >>   // and the "C.UTF-8" locale (also known as "en_US.UTF-8").
> >     >>
> >     >> So they don't support it either.
> >     >
> >     > C.UTF-8 and en_US.UTF-8 are not the same.
> >
> >     I cut and pasted that out of the bionic source.
> >
> >     >> However, if the commas go, why doesn't the period in human_readable()
> >     >> go? I don't see how they're conceptually different?
> >
> >     I'm waiting for an opinion from Elliott, which might be a "meh?" because
> >     it's not exactly his area either.
> >
> >
> > i actually felt that 5 digits was small enough to not need separators.
> >
> > a couple of things do stand out though. here's toybox and procps-ng 3.3.16
> > on Debian on my middling "real computer":
> >
> >   Mem:   63,978M total,   53,696M used,   10,282M free,     1870M buffers
> >  Swap:   56,095M total,      0.0M used,   56,088M free,   35,929M cached
> >
> > MiB Mem :  63978.8 total,  10287.4 free,  13083.5 used,  40607.9 buff/cache
> > MiB Swap:  56096.0 total,  56088.2 free,      7.8 used.  49419.2 avail Mem
> >
> > on the whole i prefer the toybox output, apart from the bug that gets us
> > "1870M" rather than "1,870M" to match the others,
>
> It's not a bug, I did that intentionally (both because the comma's always
> been optional with only 4 digits and because what do you display if you have
> exactly 4 digits of output space?), but I can remove it again. I suppose
> passing HR_COMMAS with dgt 4 is caller error...
>
> > and the weird "0.0M". i'd always use
> > decimals or never use decimals.
>
> I can have the commas flag suppress the tenths?
>
> > toybox doesn't do as good a job on the smallest system available to me
> > right now (a phone from a couple of years ago):
> >
> >   Mem:     3931M total,     2149M used,     1782M free,     6336K buffers
> >  Swap:     2948M total,      0.0K used,     2948M free,     1485M cached
> >
> > there's the same 0.0 issue (though the ',' bug cancels itself out here
> > because all the fields are consistent and -- imho -- 4 digits is definitely
> > readable without separators anyway).
>
> I can make it so the comma is only suppressed when the output size is "4".
> If you CAN show 5 digits, add the comma.
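
A rough sketch of that rule, generalized slightly: group the digits with commas, but drop the separators when the grouped form would not fit in the available columns. format_grouped() and its width parameter are hypothetical names for illustration, not toybox's human_readable() or its HR_COMMAS handling.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

// Hypothetical helper (not a toybox function): write n into buf with ','
// between groups of three digits, falling back to plain digits when the
// grouped form would not fit in `width` columns.
// Returns the number of characters written, excluding the NUL terminator.
int format_grouped(char *buf, size_t len, unsigned long long n, int width)
{
  char digits[32];
  int count = snprintf(digits, sizeof(digits), "%llu", n);
  int commas = (count - 1) / 3;
  int i, out = 0;

  if (count + commas > width) commas = 0;   // no room: drop the separators
  assert(len > (size_t)(count + commas));   // caller must supply enough space

  for (i = 0; i < count; i++) {
    buf[out++] = digits[i];
    // Emit a comma when a whole number of 3-digit groups follows this digit.
    if (commas && i < count - 1 && (count - 1 - i) % 3 == 0) buf[out++] = ',';
  }
  buf[out] = 0;

  return out;
}
```

With a width of 4 the value 1870 stays "1870"; with a width of 5 it becomes "1,870", so every field in a row sized for 5 or more columns gets the same treatment.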
>
> > i still dislike that "buffers" is using K where the
> > others are using M.
>
> Hmmm, good point. But the real question is why didn't the earlier fields use
> all the space? Ah, I see: I gave it 8 digits and 3,931,000 is 9.
>
> I have a "force megabytes" threshold based on testing the total memory size,
> and right now that test is 10 gigs. So I should either make the display size
> 9 digits (I have 7 spaces free at the end of the buffers line on an 80 column
> screen so eating 4 is fine), or I could make the megabyte threshold be 1 gig
> instead, so 999,999 would use 7 digits and those would stay in kilobytes.
>
> I lean towards going to 9 digits, personally. Either way, the units should
> stay consistent.
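
The column arithmetic behind the two options can be checked with a few lines of C. This is only an illustration: grouped_width() and fits_as_kib() are made-up names, and the real threshold test in top may look quite different.

```c
#include <assert.h>

// Columns needed to print n with one separator between each group of three
// digits. Illustrative only; not a toybox function.
int grouped_width(unsigned long long n)
{
  int digits = 1;

  while (n >= 10) {
    n /= 10;
    digits++;
  }

  return digits + (digits - 1) / 3;
}

// Nonzero when a total given in KiB still fits, comma-grouped, in `columns`
// characters. When it does not, the whole row would switch to MiB.
int fits_as_kib(unsigned long long total_kib, int columns)
{
  assert(columns > 0);

  return grouped_width(total_kib) <= columns;
}
```

Here grouped_width(3931000) is 9, so the phone's total overflows an 8 column field but fits a 9 column one, while grouped_width(999999) is 7 and stays in kilobytes either way.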
>
> > that to me still seems like the worst issue: i think we
> > should always use the same units (which procps at least seems to do, even
> > if they are sometimes KiB and other times MiB). which is basically a
> > stronger version of the decimals complaint --- it's a table, and it's
> > really weird when different fields in the table are in different units.
>
> The design is sort of shifting out from under human_readable(). The above
> probably fixes it, but if this happens again I should step back and rethink
> the objective here. (Specifying units instead of autodetecting them,

fwiw, that's how both Android and Chrome's equivalents ended up: two
variants, one with auto-detection and another where the caller passes in
the units.

> possibly a
> version that operates on an array of values instead of one at a time.

(but that's an interesting idea too.)
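
To make the two ideas concrete, here is a minimal sketch, assuming byte counts as input: one function picks a single unit for a whole row of values, and a second formats each value in a caller-specified unit. Both common_unit() and format_in_unit() are hypothetical names, not the Android, Chrome, or toybox APIs, and the output truncates rather than rounds.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

// Hypothetical: pick one binary unit for a whole row of byte counts.
// Returns an index into "BKMGT", the smallest unit in which the largest
// value needs at most max_digits digits.
int common_unit(const unsigned long long *vals, int count, int max_digits)
{
  unsigned long long biggest = 0, limit = 1;
  int i, unit = 0;

  assert(count > 0 && max_digits > 0 && max_digits < 19);
  for (i = 0; i < count; i++) if (vals[i] > biggest) biggest = vals[i];
  for (i = 0; i < max_digits; i++) limit *= 10;   // limit = 10^max_digits

  while (biggest >= limit && unit < 4) {
    biggest >>= 10;
    unit++;
  }

  return unit;
}

// Hypothetical caller-specified variant: print val in the given unit,
// truncating any fraction. Returns what snprintf() returns.
int format_in_unit(char *buf, size_t len, unsigned long long val, int unit)
{
  assert(unit >= 0 && unit < 5);

  return snprintf(buf, len, "%llu%c", val >> (10 * unit), "BKMGT"[unit]);
}
```

Fed the phone's numbers as bytes with a 5 digit limit, the row settles on M, so the total prints as 3931M and the 6336K of buffers prints as 6M instead of switching units mid-row.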

> There's a lot of "measure all the output so it matches up" ala ls, and no
> generic plumbing to handle that, but a design for efficient plumbing to
> handle that is not immediately obvious to me. Hmmm...)
>

that's something i've wondered about a couple of times too, but i think
that's an orthogonal problem to this one. ls deliberately mismatches
units, and despite what i've said here about consistency in tables, i think
that's the right behavior for ls. you don't have any reason to expect that
the entries in ls' output are likely to be at all similar, whereas with top
you're basically carving up your available RAM, so the entries are
effectively "percentages that humans can grok better" :-)

> > i don't think on a 3GiB system that i actually want to know whether i have
> > 6.2MiB or 6.3MiB of buffers --- "6" is fine. but even if i did know, i'd
> > want to see 6.2 vs 3931.0 rather than switch units mid-row.
>
> Consistency across the group is good, I agree.
>
> >     A proper fix would be a localeconv() in libc that DOESN'T return constant
> >     stub info, which is out of scope for toybox. (And is as much an ADB thing
> >     as a bionic thing since android seems to be using adb instead of ssh, so
> >     that would have to marshal the locale environment variables from the
> >     host into the target. But I often "wait for somebody to complain", you
> >     complained, and therefore I want to fix it PROPERLY.)
> >
> >     In the meantime, I can add a call to localeconv() that would use "," if
> >     that returns "" which means right now it would be a NOP but then it's not
> >     my fault it's getting it wrong. And I can test against glibc which does
> >     have an overengineered version of this in it. Way back when uClibc had a
> >     much compressed format for the localeconv data, but didn't have a
> >     database of countries and thus copied its data from glibc, which it
> >     couldn't distribute for licensing reasons:
>
> Of course the sad part is that these are _strings_ not bytes (utf8!) which
> means the buffer size passed into human_readable_long is no longer fixed and
> easily calculable unless I cheat and say I only care about the first byte of
> the return and if they use utf8 we're outputting gibberish. (Which for right
> now, "wait for somebody to complain"...)
>
> The proper fix to that would be to malloc() the result, which is an
> imposition on all the callers and I'm gonna punt on that for now.
>
> Sigh, the locale init in main.c is sort of wrong and doesn't address
> LC_NUMERIC anyway. Right, do it in the function...
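
For reference, the first-byte "cheat" with a "," fallback might look roughly like the following. It uses only the standard localeconv() and struct lconv; thousands_sep_byte() is a hypothetical name, and this sketch does not call setlocale(), so it only reflects whatever locale the program has already selected.

```c
#include <assert.h>
#include <locale.h>

// Sketch only: first byte of the current locale's thousands separator, or
// ',' when the C library reports none (as the "C" locale does). A multibyte
// UTF-8 separator is deliberately cut to its first byte, which is the cheat
// described above and would print gibberish for such locales.
char thousands_sep_byte(void)
{
  struct lconv *lc = localeconv();

  assert(lc);

  return (lc->thousands_sep && *lc->thousands_sep) ? *lc->thousands_sep : ',';
}
```

A program that never calls setlocale() runs in the "C" locale, where thousands_sep is the empty string, so the fallback ',' is what comes back.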
>
> >       https://lists.uclibc.org/pipermail/uclibc/2015-June/049000.html
> >
> >     Rob
> >
> >     P.S. I ranted about this sort of aesthetic issue being something the
> >     open source development model can't deal with 10 years ago, almost to
> >     the day:
> >
> >       https://landley.net/notes-2010.html#13-08-2010
> >
> >     And included it in my 2013 talk:
> >
> >       https://www.youtube.com/watch?v=SGmtP5Lg_t0#t=11m30s
> >
>
> Rob
>