[Toybox] [PATCH] Add support for 1024 as well as 1000 to human_readable.

enh enh at google.com
Thu Sep 3 20:57:20 PDT 2015


On Thu, Sep 3, 2015 at 6:52 PM, Rob Landley <rob at landley.net> wrote:
> On 08/28/2015 09:47 PM, James McMechan wrote:
>>> Date: Mon, 24 Aug 2015 20:47:03 -0500
>>> From: rob at landley.net
>>> To: enh at google.com
>>> CC: toybox at lists.landley.net
>>> Subject: Re: [Toybox] [PATCH] Add support for 1024 as well as 1000 to human_readable.
>>>
>>> On 08/24/2015 03:10 PM, enh wrote:
>>>> On Sun, Aug 23, 2015 at 6:20 PM, Rob Landley <rob at landley.net> wrote:
>>>>> /me wists for a specification. Oh well. I hate when I have to guess at
>>>>> what the right behavior _is_...
>>
>> Well checking back with my copy of "Engineering Fundamentals and Problem Solving" A. Eide et al 1979 Ch 5
>> Engineering units are 0.1 to 999 followed by a space, prefix and SI unit.
>>
>> I am of the opinion that gratuitous loss of precision should be avoided.
>> Since a one-character prefix and decimal point take two character spaces the natural
>> breakpoint would be 10000 e.g. 9998,9999,10 k for SI decimal notation.
>> Using the IEC two-character binary prefix Ki/Mi/Gi uses three spaces with the '.'
>> This would however yield a breakpoint at 100 000 or 10 000 if we use a thousands separator.
>> Which seems to me a bit large.
>
> I already fixed it a different way (just took me a while to debug and
> check it in), but I see you added a couple more options.
>
> Are these options we actually need? (I.E. expand 1023 and the force use
> of units immediately?) They probably wouldn't be hard to add, but do we
> have anything that actually needs them yet? (Is this compatible with the
> bsd version and thus something we could push the posix guys to
> standardize circa 2030 or so? Ok, more like sometime in the late 2040's.
> Ok, let's face it: I don't engage with the Posix committee much because
> interacting with Jorg Schilling is not something I'm willing to do in a
> hobbyist capacity.)

BSD has (https://www.freebsd.org/cgi/man.cgi?query=humanize_number&sektion=3):

     The following flags may be passed in scale:

  HN_AUTOSCALE     Format the buffer using the lowest multiplier
                   possible.

  HN_GETSCALE      Return the prefix index number (the number of
                   times number must be divided to fit) instead of
                   formatting it to the buffer.

     The following flags may be passed in flags:

  HN_DECIMAL       If the final result is less than 10, display it
                   using one decimal place.

  HN_NOSPACE       Do not put a space between number and the prefix.

  HN_B             Use `B' (bytes) as prefix if the original result
                   does not have a prefix.

  HN_DIVISOR_1000  Divide number with 1000 instead of 1024.

  HN_IEC_PREFIXES  Use the IEE/IEC notion of prefixes (Ki, Mi,
                   Gi...).  This flag has no effect when
                   HN_DIVISOR_1000 is also specified.

in the entire tree, there's only one use of HN_GETSCALE
(/usr/bin/procstat), and it doesn't look like that's actually
necessary.

HN_DECIMAL and HN_NOSPACE are used a lot: ls, df, du, and so on. HN_B
is used less, but in df, du, and vmstat. HN_DIVISOR_1000 is only
really used in df (it's also used once each in "edquota" and
"camcontrol").

HN_IEC_PREFIXES isn't used at all. not even a test.

so until we find a place where we want to turn off HN_DECIMAL, we're
good. (that's a harder thing to grep for, but i couldn't find an
instance in FreeBSD.)
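
to make the flag behavior above concrete, here's a minimal sketch of a formatter that honors only the four flags that actually get used. everything in it is an assumption for illustration: the SK_* names and sketch_humanize() are hypothetical (not the BSD or toybox API), the prefix letters and truncating decimal are guesses at reasonable behavior, and it uses the "test against 999" breakpoint, so 1000..1023 print as "0.9 k" in 1024 mode.

```c
#include <stdio.h>

/* Hypothetical flags for this sketch; they mirror, but are not, BSD's HN_*. */
#define SK_DECIMAL  1  /* one decimal place when the scaled result is < 10 */
#define SK_NOSPACE  2  /* no space between the number and the prefix */
#define SK_B        4  /* append 'B' when no prefix was needed */
#define SK_DIV_1000 8  /* divide by 1000 instead of 1024 */

/* Format num into buf (of size len). Returns what snprintf() returns. */
static int sketch_humanize(char *buf, size_t len, unsigned long long num,
                           int flags)
{
  static const char prefixes[] = "kMGTPE";
  unsigned long long divisor = (flags & SK_DIV_1000) ? 1000 : 1024;
  unsigned long long rem = 0;
  const char *space = (flags & SK_NOSPACE) ? "" : " ";
  int scale = 0;

  /* Divide until at most three digits remain, remembering the last
     remainder so one (truncated) decimal digit can be derived from it. */
  while (num > 999 && scale < 6) {
    rem = num % divisor;
    num /= divisor;
    scale++;
  }

  /* No prefix needed: optionally append "B". */
  if (!scale) {
    if (flags & SK_B) return snprintf(buf, len, "%llu%sB", num, space);
    return snprintf(buf, len, "%llu", num);
  }

  /* Scaled result below 10: show one decimal place if requested. */
  if ((flags & SK_DECIMAL) && num < 10)
    return snprintf(buf, len, "%llu.%llu%s%c", num, rem * 10 / divisor,
                    space, prefixes[scale - 1]);

  return snprintf(buf, len, "%llu%s%c", num, space, prefixes[scale - 1]);
}
```

for example, sketch_humanize(buf, sizeof(buf), 1536, SK_DECIMAL|SK_NOSPACE) writes "1.5k", and the same call with flags set to SK_B on 512 writes "512 B".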

>>>> yeah, i was actually trying to avoid ending up with all the heuristics
>>>> the BSD implementation has.
>>>>
>>>> the BSD man page says:
>>>>
>>>> If the formatted number (including suffix) would be too long to fit into
>>>> buf, then divide number by 1024 until it will.
>>>
>>> That's just "test against 999, divide by 1024". Easy enough.
>>>
>>>> The len argument must be at least 4 plus the length of suffix, in order
>>>> to ensure a useful result is generated into buf.
>>>
>>> That constraint's already implicit. I should make sure it's explicit.
>>>
>>>> so it certainly seems they follow the "no more than three digits/two
>>>> digits plus '.'" rule.
>>>
>>> I can work with this.
>>>
>>> Thanks,
>>>
>>> Rob
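
the two rules quoted above ("test against 999, divide by 1024" and "len must be at least 4 plus the length of suffix") can be sketched as a pair of small helpers. both function names are hypothetical and exist only for this illustration; the first returns the number of divisions, which is roughly what the man page describes for HN_GETSCALE.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count how many times num must be divided by 1024
   before it fits in three digits (is at most 999). */
static int sketch_scale_index(unsigned long long num)
{
  int scale = 0;

  while (num > 999) {
    num /= 1024;
    scale++;
  }

  return scale;
}

/* Hypothetical helper: the man page's length constraint, made explicit.
   Returns nonzero when len is at least 4 plus the length of suffix. */
static int sketch_len_ok(size_t len, const char *suffix)
{
  return len >= 4 + strlen(suffix);
}
```

note that anything from 1000 through 1023 already counts as one division under this rule, which is where the "0.9"-style results come from.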
>>
>> Attached is a patch that should allow for 0..9999, 10 k..999 k, 1.0 M..999 M SI units
>> 0..9999, 9.8 Ki..999 Ki, 1.0 Mi..999 Mi... IEC binary units, note the 9999 -> 9.8 Ki transition
>> I have tested this with LE32 BE32 LE64 while I have BE64 sparc I do not have a BE64 userspace
>> and my other BE64 system is still on order.
>
> If this behaves differently on big or little endian, your compiler is at
> fault. And long long should be 64 bit on 32 bit or 64 bit systems, due
> to LP64. (There's no spec requiring long long _not_ be 128 bit, which is
> a bit creepy, but nobody's actually done that yet that I'm aware of. I
> should probably use uint64_t but the name is horrid and PRI_U64 stuff in
> printf is just awkward, and it's a typedef not a real type the way
> "int", "long", and "long long" are...)
>
>> You can also set flags to drop the space between number and prefix or use the ubuntu 0..1023 style
>> also you can request the limited range 0..999, 1.0 k-999 k style in either SI or IEC
>
> Yes, but why would we want to?
>
>> This is pure integer, I could open code the printf also as it can only have 4 digits maximum at the moment.
>> If you want I could make it autosizing rather than just one decimal between 0.1..9.9
>> Also if any of the symbols are defined to 0 the capability will drop out.
>> Perhaps I should make it default to IEC "Ki" style? getting it right vs bug compatibility.
>>
>> I made a testing command e.g. toybox_human_readable_test to allow me to test it.
>
> I had toys/examples/test_human_readable.c which I thought I'd checked in
> a couple weeks ago but apparently forgot to "git add".
>
> (If you git add a file, git diff shows no differences, mercurial diff
> shows it diffed against /dev/null. I'm STILL getting used to the weird
> little behavioral divergences.)
>
>> I hope this is interesting.
>
> It's very interesting and I'm keeping it around in case it's needed. I'm
> just trying to figure out if the extra flags are something any command
> is actually going to use. (And that's an Elliott question more than a me
> question, I never use -h and it's not in posix or LSB.)
>
> Rob
> _______________________________________________
> Toybox mailing list
> Toybox at lists.landley.net
> http://lists.landley.net/listinfo.cgi/toybox-landley.net



-- 
Elliott Hughes - http://who/enh - http://jessies.org/~enh/
Android native code/tools questions? Mail me/drop by/add me as a reviewer.
