[Toybox] [PATCH] Add support for 1024 as well as 1000 to human_readable.
James McMechan
james_mcmechan at hotmail.com
Fri Aug 28 19:47:52 PDT 2015
> Date: Mon, 24 Aug 2015 20:47:03 -0500
> From: rob at landley.net
> To: enh at google.com
> CC: toybox at lists.landley.net
> Subject: Re: [Toybox] [PATCH] Add support for 1024 as well as 1000 to human_readable.
>
> On 08/24/2015 03:10 PM, enh wrote:
>> On Sun, Aug 23, 2015 at 6:20 PM, Rob Landley <rob at landley.net> wrote:
>>> /me wists for a specification. Oh well. I hate when I have to guess at
>>> what the right behavior _is_...
Well, checking back with my copy of "Engineering Fundamentals and Problem Solving" (A. Eide et al., 1979, Ch. 5):
engineering units are 0.1 to 999 followed by a space, prefix and SI unit.
I am of the opinion that gratuitous loss of precision should be avoided.
Since a one-character prefix and decimal point take two character spaces, the natural
breakpoint would be 10000, e.g. 9998, 9999, 10 k for SI decimal notation.
Using the IEC two-character binary prefix (Ki/Mi/Gi) uses three spaces with the '.'.
This would, however, yield a breakpoint at 100 000, or 10 000 if we use a thousands separator,
which seems to me a bit large.
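The character counting above can be written out as a small sketch. The helper name breakpoint() and its parameters are made up for illustration and are not part of the patch; it assumes the shortest scaled form is "D.D" plus the prefix and, like the count above, ignores the space before the prefix.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of toybox. The shortest scaled output is
 * "D.D" (3 characters) plus the prefix, so an unscaled number may use that
 * many characters, minus any thousands separator, before scaling is shorter.
 * Returns the first value that gets scaled. */
static uint64_t breakpoint(int prefix_chars, int separator_chars)
{
  int digits = 3 + prefix_chars - separator_chars;
  uint64_t limit = 1;

  assert(digits > 0 && digits < 20);  /* 10^19 still fits in uint64_t */
  while (digits--) limit *= 10;

  return limit;
}
```

With these assumptions breakpoint(1, 0) gives 10000 for "k", breakpoint(2, 0) gives 100000 for "Ki", and breakpoint(2, 1) gives 10000 for "Ki" with a thousands separator.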
>> yeah, i was actually trying to avoid ending up with all the heuristics
>> the BSD implementation has.
>>
>> the BSD man page says:
>>
>> If the formatted number (including suffix) would be too long to fit into
>> buf, then divide number by 1024 until it will.
>
> That's just "test against 999, divide by 1024". Easy enough.
>
>> The len argument must be at least 4 plus the length of suffix, in order
>> to ensure a useful result is generated into buf.
>
> That constraint's already implicit. I should make sure it's explicit.
>
>> so it certainly seems they follow the "no more than three digits/two
>> digits plus '.'" rule.
>
> I can work with this.
>
> Thanks,
>
> Rob
Attached is a patch that should allow for 0..9999, 10 k..999 k, 1.0 M..999 M with SI units and
0..9999, 9.8 Ki..999 Ki, 1.0 Mi..999 Mi... with IEC binary units; note the 9999 -> 9.8 Ki transition.
I have tested this on LE32, BE32 and LE64. While I have a BE64 sparc, I do not have a BE64 userspace,
and my other BE64 system is still on order.
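To make the ranges above concrete, here is a minimal standalone sketch. It is not the attached patch: hr_sketch() and hr_sketch_demo() are hypothetical names, the flags are reduced to a single use_si argument, the "i" is always printed for binary units, and rounding uses only the remainder of the last division.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch, not the patch itself. Formats num into buf using
 * 0..9999 unscaled, then one decimal below 10 and whole numbers up to 999.
 * use_si selects 1000 and "k"; otherwise 1024 and "Ki". Integer only. */
static int hr_sketch(char *buf, size_t len, uint64_t num, int use_si)
{
  const char *prefixes = use_si ? "kMGTPE" : "KMGTPE";
  const char *infix = use_si ? "" : "i";
  uint64_t divisor = use_si ? 1000 : 1024;
  uint64_t limit = 9999, rem = 0;
  int unit = -1;  /* index into prefixes, -1 means unscaled */

  assert(buf && len >= 16);

  while (num > limit) {
    rem = num % divisor;  /* keep the last remainder for rounding */
    num /= divisor;
    unit++;
    limit = 999;          /* once scaled, only three digits are allowed */
  }
  if (unit < 0) return snprintf(buf, len, "%u B", (unsigned)num);

  if (num < 10) {
    /* value in tenths of the unit, rounded half up */
    unsigned tenths = (unsigned)(((num * divisor + rem) * 10 + divisor / 2)
                                 / divisor);

    if (tenths < 100)
      return snprintf(buf, len, "%u.%u %c%sB", tenths / 10, tenths % 10,
                      prefixes[unit], infix);
    num = 10;  /* 9.95 and above rounds up to a plain 10 */
    rem = 0;
  }
  if (2 * rem >= divisor) num++;  /* round whole units half up */
  if (num > 999)                  /* rounding carried into the next prefix */
    return snprintf(buf, len, "1.0 %c%sB", prefixes[unit + 1], infix);

  return snprintf(buf, len, "%u %c%sB", (unsigned)num, prefixes[unit], infix);
}

/* Usage example: returns 1 if 10000 formats as "9.8 KiB" in binary units. */
static int hr_sketch_demo(void)
{
  char buf[16];

  hr_sketch(buf, sizeof(buf), 10000, 0);
  return !strcmp(buf, "9.8 KiB");
}
```

Under these assumptions 9999 stays "9999 B", 10000 becomes "10 kB" with use_si set and "9.8 KiB" without it, and 999999 with use_si set rounds up to "1.0 MB".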
You can also set a flag to drop the space between number and prefix, or to use the Ubuntu 0..1023 style.
You can also request the limited range 0..999, 1.0 k..999 k style in either SI or IEC.
This is pure integer; I could open-code the printf as well, since it can only have 4 digits maximum at the moment.
If you want, I could make it autosizing rather than just one decimal between 0.1..9.9.
Also, if any of the symbols are defined to 0, the capability will drop out.
Perhaps I should make it default to IEC "Ki" style? Getting it right vs. bug compatibility.
I made a testing command, e.g. toybox_human_readable_test, to allow me to test it.
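As a rough idea of what open-coding the printf might look like, given that the value can have at most 4 digits, here is a sketch. The function name put_up_to_4_digits() is hypothetical and this code is not in the attached patch.

```c
#include <assert.h>

/* Hypothetical stand-in for sprintf(buf, "%d", value) when value is known
 * to be 0..9999. Writes the digits plus a terminating NUL and returns the
 * number of digits written, so buf needs room for 5 bytes. */
static int put_up_to_4_digits(char *buf, unsigned value)
{
  int end = 0;

  assert(buf && value <= 9999);
  if (value >= 1000) buf[end++] = '0' + value / 1000;
  if (value >= 100) buf[end++] = '0' + (value / 100) % 10;
  if (value >= 10) buf[end++] = '0' + (value / 10) % 10;
  buf[end++] = '0' + value % 10;
  buf[end] = 0;

  return end;
}
```

The return value matches what sprintf would return for the same input, so it could stand in for the sprintf call that sets end, assuming the 4 digit limit keeps holding.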
I hope this is interesting.
Jim McMechan
-------------- next part --------------
diff -Nuwdr toybox.old/lib/lib.c toybox/lib/lib.c
--- toybox.old/lib/lib.c 2015-02-25 18:42:24.000000000 -0800
+++ toybox/lib/lib.c 2015-08-28 18:31:30.612911576 -0700
@@ -862,25 +862,72 @@
closedir(dp);
}
-// display first few digits of number with power of two units, except we're
-// actually just counting decimal digits and showing mil/bil/trillions.
-int human_readable(char *buf, unsigned long long num)
+// display numbers like 972 KiB or 1.2 MiB for humans to read.
+int human_readable(char *buf, uint64_t num, int flags)
{
- int end, len;
+ // either 1000 for SI units or 1024 for 2^10 ComSci units
+ int divisor = (flags & HR_SI) ? 1000 : 1024;
- len = sprintf(buf, "%lld", num);
- end = ((len-1)%3)+1;
- len /= 3;
+ // if we care about k vs K and yes it would require uint128 before Z,Y or overflow #
+ const char *units = (flags & HR_SI) ? "\0kMGTPEZY#" : "\0KMGTPEZY#";
- if (len && end == 1) {
- buf[2] = buf[1];
- buf[1] = '.';
- end = 3;
+ // 0-9999 as themselves otherwise lose 2 digits of precision for 1000-9999
+ int final_scale = 999;
+ int scale = (flags & HR_SHRINK_RANGE) ? final_scale : 9999;
+
+ // if we want the range 0..1023[KMGTPEZY] instead. note requires more digits
+ // and acts as HR_SHRINK_RANGE if not using ComSci units
+ if (flags & HR_EXPAND_1023) final_scale = scale = divisor - 1;
+
+ // fixed is fixed point @ 100X will have after scaling and rounding a range of 50-999950
+ // overflowed values will be discarded because it will require scaling
+ int fixed = (100 * num) + 50;
+
+ // if we require scaling for true number or because of rounding on fixed point number
+ while ((num > scale) || (fixed > ((100 * scale)+99))) {
+ // now since we are scaled we can only use the limited digits + suffix
+ scale = final_scale;
+
+ // now compute new fixed value before changing num
+ fixed = ((100 * num) / divisor);
+
+ // scale the true number for testing
+ num /= divisor;
+
+ // rounding if > 9.99 add 0.5 ( * 100 since this is fixed point ) e.g. XX.5 values
+ // otherwise add 0.05 ( * 100 since this is fixed point ) e.g. X.X5 values
+ if (fixed > 999 ) fixed += 50;
+ else fixed += 5;
+
+ if (units[1]) units++; // stop before we run out of string
}
- buf[end++] = ' ';
- if (len) buf[end++] = " KMGTPE"[len];
- buf[end++] = 'B';
+
+ int end;
+ end = sprintf(buf, "%d", fixed/100);
+
+ // if we have scaled then check for decimal point and add suffix
+ if (*units) {
+ // if the fixed value is less than 10 we can have a decimal point
+ // otherwise just stick on the space and units after 10-999
+ if (fixed < 1000) {
+ buf[end++] = '.';
+ buf[end++] = '0'+ ((fixed / 10) % 10);
+ }
+
+ // engineering practice is to always have a space before units
+ if (!(flags & HR_NO_SPACE)) buf[end++] = ' ' ;
+
+ // now add the unit prefix
+ buf[end++] = *units;
+
+ // IEC units want to use KiB MiB etc. often we don't
+ if ((flags & (HR_FULL_SI | HR_SI)) == HR_FULL_SI ) buf[end++] = 'i';
+ } else if (!(flags & HR_NO_SPACE)) buf[end++] = ' '; // no units add space before "B"
+
+ buf[end++] = 'B'; // is it always going to be bytes?
buf[end++] = 0;
return end;
}
+
+
diff -Nuwdr toybox.old/lib/lib.h toybox/lib/lib.h
--- toybox.old/lib/lib.h 2015-02-25 18:42:24.000000000 -0800
+++ toybox/lib/lib.h 2015-08-28 17:18:44.608481699 -0700
@@ -171,7 +171,22 @@
void base64_init(char *p);
int terminal_size(unsigned *x, unsigned *y);
int yesno(char *prompt, int def);
-int human_readable(char *buf, unsigned long long num);
+int human_readable(char *buf, uint64_t num, int flags);
+
+// use units of 1000 instead of 1024
+#define HR_SI 1
+
+// stick the i in KiB
+#define HR_FULL_SI 2
+
+// use prefixes as soon as possible even losing precision
+#define HR_SHRINK_RANGE 4
+
+// use 1023M instead of 1.0G let it use extra digits for 1024 based numbers
+#define HR_EXPAND_1023 8
+
+// don't put the space before the units e.g. 10kB not 10 kB
+#define HR_NO_SPACE 16
// net.c
int xsocket(int domain, int type, int protocol);
diff -Nuwdr toybox.old/toys/pending/dd.c toybox/toys/pending/dd.c
--- toybox.old/toys/pending/dd.c 2015-02-25 18:42:24.000000000 -0800
+++ toybox/toys/pending/dd.c 2015-08-28 14:02:15.837959177 -0700
@@ -133,9 +133,9 @@
//out to STDERR
fprintf(stderr,"%llu+%llu records in\n%llu+%llu records out\n", st.in_full, st.in_part,
st.out_full, st.out_part);
- human_readable(toybuf, st.bytes);
+ human_readable(toybuf, st.bytes, 0);
fprintf(stderr, "%llu bytes (%s) copied,",st.bytes, toybuf);
- human_readable(toybuf, st.bytes/seconds);
+ human_readable(toybuf, st.bytes/seconds, 0);
fprintf(stderr, "%f seconds, %s/s\n", seconds, toybuf);
}
diff -Nuwdr toybox.old/toys/pending/human_readable_test.c toybox/toys/pending/human_readable_test.c
--- toybox.old/toys/pending/human_readable_test.c 1969-12-31 16:00:00.000000000 -0800
+++ toybox/toys/pending/human_readable_test.c 2015-08-28 17:46:28.778278657 -0700
@@ -0,0 +1,47 @@
+/* human_readable_test.c - test stub for human_readable
+ *
+ * Copyright 2015 James McMechan
+
+USE_HUMAN_READABLE_TEST(NEWTOY(human_readable_test, "SisEC", TOYFLAG_USR|TOYFLAG_SBIN))
+
+config HUMAN_READABLE_TEST
+ bool "human readable test"
+ default y
+ help
+ usage: human_readable_test [-SisEC] [VALUES]...
+
+ Shows values in human readable form for testing the human_readable function
+
+ -S use SI units 1 000 = k, 1 000 000 = M... instead of IEC units 1024 = K, 1024 * 1024 = M
+ -i display the IEC /i/ in prefix if used
+ -s remove space before prefix units
+ -E expand IEC range 0..1023
+ -C collapse range to only 0..999
+*/
+
+#define FOR_human_readable_test
+#include "toys.h"
+
+void human_readable_test_main(void)
+{
+ int i;
+ int flags = 0;
+
+ if (toys.optflags & FLAG_S) flags |= HR_SI;
+
+ if (toys.optflags & FLAG_i) flags |= HR_FULL_SI;
+
+ if (toys.optflags & FLAG_s) flags |= HR_NO_SPACE;
+
+ if (toys.optflags & FLAG_E) flags |= HR_EXPAND_1023;
+
+ if (toys.optflags & FLAG_C) flags |= HR_SHRINK_RANGE;
+
+ for (i=0; i < toys.optc; i++) {
+ uint64_t num;
+ char buf[64];
+ num = strtoull(toys.optargs[i],0,0);
+ human_readable(buf,num,flags);
+ printf("%s\n",buf);
+ }
+}
diff -Nuwdr toybox.old/toys/posix/du.c toybox/toys/posix/du.c
--- toybox.old/toys/posix/du.c 2015-02-25 18:42:24.000000000 -0800
+++ toybox/toys/posix/du.c 2015-08-28 14:01:00.904083828 -0700
@@ -55,7 +55,7 @@
if (TT.maxdepth && TT.depth > TT.maxdepth) return;
if (toys.optflags & FLAG_h) {
- human_readable(toybuf, size);
+ human_readable(toybuf, size, 0);
printf("%s", toybuf);
} else {
int bits = 10;