[Toybox] [PATCH] count.c: Human readable -h option and MAYFORK

enh enh at google.com
Thu Oct 26 16:31:36 PDT 2023


On Thu, Oct 19, 2023 at 11:01 PM Rob Landley <rob at landley.net> wrote:
>
> On 10/19/23 13:20, enh wrote:
> >> > if you had
> >> > ```
> >> > if (which->flags & TOYFLAG_LINEBUF) setvbuf(stdout, 0, _IOLBF, 0);
> >> > ```
> >> > we'd argue a lot less about buffering :-)
> >>
> >> The existing LINEBUF users are ascii, base64, base32, yes, echo, grep, egrep, fgrep.
> >
> > yeah, i think we're in agreement about line buffering being reasonable for grep.
> >
> > i'm just unconvinced that _no_ buffering is the right default...
>
> I'm unconvinced there IS a right default.

"me too", which is part of why i argue for the status quo. "at least
it's the suck we're used to, rather than new suck."

(but i think https://www.gnu.org/software/coreutils/manual/html_node/stdbuf-invocation.html
is supposed to be the workaround for the use case you usually mention
as justification for no buffering? if you don't like the
libc-specificness of that, you could always add a stdbuf _toy_?)

> I was originally leaving it to libc to provide a default, but A) it was
> inconsistent between different libc's and even release versions, B) upgrades
> broke "less" on more than one occasion, not just for toybox but within distros
> I've used. (As in I've seen Red Hat break this and Ubuntu break this over the
> years, because they trusted libc's buffering to do something predictable.)
>
> I'm going to have to audit all the commands to categorize "produces progressive
> output" and "does not produce progressive output" (which is sadly kind of HARD
> because... wget? I mean... it COULD...) and then subset those into "writes to a
> file descriptor" (which doesn't let it off the hook) and "writes through stdio",
> and for each figure out what the performance looks like vs other implementations
> and whether I can/should easily improve it.
>
> Mostly, I wait for somebody to complain. But since the android build has already
> had more than one performance bottleneck I've already had to tweak toybox for,
> "performance" is bubbling up higher on my todo list. The gestalt of existing
> complaints has accumulated some momentum.
>
> >> Both ascii and echo pretty much want a single big output buffer, they do not
> >> produce progressive output so it might as well all be collated into one big
> >> output buffer and then flushed.
> >>
> >> I just rewrote yes not to use stdio at all, it does writev() to fd 1.
> >
> > ...not least because you end up doing stuff like this.
>
> Because the one _with_ the line buffering was two orders of magnitude slower
> than debian's, and writev() lets me multiply the output-per-syscall without
> allocating an unbounded amount of extra memory for copies.
>
> Right now 30% slower doesn't matter and 50% slower is only if somebody
> _specifically_ complains, but 10x slower? Or more? Harder to ignore.
>
> > i worry that this might cause regressions too (based on the commit
> > message from the change that made yes line-buffered):
>
> Debian's is doing 3g/s output on my laptop which means it's using large
> transaction sizes. Let's strace it:
>
> write(1, "y\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\n"..., 8192) = 8192
> write(1, "y\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\n"..., 8192) = 8192
> write(1, "y\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\n"..., 8192) = 8192
> write(1, "y\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\ny\n"..., 8192) = 8192
>
> Outputting more than one y\n per transaction can't break stuff or the debian one
> would have triggered it.

the "stuff" is a toybox xargs test according to that commit message
... i don't think anyone's running that heavily with debian yes(1).

> > 1.
> >
> > (though i will accept the argument that "that's a toybox xargs test
> > bug", but (a) we'll still need to fix that test
>
> Any idea off the top of your head which of the 31 tests it was? (I can use the
> git commit time to go dig back through the web archive and my blog and such, but
> not right now. I added a TODO note to my tests/xargs.test file to remind me.)

"if it happened more than a week ago, unlikely" :-)

> > and (b) if we're going
> > to go that way, did you measure what just using the trivial yes
> > implementation and default stdio buffering gets you?)
>
> $ cat yas.c
> #include <stdio.h>
>
> int main(int argc, char *argv[])
> {
>   for (;;) printf("y\n");
> }
> $ gcc yas.c
> $ timeout 1 ./a.out | ./count -l >/dev/null
>
> It wrote 99m. The iovec one is writing 2.7G on the same hardware.
>
> Oddly enough your transaction size can be TOO big and slow down a little from
> the peak. I haven't looked at the kernel pipe implementation to see when it
> zerocopies provided buffers and when it copies them to chop them up, mostly
> because that really really really really should be the kernel guys' problem, and
> is almost guaranteed to be different in 5 years, and probably varies by hardware
> anyway depending on how expensive page table fiddling is and whether "can I just
> block the producer by not returning from the write() until the consumer's
> accepted the zerocopy data out of the pipe buffer" is an acceptable strategy or
> what...
>
> I am thus NOT attempting to calculate an optimal iovec[] length based on the
> provided command line string data, I'm just saying I was aware of the option.
> (But that way lies madness and gnu.)
>
> Rob
>
> P.S. Count's X/s time output is actually X/period and the period is 250ms, but
> it's also zeroing new periods and advancing into the zeroed period immediately
> in the calculation as a full period when we've only been there for less than a
> full 250ms, and what it SHOULD do is look at the remainder of millitime()%250
> and use that to add a fraction of the oldest period (not otherwise included, and
> not included as a period in the divisor) so "however much data the new one
> accumulated in less than a full period of time" plus "the reciprocal portion of
> the old period we left" should equal a coherent count.
>
> P.P.S. Yes, I remember when this was 'while (0<(len = read(0, buf,
> sizeof(buf)))) { write(1, buf, len); dprintf(2, "%llu\n", total += len); }' or
> possibly yes.c I remember...


More information about the Toybox mailing list