<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, Dec 12, 2020 at 3:56 AM Rob Landley <<a href="mailto:rob@landley.net">rob@landley.net</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On 12/11/20 1:26 PM, enh via Toybox wrote:<br>
> The question "why do we have setlinebuf(stdout)?" came up on the list<br>
> recently, and by strange coincidence here it is causing more trouble:<br>
> performance rather than usability this time. Just removing the line<br>
> buffering alone results in a significant speedup.<br>
<br>
I have a pending change to NOT line buffer by default, and add a TOYFLAG_LINEBUF<br>
to request that in a command's toyflags. I haven't pulled the trigger on it<br>
because I don't want to screw up AOSP performance without warning you, but it<br>
seems the right thing to do?<br>
<br>
If your patch is yanking setlinebuf() globally, then my patch seems the right<br>
way to do that. :)<br></blockquote><div><br></div><div>one of these days someone will team up with a bean counter and actually chase down where the time is being spent, but the only person i know who ever looked at that has moved on to something else, so it's mainly pathological cases that get noticed. "why does my build take a minute longer?" type stuff.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
> The switch to use %lld rather than %f in obvious cases is another major<br>
> speedup.<br>
<br>
I had not realized this command was performance critical. (If so, it seems like<br>
BLOCK buffering is what we want, and possibly the easy thing to do there is use<br>
toybuf as the output buffer...)<br>
<br>
> Calling strlen() once on TT.s and using fwrite() rather than using<br>
> printf() every time wouldn't be noticeable if the other two issues<br>
> weren't fixed, but is a decent chunk of the remaining time at this point.<br>
<br>
Lemme take a look...<br>
<br>
$ time toybox seq 1 10000000 > /dev/null<br>
real 0m7.484s<br>
user 0m5.008s<br>
sys 0m2.420s<br>
$ time seq 1 10000000 > /dev/null<br>
real 0m0.139s<br>
user 0m0.124s<br>
sys 0m0.012s<br>
<br>
Yeah, that's significant. But your patch breaks a test:<br>
<br>
FAIL: seq padding<br>
echo -ne '' | seq -s, -w -2 19 120<br>
--- expected 2020-12-11 23:22:18.577933444 -0600<br>
+++ actual 2020-12-11 23:22:18.577933444 -0600<br>
@@ -1 +1 @@<br>
--02,017,036,055,074,093,112<br>
+-2,17,36,55,74,93,112<br>
<br>
> + // Can we avoid %f and just use printf("%lld") for a 10x speedup?<br>
> + incrementl = increment;<br>
> + firstl = first;<br>
> + lastl = last;<br>
> + easy = incrementl==increment && firstl==first && lastl==last && !FLAG(f) &&<br>
> + !TT.precision;<br>
<br>
How does this differ from just !FLAG(f) && !TT.precision? What do those<br>
assignments and comparisons do? (A range check to see if it fits in "int"?<br>
Shouldn't 64 bit processors be just as fast with "long"?) Maybe...<br></blockquote><div><br></div><div>yes, this was a range check to see if it fits in a `long long` (rather than `int`).</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
$ toybox seq 1.e3 | wc -l<br>
1000<br>
<br>
No, that's still integer. And debian seems to have changed the "seq 1.0e3"<br>
upstream behavior, and seq 0 1.111111e3 10000" too. Grrr, moving target. (No,<br>
posix never mentions ANY of this, why would it?)<br>
<br>
> GNU seq is still *another* 10x faster on my desktop, and I did experiment<br>
> with inlining the trivial `n /= 10` integer-to-ascii algorithm (to at<br>
> least remove some of the remaining printf() overhead) but the savings from<br>
> that were quite small at this point and didn't seem worth the extra<br>
> code. And while being 100x worse was a human-noticeable thing, being<br>
> 10x worse than "basically instant even on very large inputs" doesn't<br>
> seem likely to be something anyone will notice (and we can worry about<br>
> that if/when they do).<br>
<br>
So the main bug here is that sprintf("%f") is pathologically slow? Except now<br>
with your patch applied it's taking even longer for me? Ah, I see: I did a<br>
different change to main() and you were depending on output to a tty defaulting<br>
to block buffered when we didn't specify, and mine is doing "line buffer or<br>
block buffer" depending on what the command asked for. Sigh. I should buffer<br>
into toysh myself...<br></blockquote><div><br></div><div>(or you should stop trying to work against the grain and just let stdio do its job!)</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Ok, making my main() look like yours, and applying this patch, and ignoring the<br>
test it breaks:<br>
<br>
$ time ./seq 10000000 >/dev/null<br>
<br>
real 0m3.442s<br>
user 0m3.420s<br>
sys 0m0.008s<br>
<br>
That's still kind of sad. Hmmm. Poke poke... you know, the gnu version of this<br>
is really BAD at this:<br>
<br>
$ seq -w 0 .3 10<br>
00.0<br>
00.3<br>
00.6<br>
...<br>
09.3<br>
09.6<br>
09.9<br>
<br>
Right. Wrote a new integer path and made both do block output via toybuf. The<br>
tests pass, including your new one.<br>
<br>
Try now?<br></blockquote><div><br></div><div>the accidental write() instead of xwrite() broke my build, but other than that it seems to work. (patch sent.)</div><div><br></div><div>this:</div><div><br></div><div> if (i<0) {<br> *s++ = '-';<br> i = -i;<br> }<br></div><div> </div><div>is wrong for INT_MIN, but we get away with it because we don't take the fast path for `./toybox seq -2147483646 -1 -2147483648`.</div><div><br></div><div>that seems to be because of the `if (ii == 2) dd += increment;` line --- because although INT_MIN is representable as a double and an int, INT_MIN-1 isn't.</div><div><br></div><div>(patch sent.)</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Rob<br>
</blockquote></div></div>