[Toybox] [PATCH] seq: fast path for integers.

Tue Dec 15 14:05:35 PST 2020

On Sat, Dec 12, 2020 at 3:56 AM Rob Landley <rob at landley.net> wrote:

> On 12/11/20 1:26 PM, enh via Toybox wrote:
> > The question "why do we have setlinebuf(stdout)?" came up on the list
> > recently, and by strange coincidence here it is causing more trouble:
> > performance rather than usability this time. Just removing the line
> > buffering alone results in a significant speedup.
>
> I have a pending change to NOT line buffer by default, and add a
> TOYFLAG_LINEBUF
> to request that in a command's toyflags. I haven't pulled the trigger on it
> because I don't want to screw up AOSP performance without warning you, but
> it
> seems the right thing to do?
>
> If your patch is yanking setlinebuf() globally, then my patch seems the
> right
> way to do that. :)
>

one of these days someone will team up with a bean counter and actually
chase down where the time is being spent, but the only person i know who
ever looked at that has moved on to something else, so it's mainly
pathological cases that get noticed. "why does my build take a minute
longer?" type stuff.

> > The switch to use %lld rather than %f in obvious cases is another major
> > speedup.
>
> I had not realized this command was performance critical. (If so, it seems
> like
> BLOCK buffering is what we want, and possibly the easy thing to do there
> is use
> toybuf as the output buffer...)
>
> > Calling strlen() once on TT.s and using fwrite() rather than using
> > printf() every time wouldn't be noticeable if the other two issues
> > weren't fixed, but is a decent chunk of the remaining time at this point.
>
> Lemme take a look...
>
> $ time toybox seq 1 10000000 > /dev/null
> real    0m7.484s
> user    0m5.008s
> sys     0m2.420s
> $ time seq 1 10000000 > /dev/null
> real    0m0.139s
> user    0m0.124s
> sys     0m0.012s
>
> Yeah, that's significant. But your patch breaks a test:
>
> FAIL: seq padding
> echo -ne '' | seq -s, -w -2 19 120
> --- expected    2020-12-11 23:22:18.577933444 -0600
> +++ actual      2020-12-11 23:22:18.577933444 -0600
> @@ -1 +1 @@
> --02,017,036,055,074,093,112
> +-2,17,36,55,74,93,112
>
> > +  // Can we avoid %f and just use printf("%lld") for a 10x speedup?
> > +  incrementl = increment;
> > +  firstl = first;
> > +  lastl = last;
> > +  easy = incrementl==increment && firstl==first && lastl==last &&
> !FLAG(f) &&
> > +    !TT.precision;
>
> How does this differ from just !FLAG(f) && !TT.precision? What do those
> assignments and comparisons do? (A range check to see if it fits in "int"?
> Shouldn't 64 bit processors be just as fast with "long"?) Maybe...
>

yes, this was a range check to see if it fits in a `long long` (rather than
`int`).

> $ toybox seq 1.e3 | wc -l
> 1000
>
> No, that's still  integer. And debian seems to have changed the "seq 1.0e3"
> upstream behavior, and seq 0 1.111111e3 10000" too. Grrr, moving target.
> (No,
> posix never mentions ANY of this, why would it?)
>
> > GNU seq is still *another* 10x faster on my desktop, and I did experiment
> > with inlining the trivial `n /= 10` integer-to-ascii algorithm (to at
> > least remove some of the remaining printf() overhead) but the savings
> from
> > that were quite small at this point and didn't seem worth the extra
> > code. And while being 100x worse was a human-noticeable thing, being
> > 10x worse than "basically instant even on very large inputs" doesn't
> > seem likely to be something anyone will notice (and we can worry about
> > that if/when they do).
>
> So the main bug here is that sprintf("%f") is pathologically slow? Except
> now
> with your patch applied it's taking even longer for me? Ah, I see: I did a
> different change to main() and you were depending on output to a tty
> defaulting
> to block buffered when we didn't specify, and mine is doing "line buffer or
> block buffer" depending on what the command asked for. Sigh. I should
> buffer
> into toysh myself...
>

(or you should stop trying to work against the grain and just let stdio do
its job!)

> Ok, making my main() look like yours, and applying this patch, and
> ignoring the
> test it breaks:
>
> $ time ./seq 10000000 >/dev/null
>
> real    0m3.442s
> user    0m3.420s
> sys     0m0.008s
>
> That's still kind of sad. Hmmm. Poke poke... you know, the gnu version of
> this
> is really BAD at this:
>
> $ seq -w 0 .3 10
> 00.0
> 00.3
> 00.6
> ...
> 09.3
> 09.6
> 09.9
>
> Right. Wrote a new integer path and made both do block output via toybuf.
> The
> tests pass, including your new one.
>
> Try now?
>

the accidental write() instead of xwrite() broke my build, but other than
that it seems to work. (patch sent.)

this:

  if (i<0) {
    *s++ = '-';
    i = -i;
  }

is wrong for INT_MIN, but we get away with it because we don't take the
fast path for `./toybox seq -2147483646 -1 -2147483648`.

that seems to be because of the `if (ii == 2) dd += increment;` line ---
because although INT_MIN is representable as a double and an int, INT_MIN-1
isn't.

(patch sent.)

Rob
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.landley.net/pipermail/toybox-landley.net/attachments/20201215/c75821a0/attachment-0001.htm>