[Toybox] md5sum cleanup

Ivo van Poorten ivopvp at gmail.com
Sat Jun 7 04:11:10 PDT 2014


On Sun, 01 Jun 2014 12:00:08 -0500 Rob Landley <rob at landley.net> wrote:
> On 05/15/14 11:50, Ivo van Poorten wrote:
> > On Wed, 14 May 2014 21:56:13 -0700 Daniel Verkamp <daniel at drv.nu>
> > wrote:
> >> Here's a quick cleanup of md5sum. Executive summary: smaller and
> >> faster.
> > 
> > Nice! It got me excited to see if I could get it faster.
> 
> Bigger and faster.
> 
> Sorry for the delay evaluating this, but A) it rolls up the previous
> patch I already applied so I have to separate it out to evaluate it,
> B) #define COMMON isn't going in.
> 
> I _really_ dislike having a macro like that repeat bits of code. I see
> why you did it, but ew. I'm also confused why the do { XXX } while
> (0); wrapper is needed when the only users aren't in if/else blocks?
> The for has their own curly brackets (for code that's all on the same
> line...)

Yeah, the do{}while(0) wrapper is a leftover from previous
experiments. I always prefer to be safe rather than sorry, but at this
point it can be removed.

I could replace the COMMON macro with a static inline function or just
duplicate the code, although I do not really like the latter.
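
For illustration, a minimal sketch of what the wrapper protects against
and how a static inline function sidesteps the question. The real COMMON
macro is not shown in this thread, so STEP_BARE, STEP_WRAPPED, step()
and the three helper functions are hypothetical stand-ins:

```c
/* Hypothetical two-statement macro bodies; not the actual COMMON macro. */
#define STEP_BARE(x)    (x)++; (x) *= 2
#define STEP_WRAPPED(x) do { (x)++; (x) *= 2; } while (0)

/* Same body as a function: always a single statement at the call site. */
static inline void step(unsigned *x)
{
  (*x)++;
  *x *= 2;
}

/* Only the first statement of the bare macro is guarded by the if. */
static unsigned bare_in_if(int cond, unsigned x)
{
  if (cond)
    STEP_BARE(x);
  return x;
}

/* The do/while(0) wrapper makes the whole body one guarded statement. */
static unsigned wrapped_in_if(int cond, unsigned x)
{
  if (cond)
    STEP_WRAPPED(x);
  return x;
}

static unsigned inline_in_if(int cond, unsigned x)
{
  if (cond)
    step(&x);
  return x;
}
```

With cond == 0, bare_in_if() still doubles x, while the other two leave
it alone. When every use sits inside its own braces, as in the patch,
the wrapper makes no difference.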

> I see what you're doing here, and now that it's been pointed out how
> much slower it is than other implementations I agree speeding it up is
> good. (That's why I applied the first patch, even though that table
> repeating each stanza 4 times is just painful. But adding math to the
> dereference to repeat table sections added enough cycles to the inner
> loop to slow it back down to about where it was before...)


> > Times in seconds on my machine (IA32 Sempron 1.66GHz):
> 
> The sempron is AMD, and it's 64 bit. Saying "IA32" for a sempron is
> wrong in two distinct ways.

My Sempron is a Socket A Sempron 2400+ from late 2004 running at
1.66GHz, and it is in fact 32-bit. Linux wrongly detects it as an
Athlon MP 1500+, but Socket A is 32-bit only. Basically AMD renamed
their budget line (Duron) to Sempron just before switching to 64-bit.

> > cur-hg      8.6
> > daniel      6.7
> > ivo         4.9
> > md5sum      2.8
> > openssl     2.7
> 
> The "md5sum" there is your host's existing command line version?

Yes, from GNU coreutils.

> > The tables can be downsized to 64 bytes, but it'll make it a bit
> > slower, especially on architectures where non-aligned reads are
> > slower.
> 
> Um, example?

I guess it's not really relevant anymore. I believe SPARCs in the
'90s had problems with single-byte reads, but I'm not sure. Better to
change the table to 64 bytes until somebody complains.
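
As a rough sketch of the trade-off, assuming the table in question is
MD5's per-step rotate-amount table (the thread does not say which table
is meant): a 64-byte table with each stanza repeated four times, next
to a 16-byte table that needs index math on every lookup. The names
rot64, rot16, rot16_at() and rot_tables_agree() are made up for this
example:

```c
#include <stdint.h>

/* 64 bytes: one entry per step, each 4-entry stanza repeated 4 times. */
static const uint8_t rot64[64] = {
  7, 12, 17, 22,  7, 12, 17, 22,  7, 12, 17, 22,  7, 12, 17, 22,
  5,  9, 14, 20,  5,  9, 14, 20,  5,  9, 14, 20,  5,  9, 14, 20,
  4, 11, 16, 23,  4, 11, 16, 23,  4, 11, 16, 23,  4, 11, 16, 23,
  6, 10, 15, 21,  6, 10, 15, 21,  6, 10, 15, 21,  6, 10, 15, 21
};

/* 16 bytes: one stanza per round, no repetition. */
static const uint8_t rot16[16] = {
  7, 12, 17, 22,  5, 9, 14, 20,  4, 11, 16, 23,  6, 10, 15, 21
};

/* Step i (0..63) belongs to round i/16 and uses slot i%4 of its stanza. */
static inline unsigned rot16_at(unsigned i)
{
  return rot16[(i >> 4) * 4 + (i & 3)];
}

/* Returns 1 if both layouts give the same rotate amount for every step. */
static int rot_tables_agree(void)
{
  unsigned i;

  for (i = 0; i < 64; i++)
    if (rot64[i] != rot16_at(i)) return 0;
  return 1;
}
```

The shift, multiply and mask in rot16_at() are the extra inner-loop
work that the repeated table avoids.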

> Could you get me a patch on top of what's currently there?

Sure. I'll look into it.

Perhaps we could introduce a global config flag like the FFmpeg project
has (CONFIG_SMALL): not tons of tiny tweaks all over the place, but a
single choice between size and speed?

I think the fastest implementation of md5sum without inline assembly is
to completely unroll the loop and inline the tables as constants. It'll
be a lot bigger though.
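
A sketch of both ideas, limited to round 1 of MD5's four rounds to keep
it short, so this is not a complete md5 transform. The CONFIG_SMALL
switch here is a hypothetical toybox flag modelled on FFmpeg's, and
rotl(), round1_loop(), round1_unrolled() and round1_forms_agree() are
names invented for the example:

```c
#include <stdint.h>
#include <string.h>

#ifndef CONFIG_SMALL            /* hypothetical build-wide size/speed flag */
#define CONFIG_SMALL 0
#endif

#define F(b, c, d) (((b) & (c)) | (~(b) & (d)))

static inline uint32_t rotl(uint32_t x, int n)
{
  return (x << n) | (x >> (32 - n));
}

/* Round 1 constants and rotate amounts from the MD5 specification. */
static const uint32_t K1[16] = {
  0xd76aa478, 0xe8c7b756, 0x242070db, 0xc1bdceee,
  0xf57c0faf, 0x4787c62a, 0xa8304613, 0xfd469501,
  0x698098d8, 0x8b44f7af, 0xffff5bb1, 0x895cd7be,
  0x6b901122, 0xfd987193, 0xa679438e, 0x49b40821
};
static const uint8_t S1[4] = {7, 12, 17, 22};

/* Small: table-driven loop that shuffles the state every step. */
static void round1_loop(uint32_t s[4], const uint32_t m[16])
{
  uint32_t a = s[0], b = s[1], c = s[2], d = s[3], t;
  int i;

  for (i = 0; i < 16; i++) {
    t = a + F(b, c, d) + m[i] + K1[i];
    a = d; d = c; c = b;
    b += rotl(t, S1[i & 3]);
  }
  s[0] = a; s[1] = b; s[2] = c; s[3] = d;
}

/* Fast: no loop, no tables, register roles rotate in the source instead. */
#define STEP(a, b, c, d, x, k, n) \
  ((a) = (b) + rotl((a) + F(b, c, d) + (x) + (k), (n)))

static void round1_unrolled(uint32_t s[4], const uint32_t m[16])
{
  uint32_t a = s[0], b = s[1], c = s[2], d = s[3];

  STEP(a, b, c, d, m[0],  0xd76aa478, 7);
  STEP(d, a, b, c, m[1],  0xe8c7b756, 12);
  STEP(c, d, a, b, m[2],  0x242070db, 17);
  STEP(b, c, d, a, m[3],  0xc1bdceee, 22);
  STEP(a, b, c, d, m[4],  0xf57c0faf, 7);
  STEP(d, a, b, c, m[5],  0x4787c62a, 12);
  STEP(c, d, a, b, m[6],  0xa8304613, 17);
  STEP(b, c, d, a, m[7],  0xfd469501, 22);
  STEP(a, b, c, d, m[8],  0x698098d8, 7);
  STEP(d, a, b, c, m[9],  0x8b44f7af, 12);
  STEP(c, d, a, b, m[10], 0xffff5bb1, 17);
  STEP(b, c, d, a, m[11], 0x895cd7be, 22);
  STEP(a, b, c, d, m[12], 0x6b901122, 7);
  STEP(d, a, b, c, m[13], 0xfd987193, 12);
  STEP(c, d, a, b, m[14], 0xa679438e, 17);
  STEP(b, c, d, a, m[15], 0x49b40821, 22);
  s[0] = a; s[1] = b; s[2] = c; s[3] = d;
}

/* One switch picks the variant for the whole build. */
#if CONFIG_SMALL
#define round1 round1_loop
#else
#define round1 round1_unrolled
#endif

/* Returns 1 if both variants produce the same state for a sample block. */
static int round1_forms_agree(void)
{
  uint32_t s1[4] = {0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476};
  uint32_t s2[4], m[16];
  int i;

  memcpy(s2, s1, sizeof(s1));
  for (i = 0; i < 16; i++) m[i] = 0x01010101u * (uint32_t)i;
  round1_loop(s1, m);
  round1_unrolled(s2, m);
  return !memcmp(s1, s2, sizeof(s1));
}
```

The unrolled form trades roughly one source line per step for the loop
counter, the table loads and the state shuffle, which is where the size
increase comes from.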

Regards,
Ivo
