[Toybox] md5sum cleanup

Sun Jun 8 10:19:12 PDT 2014

On 06/07/14 06:11, Ivo van Poorten wrote:
> I could replace the COMMON macro by a static inline function or just
> duplicate the code, although do not really like the latter.

Worth benchmarking, at least.
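
For anyone wanting to try that comparison, here is a minimal sketch of
the two forms for a single MD5 step. The names MD5_STEP_MACRO and
md5_step are hypothetical, and this is not the COMMON macro from the
patch under discussion, just the generic step from RFC 1321 written
both ways so the generated code can be compared.

```c
#include <stdint.h>

/* Hypothetical macro form of one MD5 step (NOT the patch's COMMON macro).
 * Updates "a" in place: a = b + rotl32(a + f + k + m, s), with 0 < s < 32. */
#define MD5_STEP_MACRO(a, b, f, k, m, s) do { \
  uint32_t t_ = (a) + (f) + (k) + (m); \
  (a) = (b) + ((t_ << (s)) | (t_ >> (32 - (s)))); \
} while (0)

/* The same step as a static inline function. Returns the new value of "a"
 * instead of modifying an argument, so each argument is evaluated once. */
static inline uint32_t md5_step(uint32_t a, uint32_t b, uint32_t f,
                                uint32_t k, uint32_t m, unsigned s)
{
  uint32_t t = a + f + k + m;

  return b + ((t << s) | (t >> (32 - s)));
}
```

With optimization on, a compiler will usually inline the function, so
the two may well produce the same code. That is the thing to measure
rather than assume.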

>> The sempron is AMD, and it's 64 bit. Saying "IA32" for a sempron is
>> wrong in two distinct ways.
> 
> My Sempron is a Socket-A Sempron 2400+ running at 1.66MHz from late 2004
> and is in fact 32 bit. Linux wrongly detects it as an Athlon MP 1500+
> though, but Socket-A is 32-bit only. Basically they renamed their
> budget line (Duron) to Sempron, just before switching to 64-bit.

My point was that nobody says IA32 except Intel. To everybody else it's
x86, which AMD has been making compatible chips for since something
like 1982, and which VIA, WinChip, IBM's "Blue Lightning", and others
wandered through along the way.

>>> The tables can be downsized to 64 bytes, but it'll make it a bit
>>> slower, especially on architectures where non-aligned reads are
>>> slower.
>>
>> Um, example?
> 
> I guess it's not really relevant anymore. I believe Sparc's in the
> 90's had problems with single-byte reads, but I'm not sure. Better
> change the table to 64 bytes until somebody complains.

I remember ARM in 2005 having trouble with single-byte reads (gcc
producing mask-and-shift code for a loop indexed on a byte), but
checking an ARM Architecture Reference Manual, it says LDRB was there
in all architecture versions, so that must have been gcc screwing up.
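
To illustrate the size difference being discussed, here is a sketch
that assumes the table in question holds small values such as MD5's
per-step rotate amounts (the values are from RFC 1321). The names
md5_shift8 and md5_rotate are hypothetical, not taken from the patch.
Stored one byte per entry the table is 64 bytes. The same 64 entries
as 32-bit words would take 256.

```c
#include <stdint.h>

/* MD5 left-rotate amounts per step (RFC 1321), one byte each: 64 bytes.
 * Declaring this as uint32_t[64] instead would occupy 256 bytes. */
static const uint8_t md5_shift8[64] = {
  7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22,
  5,  9, 14, 20, 5,  9, 14, 20, 5,  9, 14, 20, 5,  9, 14, 20,
  4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23,
  6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21
};

/* Look up the rotate amount for a step. This is a single-byte load
 * (LDRB on ARM); the mask keeps the index inside the table. */
static inline unsigned md5_rotate(unsigned step)
{
  return md5_shift8[step & 63];
}
```

Whether the byte loads cost anything measurable depends on the target
and compiler, which is another thing to benchmark rather than guess.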

>> Could you get me a patch on top of what's currently there?
> 
> Sure. I'll look into it.
> 
> Perhaps we could introduce a global config flag like the FFmpeg project
> has (CONFIG_SMALL). Not tons of tiny tweaks all over the place, but
> choose either size or speed?

No. https://lwn.net/Articles/597626/

Toybox's main idea is to be simple. We have one implementation of each
command, with one codepath to audit for security problems.

> I think the fastest implementation of md5sum without inline assembly is
> to completely unroll the loop and inline the tables as constants. It'll
> be a lot bigger though.

Trading off size for speed is doable. Trading off conceptual complexity
for speed, I'm not happy about.

Rob
