[Toybox] md5sum cleanup
Daniel Verkamp
daniel at drv.nu
Wed May 14 21:56:13 PDT 2014
Here's a quick cleanup of md5sum. Executive summary: smaller and faster.
On my machine, for a 2.2 GB file of random bytes, the timings with
warm cache are:
  toybox before:       11.4 seconds
  toybox after:         8.3 seconds
  GNU md5sum:           3.9 seconds
  openssl dgst -md5:    3.5 seconds
This is clearly better than before (about 3.3x openssl), but still slow
(about 2.4x openssl).
I suspect there is more low-hanging fruit to be had by eliminating the
memcpy in hash_update, which would also help sha1sum. The gain may be
modest, though: according to perf, hash_update accounts for about 4% of
total runtime versus 92% for md5_transform.
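To make the memcpy idea concrete, here is a rough sketch of an update routine that copies only partial blocks and hashes whole blocks straight out of the caller's buffer. Everything in it is an assumption for illustration: struct hash_ctx, its fields, and the stand-in transform() are hypothetical and do not reflect toybox's actual hash_update or its data layout.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical context for illustration; not toybox's real layout. */
struct hash_ctx {
  uint8_t buf[64];   /* holds a partial block between calls */
  size_t used;       /* number of bytes currently held in buf */
  uint64_t blocks;   /* stand-in state: count of blocks consumed */
  uint32_t sum;      /* stand-in state: running byte sum */
};

/* Stand-in for md5_transform/sha1_transform: consumes one 64-byte block. */
static void transform(struct hash_ctx *c, const uint8_t *block)
{
  int i;

  for (i = 0; i < 64; i++) c->sum += block[i];
  c->blocks++;
}

static void hash_update(struct hash_ctx *c, const uint8_t *data, size_t len)
{
  /* Top up a pending partial block first; only this part needs a copy. */
  if (c->used) {
    size_t n = 64 - c->used;

    if (n > len) n = len;
    memcpy(c->buf + c->used, data, n);
    c->used += n;
    data += n;
    len -= n;
    if (c->used < 64) return;
    transform(c, c->buf);
    c->used = 0;
  }

  /* Whole blocks are consumed in place, with no intermediate copy. */
  for (; len >= 64; data += 64, len -= 64) transform(c, data);

  /* Stash the tail (fewer than 64 bytes) for the next call. */
  memcpy(c->buf, data, len);
  c->used = len;
}
```

A real transform reads the block as 32-bit words, so hashing in place would also have to deal with alignment and byte order of the caller's buffer; the sketch sidesteps that by working on bytes.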
make bloatcheck on x86_64 gcc 4.8.2 -Os:

name                                      old     new   delta
-----------------------------------------------------------------------
md5rot                                      0      64      64
md5_transform                             365     223    -142
-----------------------------------------------------------------------
                                                          -78 total
Rationale for the changes:
Move definition of 'rol' up so it can be used in md5_transform. This
is purely cosmetic; it expands to exactly the same code.
Put rotation counts in a lookup table instead of calculating them on
the fly. This is mostly a wash size-wise, +5 bytes total, but
worthwhile for readability and speed.
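As an illustration of the table-versus-arithmetic trade-off, here is a minimal sketch. The 64 rotation counts are the standard MD5 values from RFC 1321; the one-byte-per-step layout is an assumption chosen to match the 64-byte md5rot entry in the bloatcheck output, and rot_on_the_fly() is a hypothetical example of computing the same values arithmetically, not necessarily what the old code did.

```c
#include <stdint.h>

/* Per-step left-rotate counts for MD5 (RFC 1321), one byte per step. */
static const uint8_t md5rot[64] = {
  7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22,
  5,  9, 14, 20, 5,  9, 14, 20, 5,  9, 14, 20, 5,  9, 14, 20,
  4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23,
  6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21
};

/* Hypothetical arithmetic alternative: derive the count for step i
 * from the round number (i / 16) and the position within the group
 * of four (i & 3). Compact in data, but harder to read and it costs
 * extra work on every step. */
static unsigned rot_on_the_fly(int i)
{
  unsigned r = i & 3;

  if (i < 16) return 7 + 5*r;                              /* 7 12 17 22 */
  if (i < 32) return 5*(r + 1) - (((r + 1) & 2) ? 1 : 0);  /* 5  9 14 20 */
  if (i < 48) return 4 + 5*r + ((r + 1) & 6);              /* 4 11 16 23 */
  r++;
  return 5*r + (((r + 2) & 2) >> 1);                       /* 6 10 15 21 */
}
```

With the table, the loop body needs only md5rot[i], and the values can be checked against the RFC at a glance.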
Instead of accessing the state array using a rotating index (the
variable formerly known as 'a'), access the state with constant
offsets and rotate the contents of the array instead. This is the big
win - it eliminates all the crazy memory addressing math inside the
loop.
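The difference between the two addressing schemes can be sketched as follows. This is not the code from the attached patch: the function names, the mix() helper, and the rot/K table names are made up for illustration. The constants and round functions are the standard MD5 ones from RFC 1321, and w is assumed to be the 64-byte block already loaded as sixteen little-endian 32-bit words.

```c
#include <stdint.h>
#include <string.h>

#define rol(v, n) (((v) << (n)) | ((v) >> (32 - (n))))

static const uint8_t rot[64] = {
  7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22, 7, 12, 17, 22,
  5,  9, 14, 20, 5,  9, 14, 20, 5,  9, 14, 20, 5,  9, 14, 20,
  4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23, 4, 11, 16, 23,
  6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21, 6, 10, 15, 21
};

static const uint32_t K[64] = {
  0xd76aa478, 0xe8c7b756, 0x242070db, 0xc1bdceee,
  0xf57c0faf, 0x4787c62a, 0xa8304613, 0xfd469501,
  0x698098d8, 0x8b44f7af, 0xffff5bb1, 0x895cd7be,
  0x6b901122, 0xfd987193, 0xa679438e, 0x49b40821,
  0xf61e2562, 0xc040b340, 0x265e5a51, 0xe9b6c7aa,
  0xd62f105d, 0x02441453, 0xd8a1e681, 0xe7d3fbc8,
  0x21e1cde6, 0xc33707d6, 0xf4d50d87, 0x455a14ed,
  0xa9e3e905, 0xfcefa3f8, 0x676f02d9, 0x8d2a4c8a,
  0xfffa3942, 0x8771f681, 0x6d9d6122, 0xfde5380c,
  0xa4beea44, 0x4bdecfa9, 0xf6bb4b60, 0xbebfbc70,
  0x289b7ec6, 0xeaa127fa, 0xd4ef3085, 0x04881d05,
  0xd9d4d039, 0xe6db99e5, 0x1fa27cf8, 0xc4ac5665,
  0xf4292244, 0x432aff97, 0xab9423a7, 0xfc93a039,
  0x655b59c3, 0x8f0ccc92, 0xffeff47d, 0x85845dd1,
  0x6fa87e4f, 0xfe2ce6e0, 0xa3014314, 0x4e0811a1,
  0xf7537e82, 0xbd3af235, 0x2ad7d2bb, 0xeb86d391
};

/* Round function for step i; also reports which message word to use. */
static uint32_t mix(int i, uint32_t b, uint32_t c, uint32_t d, int *in)
{
  if (i < 16) { *in = i;              return (b & c) | (~b & d); }
  if (i < 32) { *in = (5*i + 1) & 15; return (d & b) | (~d & c); }
  if (i < 48) { *in = (3*i + 5) & 15; return b ^ c ^ d; }
  *in = (7*i) & 15;
  return c ^ (b | ~d);
}

/* Rotating index: values stay put, the index 'a' walks 0,3,2,1,0,...
 * Every access needs an add and a mask to find the right slot. */
static void transform_rotating_index(uint32_t state[4], const uint32_t w[16])
{
  uint32_t x[4], t;
  int i, in;

  memcpy(x, state, sizeof(x));
  for (i = 0; i < 64; i++) {
    unsigned a = (-i) & 3;

    t = mix(i, x[(a+1)&3], x[(a+2)&3], x[(a+3)&3], &in);
    t += x[a] + w[in] + K[i];
    x[a] = x[(a+1)&3] + rol(t, rot[i]);
  }
  for (i = 0; i < 4; i++) state[i] += x[i];
}

/* Rotating contents: offsets are constant, the values shift one slot
 * per step, so a compiler can keep all four in registers. */
static void transform_rotating_contents(uint32_t state[4], const uint32_t w[16])
{
  uint32_t x[4], t;
  int i, in;

  memcpy(x, state, sizeof(x));
  for (i = 0; i < 64; i++) {
    t = mix(i, x[1], x[2], x[3], &in);
    t = x[1] + rol(x[0] + t + w[in] + K[i], rot[i]);
    x[0] = x[3];
    x[3] = x[2];
    x[2] = x[1];
    x[1] = t;
  }
  for (i = 0; i < 4; i++) state[i] += x[i];
}
```

Both functions compute the same result for any input block; the second simply has no index arithmetic left in the loop body.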
Thanks,
-- Daniel
-------------- next part --------------
A non-text attachment was scrubbed...
Name: md5sum-cleanup.patch
Type: application/octet-stream
Size: 2682 bytes
Desc: not available
URL: <http://lists.landley.net/pipermail/toybox-landley.net/attachments/20140514/c0abd296/attachment.obj>