[Toybox] patch: add built-in versions of sha-2 family hash functions

Rob Landley rob at landley.net
Wed Jun 2 07:15:44 PDT 2021


On 5/31/21 7:17 PM, Dan Brown wrote:
> Hello- here is a patch that provides these hash functions using built-in
> routines instead of relying on the OpenSSL library.

Merged, tightened up by a couple hundred lines, and debugged for big endian.

Very nice. Thank you.

> Tests. Do I need to add some? Please point me to an example.

I genericized the sha1sum tests and symlinked them.

> How to deal with endianness? How should I test the behavior of a big-endian
> system when developing on a little-endian system?

I used the musl-cross-make powerpc-*-cross toolchain and ran it under qemu-ppc.
(*shrug* It was there.) You were doing a couple of extra swaps; it worked fine
with those removed.
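
As a rough illustration of why the extra swaps were unnecessary: SHA-2 reads its input as big-endian words, and assembling each word from bytes with shifts gives the same value on any host, so no host-dependent swap is needed afterwards. This is a minimal sketch, not the merged toybox code; the names load_be32 and load_be64 are made up for the example.

```c
// Hypothetical helpers, not taken from toybox.

// Read a 32-bit big-endian word from a byte buffer. Shifts operate on
// values rather than on memory layout, so the result is the same on
// little-endian and big-endian hosts.
unsigned load_be32(const unsigned char *p)
{
  return ((unsigned)p[0]<<24) | ((unsigned)p[1]<<16)
    | ((unsigned)p[2]<<8) | p[3];
}

// Same idea for the 64-bit words used by SHA-384/SHA-512.
unsigned long long load_be64(const unsigned char *p)
{
  return ((unsigned long long)load_be32(p)<<32) | load_be32(p+4);
}
```

Code that instead casts the buffer to an integer pointer and then swaps has to know the host byte order, which is where a swap too many tends to creep in.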

> I changed the data types to be explicitly sized (eg. uint32_t instead of int).
> Or is this already taken care of as part of the portability functions?

I converted those to unsigned and unsigned long long because toybox uses LP64:

  https://landley.net/toybox/design.html#bits

(Code style thing as much as anything.)

> SHA512 (and SHA384) are a bit tricky. I couldn't figure out how to calculate the
> constants to the full 64-bit precision needed. Let me know if there are data
> types >64bit

There are, but it's probably not worth it.

In theory, long double. In practice, it's not guaranteed to be more than 64 bits.
(sizeof(long double) is 16 bytes on x86-64, 12 bytes on i686, and 8 bytes on
powerpc and sh4, at least in the musl-cross-make toolchains.)
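
Note that sizeof only gives the storage size. The usable precision is the mantissa width, which <float.h> exposes as LDBL_MANT_DIG. The sketch below assumes a binary floating point format; the function names are invented for the example. A SHA-512 constant is the first 64 fractional bits of the cube root of a prime up to 409, and those roots have integer parts up to 7 (3 bits), so about 67 mantissa bits are needed before rounding error is even considered.

```c
#include <float.h>

// Hypothetical helpers for illustration.

// Mantissa width of long double in bits, assuming FLT_RADIX == 2.
int long_double_mantissa_bits(void)
{
  return LDBL_MANT_DIG;
}

// 64 fractional bits plus up to 3 integer bits (cbrt(409) is about 7.4)
// must all fit in the mantissa.
int long_double_fits_sha512_constants(void)
{
  return FLT_RADIX == 2 && LDBL_MANT_DIG >= 67;
}
```

On x86-64 Linux the 16 byte long double is typically the x87 extended format with a 64 bit mantissa, so it would fall short by this measure despite its size.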


> and corresponding math functions (cube root and floor) that are
> possible.

You can do the FOIL math yourself on two 64 bit numbers (although that's an
integer trick, so fixed point), but when you start bringing in roots and trig
it's suddenly a whole lot of work, ala:

  https://www.wikihow.com/Calculate-Cube-Root-by-Hand
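
For reference, the FOIL step on two 64 bit numbers might look like the following sketch: split each operand into 32 bit halves, form the four partial products (First, Outer, Inner, Last), and add them up with carries. The name mul64x64 is made up for the example, and it assumes unsigned long long is exactly 64 bits.

```c
// Hypothetical helper: 64x64 -> 128 bit multiply using only 64 bit math.
// The result is returned as two halves, *hi and *lo.
void mul64x64(unsigned long long a, unsigned long long b,
  unsigned long long *hi, unsigned long long *lo)
{
  unsigned long long a_hi = a>>32, a_lo = a&0xffffffff,
    b_hi = b>>32, b_lo = b&0xffffffff,
    first = a_hi*b_hi, outer = a_hi*b_lo,
    inner = a_lo*b_hi, last = a_lo*b_lo, mid;

  // Middle column: three terms each below 2^32, so the sum cannot overflow.
  mid = (last>>32) + (outer&0xffffffff) + (inner&0xffffffff);

  *lo = (mid<<32) | (last&0xffffffff);
  *hi = first + (outer>>32) + (inner>>32) + (mid>>32);
}
```

That covers multiplication only; a cube root on top of it still needs an iteration (long-hand digits, bisection, or Newton) over multi-word values.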

And the table is probably smaller anyway. (There's a whole bc implementation in
pending that I need to clear a month to properly analyze, sometime _after_
finishing toysh. And maybe after doing an awk too.)
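
To give a feel for the size tradeoff, here is a sketch of the integer-only approach for the easier 32 bit case. The SHA-256 round constants are specified as the first 32 fractional bits of the cube roots of the first 64 primes, and those can be found by searching for the largest x with x^3 <= n * 2^96. The function name cbrt_frac32 is invented for the example, it assumes a 32 bit unsigned, and it leans on unsigned __int128, a GCC/Clang extension that is generally only available on 64 bit targets.

```c
// Hypothetical helper: first 32 fractional bits of cbrt(n), integers only.
// Valid for n < 512, so that cbrt(n) < 8 and x stays below 2^35.
unsigned cbrt_frac32(unsigned n)
{
  unsigned __int128 target = (unsigned __int128)n<<96, cube;
  unsigned long long x = 0, bit, guess;

  // Build x = floor(cbrt(n) * 2^32) one bit at a time, top bit first,
  // keeping each bit only if the cube still fits under the target.
  for (bit = 1ULL<<34; bit; bit >>= 1) {
    guess = x|bit;
    cube = (unsigned __int128)guess*guess*guess;
    if (cube <= target) x = guess;
  }

  // Truncating to 32 bits drops the integer part of the root.
  return (unsigned)x;
}
```

For example, cbrt_frac32(2) gives 0x428a2f98, the first SHA-256 constant. The 64 bit version needs x up to about 2^67 and cubes up to about 2^201, so every multiply and compare becomes multi-word arithmetic, which is how the generator ends up larger than an 80 entry table.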

Right now I'm pondering whether, if you build "sha256sum" standalone, the
compiler's constant propagation, inlining, and dead code analysis can drop out
the 64 bit version's code. Probably not (it's keeping the lower half of the
constant table at least), but given that a dynamically linked sha256sum on
x86-64 is 19.6k I'm not really bothered by it. (If I need to fit an sha2
algorithm in a boot rom, I can revisit that...)

Hmmm...

  $ nm --size-sort generated/unstripped/sha256sum
  000000000000017b t sha1_transform
  0000000000000181 t md5_transform
  0000000000000197 t sha2_32_transform
  00000000000001c1 t sha2_64_transform
  0000000000000280 d sha512nofloat

No, it couldn't, but I'm ok with that for now.

Good work. Thanks again,

Rob

