[Toybox] sizeof(toybuf)

Rob Landley rob at landley.net
Thu Apr 20 20:06:17 PDT 2023


On 4/20/23 15:31, enh wrote:
> On Wed, Apr 19, 2023 at 8:44 PM Rob Landley <rob at landley.net
>     > static_assert is a compile-time check. there's currently a dependency in the
>     > checksum code on sizeof(toybox) being < 5k-ish.
> 
>     Really? Checking a compile time constant at runtime doesn't sound like my style.
>     (As in I'd have removed it during cleanup if I'd noticed, unless I _expect_ the
>     test to optimize out and be a de-facto assert.)
> 
> you're the one who forked this from the original discussion on the specific line
> of code in question :-P
> 
> https://github.com/landley/toybox/commit/aa88571a6b847a96bb8ee998a9868c5a1bdb3a6e#r108474092 

Ah. Comment, not actual test. Right.

A static assert isn't going to help in that case because you could feed a
dynamically allocated buffer to adler32() with an arbitrary length.

Mostly the comment was me going "I could have done a for loop here but I haven't
got a user that's gonna CARE about bigger sizes yet". The immediate user (still
not wired up, although zlib output mode would be used by git.c) is deflate.c,
which is passing in something like 258 bytes at a time max. I haven't started on
rsync.c yet (rolling adler32 is the middle of the three hash functions there).

The comment just reminds me not to feed the function a buffer bigger than toybuf
unless I implement the test-and-loop. If I do change anything here it would be
implementing the test and loop, but nothing uses it yet and this still MIGHT
just get inlined in its users if there are only ever gonna be two and one of
them's rolling.

> the whole reason i suggested static_assert is because at the moment all the
> "protection" we have is a comment.

Agreed, but while I've got a window open to do the simple wrapper version of
gzip --rsyncable I don't think that actually needs to call this function at all.
Both the original 1999 post and the gzip code on github web view are just doing
addition until they get 12 bits of zeroes, because adler32 is modifying high
bits that never looks at. I'd have to recheck but I don't think they're even
doing the modulus, so an "if ((x += y)&4095)" loop is gonna be faster than the
"proper" adler32 that's updating bits we don't check. (Technically it probably
should modulus if it goes high enough because that gives us better hashing
characteristics, but we've got to pass 16 opportunities for zeroes before it
even comes up and "do what the others do" would be the right thing here...)

I just updated adler32() because I'd done the research. Still need to do a
rolling one...

>     But I'm unaware of such a check...?
> 
> exactly :-P

I agree the comment probably isn't sufficiently load bearing, but there aren't
any users of the function yet. :)

> to go back to the beginning since this thread got far too confused: "if you
> don't do this because the #if .. #error dance is too annoying/verbose,
> static_assert lets you say this very concisely, and you're already requiring C11
> anyway".

Indeed. But I'm not doing the #if version anywhere yet either. :)

Rob


More information about the Toybox mailing list