<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Apr 11, 2023 at 12:10 PM Rob Landley <<a href="mailto:rob@landley.net">rob@landley.net</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Rather than bury this in an obscure place on github and never be able to find it<br>

again, in reply to:<br>

<br>

<a href="https://github.com/landley/toybox/commit/aa88571a6b847a96bb8ee998a9868c5a1bdb3a6e#r108474092" rel="noreferrer" target="_blank">https://github.com/landley/toybox/commit/aa88571a6b847a96bb8ee998a9868c5a1bdb3a6e#r108474092</a><br>

<br>

> do you want a static_assert somewhere that toybuf is 4096 bytes? since that's<br>

> not necessarily the page size for arm64, say.<br>

> <br>

> (unrelated, i've been meaning to ask whether we should make toybuf larger<br>

> anyway. 4KiB is really small for modern hardware, though at the same time<br>

> it does make it more likely that we test all the "toybuf too small, loop"<br>

> cases even with small test inputs...)<br>

<br>

A) not a fan of asserts.<br></blockquote><div><br></div><div>i don't like assert(), but <b>static_assert</b> is really useful for things like this where you want to say "this code makes an assumption that you can test at compile time".</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

B) it was only ever coincidentally page size, and huge pages are a thing even on<br>

x86.<br></blockquote><div><br></div><div>well, huge pages are different from non-4KiB non-huge pages. i think it's only arm64 where you're at all likely to actually have a base page size that isn't 4KiB. (apple's arm64 macs and iphones, for example. i _think_ all the linux distros that tried to move to a larger page size gave up?)</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

I never annotated toybuf or libbuf with any sort of alignment directive or tried<br>

to make it come first in its segment (toybuf and libbuf are the fifth and sixth<br>

globals defined in main.c), so they're both reasonably likely to straddle page<br>

boundaries anyway. Heck, I'm not even sure it's cache line aligned. The actual<br>

_guarantee_ is something like 4 bytes, except when it suddenly isn't. I fought<br>

with this in 2021 trying to get a simple "hello world" kernel out of gcc without<br>

needing a linker script: <a href="https://landley.net/notes-2021.html#12-04-2021" rel="noreferrer" target="_blank">https://landley.net/notes-2021.html#12-04-2021</a></blockquote><div><br></div><div>now that you're on C11, you can easily specify the alignment you want with _Alignas: <a href="https://en.cppreference.com/w/c/language/_Alignas">https://en.cppreference.com/w/c/language/_Alignas</a><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>

The 4096 is just a convenient scratch pad size. I use sizeof(toybuf) in a bunch<br>

of places... and hardwire in the knowledge of its size in a bunch of others.<br>

Plus there's a bunch of implicit "toybuf and/or this slice of it is big enough<br>

to stick this struct in, so I can safely typecast the pointer" instances I<br>

checked at the time (and all of them had a big fudge factor in case of future<br>

glibc bloat).<br>

<br>

It's really a "convenient granularity" thing. Copy loops doing byte-at-a-time<br>

stuff is known terrible because the library and syscall execution paths come to<br>

dominate, and grouping it into 4k blocks is 12 doublings of efficiency right<br>

there. Going to 64k is 1/16th as many syscalls, which is not as big a deal as<br>

1/4000th as many syscalls. And then raises the question "why not a megabyte<br>

then" which is something you don't just casually want to do on embedded devices<br>

without thinking about it (might as well malloc there)...<br>

<br>

I could probably be talked into bumping it up to 64k if somebody measured<br>

numbers saying it would help something specific? </blockquote><div><br></div><div>i think i noticed this when i was looking into "where the time went" and saw that a 64KiB buffer was quite helpful, at least at the scale of "an entire Android build".</div><div><br></div><div>is it _worth_ it? don't know. what's the _optimal_ size? don't know. (it probably depends on the specific toy, and 4096 is clearly a sensible _lower_ bound...)</div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Triaging all the existing users<br>

isn't that big a deal. The linux pipe buffer plumbing changed to collate stuff<br>

so there's some internal copying larger granularity output might help that<br>

wasn't the case 10 years ago... but then we get back into the "output piped to<br>

less displays nothing for 3 minutes, and then it's a screenful" issue. Line<br>

buffered output is usually like ~60 bytes at a time, 4k is bigger than most<br>

whole text screens (ok, maybe half one of yours but still ballpark). And we can<br>

just as easily malloc a bigger scratch buffer as needed in any case where it<br>

matters...<br>

<br>

Rob<br>

_______________________________________________<br>

Toybox mailing list<br>

<a href="mailto:Toybox@lists.landley.net" target="_blank">Toybox@lists.landley.net</a><br>

<a href="http://lists.landley.net/listinfo.cgi/toybox-landley.net" rel="noreferrer" target="_blank">http://lists.landley.net/listinfo.cgi/toybox-landley.net</a><br>

</blockquote></div></div>
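<div dir="ltr"><div>for reference, the static_assert and _Alignas suggestions above could be sketched roughly like this. it's a minimal sketch, not toybox's actual code: the TOYBUF_SIZE macro, the 4096-byte alignment, the assertion message, and the helpers toybuf_size() and toybuf_is_aligned() are all assumptions for illustration. alignments that large are implementation-defined in C11, though gcc and clang generally accept them for globals. it uses the _Static_assert and _Alignas keywords directly, so no headers are needed.</div>
<pre>
```c
/* Sketch only: toybox declares toybuf in main.c; the alignment here is an assumption. */
#define TOYBUF_SIZE 4096  /* hypothetical name for the scratch buffer size */

/* Ask for 4096-byte alignment so the buffer cannot straddle a 4KiB boundary. */
_Alignas(TOYBUF_SIZE) char toybuf[TOYBUF_SIZE];

/* Fails the build if toybuf is resized without revisiting code that hardwires 4096. */
_Static_assert(sizeof(toybuf) == 4096, "toybuf users assume a 4096 byte buffer");

/* Hypothetical helpers so the same properties can also be checked at run time. */
unsigned long toybuf_size(void)
{
  return (unsigned long)sizeof(toybuf);
}

int toybuf_is_aligned(void)
{
  /* Pointer-to-integer conversion is implementation-defined but fine on common ABIs. */
  return ((unsigned long)toybuf % TOYBUF_SIZE) == 0;
}
```
</pre>
<div>with the assertion in place, changing the array size without updating the places that hardwire it becomes a compile error rather than a silent bug.</div></div>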