<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Apr 11, 2023 at 12:10 PM Rob Landley <<a href="mailto:rob@landley.net">rob@landley.net</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Rather than bury this in an obscure place on github and never be able to find it<br>
again, in reply to:<br>
<br>
<a href="https://github.com/landley/toybox/commit/aa88571a6b847a96bb8ee998a9868c5a1bdb3a6e#r108474092" rel="noreferrer" target="_blank">https://github.com/landley/toybox/commit/aa88571a6b847a96bb8ee998a9868c5a1bdb3a6e#r108474092</a><br>
<br>
> do you want a static_assert somewhere that toybuf is 4096 bytes? since that's<br>
> not necessarily the page size for arm64, say.<br>
> <br>
> (unrelated, i've been meaning to ask whether we should make toybuf larger<br>
> anyway. 4KiB is really small for modern hardware, though at the same time<br>
> it does make it more likely that we test all the "toybuf too small, loop"<br>
> cases even with small test inputs...)<br>
<br>
A) not a fan of asserts.<br></blockquote><div><br></div><div>i don't like assert(), but **static_assert** is really useful for things like this where you want to say "this code makes an assumption that you can test at compile time".</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
B) it was only ever coincidentally page size, and huge pages are a thing even on<br>
x86.<br></blockquote><div><br></div><div>well, huge pages are different from non-4KiB non-huge pages. i think it's only arm64 where you're at all likely to actually have your page size not be 4KiB. (all macs and iphones, for example. i _think_ all the linux distros that tried to move gave up?)</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
I never annotated toybuf or libbuf with any sort of alignment directive or tried<br>
to make it come first in its segment (toybuf and libbuf are the fifth and sixth<br>
globals defined in main.c), so they're both reasonably likely to straddle page<br>
boundaries anyway. Heck, I'm not even sure it's cache line aligned. The actual<br>
_guarantee_ is something like 4 bytes, except when it suddenly isn't. I fought<br>
with this in 2021 trying to get a simple "hello world" kernel out of gcc without<br>
needing a linker script: <a href="https://landley.net/notes-2021.html#12-04-2021" rel="noreferrer" target="_blank">https://landley.net/notes-2021.html#12-04-2021</a></blockquote><div><br></div><div>now you're on C11, you can easily say this: <a href="https://en.cppreference.com/w/c/language/_Alignas">https://en.cppreference.com/w/c/language/_Alignas</a><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>
The 4096 is just a convenient scratch pad size. I use sizeof(toybuf) in a bunch<br>
of places... and hardwire in the knowledge of its size in a bunch of others.<br>
Plus there's a bunch of implicit "toybuf and/or this slice of it is big enough<br>
to stick this struct in, so I can safely typecast the pointer" instances I<br>
checked at the time (and all of them had a big fudge factor in case of future<br>
glibc bloat).<br>
<br>
It's really a "convenient granularity" thing. Copy loops doing byte-at-a-time<br>
stuff is known terrible because the library and syscall execution paths come to<br>
dominate, and grouping it into 4k blocks is 12 doublings of efficiency right<br>
there. Going to 64k is 1/16th as many syscalls, which is not as big a deal as<br>
1/4000th as many syscalls. And then raises the question "why not a megabyte<br>
then" which is something you don't just casually want to do on embedded devices<br>
without thinking about it (might as well malloc there)...<br>
<br>
I could probably be talked into bumping it up to 64k if somebody measured<br>
numbers saying it would help something specific? </blockquote><div><br></div><div>i think i first noticed this when i was looking into "where the time went" and saw that a 64KiB buffer was quite helpful, at least on the scale of "an entire Android build".</div><div><br></div><div>is it _worth_ it? don't know. what's the _optimal_ size? don't know. (it probably depends on the specific toy, and 4096 is clearly a sensible _lower_ bound...)</div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Triaging all the existing users<br>
isn't that big a deal. The linux pipe buffer plumbing changed to collate stuff<br>
so there's some internal copying larger granularity output might help that<br>
wasn't the case 10 years ago... but then we get back into the "output piped to<br>
less displays nothing for 3 minutes, and then it's a screenful" issue. Line<br>
buffered output is usually like ~60 bytes at a time, 4k is bigger than most<br>
whole text screens (ok, maybe half one of yours but still ballpark). And we can<br>
just as easily malloc a bigger scratch buffer as needed in any case where it<br>
matters...<br>
<br>
Rob<br>
</blockquote></div></div>