[Toybox] Impact of global struct size

Rob Landley rob at landley.net
Thu Jan 4 18:27:48 PST 2024


On 1/4/24 18:37, enh wrote:
> On Thu, Jan 4, 2024 at 10:05 AM Rob Landley <rob at landley.net> wrote:
>>
>> On 1/3/24 12:19, Mouse wrote:
>> >> (The line between PIE and dynamic linking confuses even me.  How does
>> >> static PIE relocate itself?
>> >
>> > It may not.  It could get relocated by in-kernel ASLR or the like.
>> > Also, I think PIE isn't relevant, or certainly isn't _as_ relevant, to
>> > the final executable; my impression is that it's more important for
>> > library code, so it doesn't need fixups.  These are less important for
>> > static executables, since the fixups there happen once, at link time,
>> > whereas for a .so the fixups happen at runtime and reduce the
>> > text-segment sharing that is one of the benefits of shared objects.
>>
>> I want https://www.muppetlabs.com/~breadbox/software/tiny/teensy.html but a
>> walkthrough for the kernel's ELF loader. (I've had to walk through it MYSELF
>> several times, but I didn't do writeups afterwards so forgot it all.)
> 
> (yeah, and the one i've done for that and for the libc side of things
> were both just google-internal talks, so there's no record of them
> anywhere :-( )

I've stopped going to conferences that don't record and post the talks.

Then there's a meta-problem of INDEXING all this information. Which is what I
tried to tackle when I got the Linux Foundation documentation fellowship in
2007, but...

https://landley.net/notes-2007.html#15-11-2007

(Their Problem was Jon Rogers' old chestnut, "A goal is not a plan." I basically
finally convinced them "this is what actually needs to be done" and they went
"Huh, yeah. You're right. We're not interested in funding that." They wanted an
author and NEEDED a librarian.)

It doesn't matter if documentation exists that nobody can FIND. I'm weird in
that I spent a whole project tracking down
https://landley.net/notes-2007.html#13-10-2007 and
https://landley.net/notes-2007.html#29-09-2007 and
https://landley.net/notes-2007.html#07-09-2007 and
https://landley.net/notes-2007.html#14-06-2007 and actually READ through the
backlog of kernel-traffic and https://lwn.net/Kernel/Index/ and the linux
journal articles back when the web page had an index of them (which I could
probably fish out of archive.org if I tried...) I collected zillions of links at
https://landley.net/kdocs/ and many were links to other indexes! (They NEST!)
But it's all bit-rotted. I haven't even set up a man7.org replacement web page
builder, and that's a three day weekend's work, tops.

I'm out of the habit of speaking at conferences (there was a pandemic), really I
should just get on a regular local schedule of Posting Crap Videos To Youtube.
NOT trying to polish them but just get them out regularly and then later string
together playlists of the less bad ones. (I can blather much
stream-of-consciousness! You think this is bad, you should meet me in person!
Elliott was subject to this at a lunch once, and I was NOT sleep deprived, and
on my best behavior for that.)

https://web.archive.org/web/20130123001143/http://www.homeonthestrange.com/view.php?ID=28

(Except... not Youtube. They've gone septic. And setting up peertube is one of
those blocking todo items.)

> i've been meaning to tell you, apropos something you said on your blog
> about ARG_MAX (for xargs?), that the kernel changed how that works
> recently... see
> https://android.googlesource.com/platform/bionic/+/main/tests/unistd_test.cpp#1128
> for more detail and links.

Define "recently"? 2.6.23 was 2007.

Assuming I haven't missed something, here's from my giant dirty tree:

--- a/lib/env.c
+++ b/lib/env.c
@@ -8,14 +8,20 @@ extern char **environ;
 // Returns the number of bytes taken by the environment variables. For use
 // when calculating the maximum bytes of environment+argument data that can
 // be passed to exec for find(1) and xargs(1).
-long environ_bytes(void)
+long child_env_free(char **argv)
 {
-  long bytes = sizeof(char *);
+  struct rlimit lim;
+  long bytes = 2*sizeof(char *); // NULL array terminators for argv and envp
   char **ev;

-  for (ev = environ; *ev; ev++) bytes += sizeof(char *) + strlen(*ev) + 1;
+  // Since 2.6.25, Linux's env limit has been 1/4 stack, with 32 page minimum.
+  // sysconf(_SC_ARG_MAX) is unreliable (compile time value, not probed)

-  return bytes;
+  getrlimit(RLIMIT_STACK, &lim);
+  if (argv) for (ev = argv; *ev; ev++) bytes += sizeof(char *)+strlen(*ev)+1;
+  for (ev = environ; *ev; ev++) bytes += sizeof(char *)+strlen(*ev)+1;
+
+  return (lim.rlim_cur/4)-bytes;
 }
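
(For anyone who wants to poke at the arithmetic outside the toybox tree, here is a minimal standalone sketch of the same calculation. The names vector_bytes() and exec_budget_left() are made up for illustration and are not toybox functions. It ASSUMES the limit really is 1/4 of the soft RLIMIT_STACK and that the pointer arrays count against it, which is exactly the part that is still unconfirmed.)

```c
#include <string.h>
#include <sys/resource.h>

// Bytes one NULL-terminated string vector (argv[] or envp[]) would occupy:
// a pointer plus the string and its NUL per entry, plus the NULL terminator.
// (Hypothetical helper, not part of toybox.)
long vector_bytes(char **vec)
{
  long bytes = sizeof(char *);

  if (vec) for (; *vec; vec++) bytes += sizeof(char *) + strlen(*vec) + 1;

  return bytes;
}

// Space left for more arguments, ASSUMING the budget is 1/4 of the soft
// RLIMIT_STACK and that the pointers come out of it too. A negative result
// means over budget (or that getrlimit() failed, reported as -1).
long exec_budget_left(char **argv, char **envp)
{
  struct rlimit lim;

  if (getrlimit(RLIMIT_STACK, &lim)) return -1;

  return (long)(lim.rlim_cur/4) - vector_bytes(argv) - vector_bytes(envp);
}
```

Calling exec_budget_left(argv, environ) gives the same kind of number as the patched function above, and splitting out vector_bytes() makes it easy to see how much of the total is pointers versus string data.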

And my hangup on that was probably in the same November 24 entry you're replying
to, where I was trying to figure out where the argv[] and envp[] pointer arrays
DO live, whether they come out of the same budget, and whether anything ELSE on
the stack comes out of that budget or out of the magic 2 pages at the start that
everything gets as "kernel stack". Trying to printf("%p") the pointers wasn't as
enlightening as I'd hoped (not remotely adjacent), which led me to reading the
kernel code to try to figure out what the stack layout actually WAS, which got
sidetracked...
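
(The printf("%p") experiment can be sketched roughly as below. show_exec_layout() is a hypothetical helper name. It only prints addresses so they can be compared by eye, and makes no claim about what the kernel's layout is or which parts count against the limit.)

```c
#include <stdio.h>

// Print where the two pointer arrays live and where some of the strings they
// point to live, so the ranges can be compared. Returns the number of
// environment entries found. (Hypothetical helper for illustration.)
int show_exec_layout(int argc, char **argv, char **envp)
{
  int envc = 0;

  while (envp[envc]) envc++;

  // Each array has one extra slot for its NULL terminator.
  printf("argv[] array:  %p to %p\n", (void *)argv, (void *)(argv+argc+1));
  printf("envp[] array:  %p to %p\n", (void *)envp, (void *)(envp+envc+1));
  if (argc) printf("argv[0] text:  %p\n", (void *)argv[0]);
  if (argc) printf("last arg text: %p\n", (void *)argv[argc-1]);
  if (envc) printf("envp[0] text:  %p\n", (void *)envp[0]);

  return envc;
}
```

This would be called from main() with argc, argv and the environment pointer (either the nonstandard third argument to main() or extern char **environ). Comparing the array ranges against the string addresses shows whether the two live next to each other on a given system.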

Basically I want to know what struct is at the end of the stack (a sequenced
collection of structs and arrays is conceptually in an encapsulating struct),
and where does "1/4 stack size" _start_ measuring from. (From the actual end, or
does some of the data there "not count"? The Debian xargs behavior implies it's
_just_ measuring the strings, but if so I could feed it an argv[] of a couple
million "" and blow the stack because each of those is 8 bytes of argv[] to
point at 1 byte of NUL terminator, and restricting _that_ to 1/4 the stack would
try to write off the end of it. I'm pretty sure somebody would have noticed by
now...)
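
(To put numbers on the empty-string case, here is a small sketch. It is just arithmetic, with the struct and function names invented for illustration. The 8 MiB stack and 8-byte pointers used in the note after it are assumptions about a typical 64-bit setup, not something measured.)

```c
// Footprint of an argv[] holding n empty strings. (Hypothetical names.)
struct footprint {
  unsigned long long strings;   // bytes of string data: one NUL per argument
  unsigned long long pointers;  // bytes of argv[], including the NULL at the end
};

struct footprint empty_args_footprint(unsigned long long n, unsigned ptr_size)
{
  struct footprint f;

  f.strings = n;                                      // "" is a lone NUL byte
  f.pointers = (n+1)*(unsigned long long)ptr_size;    // n entries plus terminator

  return f;
}
```

With n = 2000000 and ptr_size = 8 this gives 2000000 bytes of strings and 16000008 bytes of pointers. Assuming an 8 MiB stack (8388608 bytes), a quarter is 2097152 bytes, so the strings alone fit under the quarter-stack budget while the pointer array alone is larger than the entire stack. That is why a limit that counted only the strings could not be the whole story.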

>> I suppose I should start by reading his dynamic version:
>>
>> https://www.muppetlabs.com/~breadbox/software/tiny/somewhat.html

To clarify, I DID read that at one point, but most of it went "woosh" over my
head at the time...

Rob

P.S. Documentation eats SO much energy. Somebody on mastodon said "the people
who talk about [some technical thing] aren't the people who do it" and I'm going
"Yeah, there's a REASON for that". I cited
https://web.archive.org/web/20110227151752/http://gnumonks.org/~laforge/weblog/2006/02/22/
from http://landley.net/notes-2007.html#20-11-2007 because I was TRYING to hand
that stuff off to avoid
https://www.uarts.edu/neil-gaiman-keynote-address-2012#:~:text=There%20was%20a%20day
. The downside is, you THINK you're handing off to Eben Moglen and then Bradley
Kuhn shoves him aside and files a lawsuit against Cisco that makes them
permanently end Linux development on Linksys, and taking control BACK takes...
some doing. (I _checked_ that Eben had sobered up and distanced himself from the
FSF! I forgot to ask about the people who worked for him...) Today I received an
email entitled 'Some inaccuracies in the current "Understanding the bin,
sbin, usr/bin, usr/sbin Split" page' which is why
https://landley.net/writing/unixpaths.html is now 404. That page was _already_
an updated version of a more than decade old busybox post I'd made that went
viral
(https://web.archive.org/web/20160620091935/https://news.ycombinator.com/item?id=3519952)
and when
https://web.archive.org/web/20160313110628/http://hackermonthly.com/issue-22.html
asked to republish it and I went "let me do some quick fixes" (the root
partition was on an internal 0.5 MB disk, rk-05 disk packs are 2.5 megs, so
"adds up to 3" was right but not that way, and here's citations to two original
posts on Dennis Ritchie's website talking about this so you can read it yourself
if I got anything else wrong...) and the magazine sent me a PDF of my nicely
typeset article version for proofreading, and I converted that back to text so I
had an html version of the updated one for landley.net/writing... And then
somebody goes "I was there at Sun Microsystems and ACTUALLY we deserve the
credit for these four things" and I just deleted the page because I haven't got
the spoons. That's the easiest way to make it NO LONGER WRONG. If he wants to
argue with the 15 year old busybox post that was off the top of my head and had
no citations, take it up with Denys. I ALREADY feel squeamish that I gave
Lennart Poettering ideas (he CITED me in
https://www.freedesktop.org/wiki/Software/systemd/TheCaseForTheUsrMerge/)...

