[Toybox] fun with vfork

Rob Landley rob at landley.net
Thu Oct 13 10:08:51 PDT 2016


On 10/13/2016 03:25 AM, Josh Gao wrote:
> On Wed, Oct 12, 2016 at 5:49 PM, Rob Landley <rob at landley.net> wrote:
> 
>     So... no? I think? Is there a way _I_ can tag this? (I can't do my own
>     vfork prototype because I can't #undef the one I get out of unistd.h and
>     that's a fairly generic header. It's sad I can't redo function
>     prototypes after the fact, but the language never gave me a way to.
>     Maybe I could do a gratuitous wrapper around it?)
> 
> 
> gcc seems to do the equivalent automatically for any function named
> "vfork": https://github.com/gcc-mirror/gcc/blob/e11be3ea01eaf8acd8cd86d3f9c427621b64e6b4/gcc/calls.c#L533

If so, it's not working.

> On Thu, Oct 13, 2016 at 1:03 AM, Rob Landley <rob at landley.net> wrote:
> 
>     If I have to gratuitously call setjmp() and ignore its return value
>     right before calling vfork() to beat reliable behavior out of gcc, I can
>     do that. I can also use global variables instead of local variables, or
>     make a structure of local variables so gcc can't gratuitously reorganize
>     them and trim the stack, or have my one allowed <strike>phone</strike>
>     function call be to a function I define that contains "everything the
>     child does" to preserve the stack context.
> 
>     Personally, I'd rather the compiler didn't fight me when I'm trying to
>     do something obvious, but I have LOTS of ways to fight back. :)
> 
>  
> It doesn't seem like gcc differentiates between vfork, setjmp, etc. so
> it's presumably providing some behavior that satisfies the constraints
> of all of them (or there's a bug).

Given how few gcc developers do nommu development, I'm amazed there
isn't even more bit-rot here.

> The specification for longjmp says
> that only non-volatile local variables that get modified have
> unspecified values, so you could maybe try sprinkling volatile on things
> to see if it makes your problem go away?

That's not an approach I'm comfortable with.
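
For reference, the longjmp rule quoted above can be shown with a minimal
sketch. The names (env, bounce, count_after_longjmp) are hypothetical
and only illustrate the rule; this is not toybox code. A local that is
modified between setjmp() and longjmp() only has a specified value
afterwards if it is declared volatile:

```c
#include <setjmp.h>

static jmp_buf env;

/* Hypothetical helper: jump back to the setjmp() point. */
static void bounce(void)
{
  longjmp(env, 1);
}

int count_after_longjmp(void)
{
  /* Without volatile, the value of count after the longjmp() would be
     unspecified, because it is modified after setjmp() was called. */
  volatile int count = 0;

  if (setjmp(env) == 0) {
    count = 42;
    bounce();
  }

  return count;
}
```

Whether the same trick is sufficient for vfork() is exactly what is in
dispute here; the sketch only shows what the suggestion amounts to.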

Rich and I were arguing about this on IRC. I tend to respond to this
sort of thing with research, trying to understand what the system is
actually doing under the covers and what behavior the kernel provides
(and whether it falls under the "stable ABI" guarantees), and then build
up from there. If I have to call clone() myself to GET a stable
guaranteed ABI, I'm ok with that. Dig down, find the problem, and fix it.

But ever since C++ developers took over compiler development and started
rewriting C compilers in C++, they've tried to turn C into C++. C is a
portable assembly language with minimal abstraction between what your
program says and what the machine does. C++ is a giant pile of magic
behavior handed down from the gods and only understandable to the
sufficiently initiated so there's piles of "you must do this, ask not
why" all over the place.

Because of this the compiler "optimizing" away a memset() so your
encryption keys stay in memory, or saying you can't typecast and
dereference something without a gratuitous intermediate variable, is now
expected, and the range of such unclean behavior is ever-expanding.

Before they started rewriting compilers in C++? Not so much. Bits of
modern compilers are untrustworthy, and that is a PROBLEM, not something
to just accept. I want to hit them with a large enough brick to make
them behave. Disabling entire categories of optimization comes to mind,
although I haven't found a way to do that here because I depend on dead
code elimination.

I don't like sprinkling typecasts around to fix problems. "volatile"
falls under the same heading: the kernel guys use memory barriers
instead of volatile because volatile is almost meaningless.
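
As an illustration of the barrier approach applied to the memset() case
mentioned earlier, here is a minimal sketch. It assumes gcc or clang
(the inline asm syntax is a compiler extension), and wipe() is a
hypothetical name. The empty asm statement tells the optimizer that the
buffer may be read at that point, so the preceding memset() is not a
dead store it can drop:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: zero a buffer so the optimizer keeps the store. */
void wipe(void *buf, size_t len)
{
  memset(buf, 0, len);

  /* Compiler barrier (gcc/clang extension, an assumption of this sketch):
     emits no instructions, but the "memory" clobber plus the pointer
     input make the compiler treat the buffer contents as observed. */
  __asm__ __volatile__("" : : "r"(buf) : "memory");
}
```

This is a compiler-level barrier only. It emits no fence instruction and
says nothing about ordering between CPUs.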

Rich is trying to convince me to use posix_spawn(), a function so
obscure Ubuntu 14.04 hasn't even got a man page for it:

  $ man posix_
  posix_fadvise    posix_fallocate  posix_memalign   posix_openpt
  $ man posix_spawn
  No manual entry for posix_spawn

His argument seems to be that posix_spawn() is magic that can't be
implemented in C, which is odd because musl-libc is implemented in C. I
just checked and under the covers, posix_spawn() is calling clone()
itself, with the normal vfork flags (CLONE_VM|CLONE_VFORK) but providing
its own (1024 byte) stack.
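
A rough sketch of that technique, assuming Linux and the glibc/musl
clone() wrapper. The names spawn_child() and spawn_wait() are
hypothetical, and the 64k stack is a deliberately generous assumption
rather than what musl uses. It also skips the signal masking a real
posix_spawn() implementation would need:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: everything the child does, on its own stack. */
static int spawn_child(void *arg)
{
  char **argv = arg;

  execvp(argv[0], argv);
  _exit(127); /* only reached if the exec failed */
}

/* Hypothetical helper: run argv[], return its exit status or -1. */
int spawn_wait(char **argv)
{
  size_t size = 65536; /* assumption: far more than strictly needed */
  char *stack = malloc(size);
  int status = -1;
  pid_t pid;

  if (!stack) return -1;

  /* The stack grows down on most architectures, so pass the top of it.
     CLONE_VM shares the address space, CLONE_VFORK suspends the parent
     until the child execs or exits, SIGCHLD makes waitpid() work. */
  pid = clone(spawn_child, stack + size, CLONE_VM | CLONE_VFORK | SIGCHLD,
              argv);

  if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status))
    status = WEXITSTATUS(status);
  else status = -1;

  /* Safe to free: the parent only resumed after the child left the stack. */
  free(stack);

  return status;
}
```

Because the child runs on a separate stack, the compiler's assumptions
about the parent's local variables never come into play.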

*shrug* I can do that. Or I could do my "one function call" approach so
the compiler's liveness analysis of local variables doesn't come into
play. (I HAVE to be able to do one function call after vfork with
semi-arbitrary complexity IN that function call, or I couldn't do
execlp(), which is implemented as a loop trying all the $PATH locations.
That counts as "vfork and exec". So the logical next bigger hammer is
"wrapper function". Of course to avoid the compiler screwing THAT up I
have to disable automatic inlining even in LTO mode, which is the next
round of C++ developers proving C is no better than C++ by breaking it
and going "see!?!??!". Possibly dereferencing a function pointer is
sufficient to force this; let's see how AI-complete their optimizer is.
If I have to do math on the function pointer, I can. But switching the
variables in question to globals needs a smaller comment.)
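
The "wrapper function" approach could look something like the sketch
below. The names child_exec() and vfork_run() are hypothetical. The
noinline attribute is a gcc/clang extension used here as one possible
way to keep the call out of line; whether it holds up under LTO on every
compiler is an assumption, not something verified here. Note that POSIX
itself only blesses _exit() and the exec family after vfork(), so this
leans on behavior the standard does not guarantee:

```c
#define _DEFAULT_SOURCE
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: everything the child does. Never returns, so the
   child never touches the parent's stack frame after this call. */
static void __attribute__((noinline, noreturn)) child_exec(char **argv)
{
  execvp(argv[0], argv);
  _exit(127); /* only reached if the exec failed */
}

/* Hypothetical helper: run argv[], return its exit status or -1. */
int vfork_run(char **argv)
{
  int status;
  pid_t pid = vfork();

  if (pid < 0) return -1;
  if (!pid) child_exec(argv); /* the child's one function call */

  /* Parent: resumes here once the child has exec'd or exited. */
  if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status)) return -1;

  return WEXITSTATUS(status);
}
```

The child modifies no locals in vfork_run() itself, which is the point:
there is nothing for the compiler's liveness analysis to reorder.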

But feeding "hints" to the compiler a la "register" or "inline" that the
compiler is free to ignore, and which butterfly-effect a magic brownian
motion generator that changes version to version? Not a fan. And neither
are the kernel devs:

  https://lwn.net/Articles/166172/

And that's before you get to "this also needs to work on llvm, and
should someday work on other compilers if things like libfirm ever wind
up mattering".

Rob

