[Toybox] [PATCH] Factor out dhcpd's and syslogd's main loops into a library function

Fri Sep 6 15:35:27 PDT 2013

Rob Landley wrote:
> On 09/01/2013 08:54:27 AM, Felix Janda wrote:
> > Both syslogd and dhcpd wait for sockets, signals or a timeout.  
> > Currently,
> > they use select() with its timeout to listen to the sockets and a
> > self-pipe for the signals.
> > 
> > The new library function instead uses pselect() without a timeout and
> > alarm(). With pselect() signals are only unblocked during this system
> > call; so the xwrappers don't need to be aware about interrupted  
> > syscalls.
> 
> Are we normally blocking a lot of signals in other commands? Using the  
> 6 argument version instead of the 5 argument version addresses an issue  
> for which existing command?

If a signal occurs during the block it will still be delivered in the next
pselect loop. pselect implements exactly the signal unblocking I am talking
about above. We block all signals. But during the signal call some are
unblocked. When it returns they are again blocked and we can safely check
what has happened. See the article

https://lwn.net/Articles/176911/

for some discussion on the system call.

> > alarm() is used instead of the timeout argument to pselect() so that
> > the length of the intervals between timeouts is not affected by being
> > interrupted.
> 
> QEMU recently changed its timer system to _not_ use alarm:
> 
>    http://blog.alex.org.uk/2013/08/24/changes-to-qemus-timer-system/
> 
> I'm not convinced sending signals to ourselves is preferable to telling  
> the system call when to timeout. Of the three options (select adjusting  
> its timespec, us manually reading the clock and feeding pselect a  
> timeout based on when we next want to wake up, and us sending ourselves  
> an alarm), us sending ourselves an alarm and constantly having to reset  
> it seems like it has the most moving parts and the most things that can  
> go wrong.

select() adjusts its timespecs on linux. But that's only one possible
behavior specified by POSIX. Since we are already doing signals SIGALRM
seemed more elegant then constantly adjusting the timeout to me.

> Ok, maybe I've been slightly burned by SIGALRM before:
> 
>    http://landley.net/notes-2011.html#05-09-2011
> 
> > On SIGHUP syslogd restarts. Previously everything relevant was  
> > contained
> > in syslogd_main() and the restarting was implemented by a goto. Now a
> > longjmp() would be necessary. I instead used an xexec().
> 
> Since commit 955 (and pondering toysh to finally tackle that sucker  
> within a finite amount of time) I'm a bit worried that restarting a  
> command without OS involvement might misbehave if the signal mask gets  
> corrupted, or there are funky environment variables, or memory leaks  
> build up, or we have leftover filehandles, or a couple dozen other  
> things.
> 
> We haven't got "generic reset this process's state" code, longjmp()  
> leaves a lot of debris, and xexec() callign a new main doesn't clear  
> the old stuff off the stack; do that in a tight enough loop for long  
> enough and eventually you will run out of stack even in a normal linux  
> userspace. (Default's 8 megs: ulimit -s says 8192 and the bash man page  
> says the units are kilobytes.)
> 
> I added NOFORK because some commands (like cd) _must_ happen in the  
> current toysh process context, but also because I could very very  
> carefully audit commands to make sure it was safe to call them in the  
> same process context and continue when they return. But these commands  
> are in pending because I haven't even done an initial audit pass on  
> them.
> 
> Discarding your process on restart and execing a new one covers a  
> multitude of sins. Possibly syslogd is clean enough it can restart  
> itself without accumulating trash, but as a long-lived daemon I have  
> yet to closely read (hence it being in the pending directory), this  
> does not fill me with confidence.
> 
> Probably just the "it's 5am, sun coming up soon" things. I should step  
> away from the keyboard...

You are of course right. I think you have some good ideas below how to
fix it.

> > Since xexec()
> > doesn't change the pid the pidfile needs to be unlinked() before
> > terminating so that xpidfile() doesn't complain.
> 
> Um... I agree this is a symptom of a problem. I'm not sure that's the  
> right fix. The restart is maintaining state and then getting confused  
> by the state it's maintaining...
> 
> (I can just see some horrible systemd thing hitting a race window where  
> the PID file wasn't there and trying to restart the process...)

Does systemd still need PID files?

> > Furthermore if not run
> > in the foreground syslogd will daemonize() again when restarting.  
> > Could
> > one detect in daemonize() whether daemonizing is necessary?
> 
> "The restart is maintaining state and then getting..."
> 
> Are these the _only_ instances of this? Is a main loop that exits and  
> restarts itself but avoids initial setup better?
> 
> Is there a clean way to do this where we _don't_ have to be absolutely  
> sure we've thought of everything ever? execv("/proc/self/exe",  
> toys.argc) and then fall back to something else if it returns, perhaps?  
> Or can we get an exhaustive list of all process resources (from the  
> container suspend/resume guys, presumably) and check for ourselves that  
> we're not leaking anything?

So how about making the two helper functions for daemon_main_loop()
return a value and possibly returning from daemon_main_loop() based on
that value? Then in syslogd_main() add a infinite loop around (or almost)
everything.

> Very old concern of mine, a full decade before containers:
> 
>    http://lkml.indiana.edu/hypermail/linux/kernel/0206.2/0835.html
> 
> > Felix
> > 
> > # HG changeset patch
> > # User Felix Janda <felix.janda at posteo.de>
> > # Date 1378041908 -7200
> > # Node ID 752d22ece0fe3fa9763fd0b6e231e653cc173890
> > # Parent  587b7572aac2a64988503458155e1f2546672525
> > Factor out daemon select() loops into a library function
> 
> Ok, the new select() lib function I'd probably be up for reviewing on  
> its own, and the cleanups to a function I haven't even read yet I'd  
> just pass through and read/cleanup the new baseline when I got around  
> to it.

This patch really just implements the helper function and makes the two
toys use it. It's just a big function, causes the big main functions to
be split up and makes the self-pipe stuff unnecessary.

> But both at once? Not up for that right now. Might try again in the  
> morning...

I wouldn't consider syslogd to be of high priority.

On the other hand I'd like (or maybe would have liked some days ago)
to continue with cleaning it up. Now that the main function is not such
a monster anymore one can maybe inline more stuff. OTOH I'm still not
sure about the behavior when something goes wrong when the toy starts.

If we come to an conclusion here, I'll send you a new version of the
patch, right?

Felix

 1378506927.0