[Toybox] PM: code style, was: Re: New Subscriber

Rob Landley rob at landley.net
Mon Feb 6 04:29:02 PST 2012


On 02/05/2012 04:31 PM, Frank Bergmann wrote:
> er... forwarded to the list.
> 
> On Sun, Feb 05, 2012 at 11:28:39PM +0100, Frank Bergmann wrote:
>> Hi Rob,
>>
>> On Sun, Feb 05, 2012 at 12:07:57PM -0600, Rob Landley wrote:
>>>> "uid=%u(%s) gid=%u(%s)\n", <real user ID>, <user-name>,
>>>>    <real group ID>, <group-name>
>> er... I remember now. :-(

That the real world has complex output formats, or that id's output
looks like that?

>>> above output sequence for "id" with no arguments would be 9 separate
>>> output statements without printf.  (Or 9 statements assembling a string,
>>> of who knows what length, and then a tenth to write it out.)
>>
>> Yes, this is a source code blow-up. But using your own calls it can also
>> be easily readable. I remember such sequences of statements in the source
>> of fnord (a fast and small webserver). This is an excerpt:
>>       buffer_puts(buffer_1,"http://");
>>       buffer_puts(buffer_1,host);
>>       buffer_puts(buffer_1,"/");
>>       buffer_puts(buffer_1,url);
>>       buffer_puts(buffer_1,"/\r\n\r\n");

I fail to see how that's an improvement.

Replacing one function call, which fits on maybe two lines, with half a
screen full of function calls (at 80x25 terminal size) means less code
fits on the screen at once so you've got a lot more scrolling to see the
same amount of program in front of you. When the function call in
question is part of the C standard and the replacement isn't...

And if you really really care how many write function calls:

  sprintf(buffer_1, "http://%s/%s/\r\n\r\n", host, url);

Or xmsprintf() in lib/lib.c will xmalloc() the appropriate amount for you.
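
A minimal sketch of the allocating approach, assuming nothing about the
actual lib/lib.c code: alloc_sprintf() is a made-up name, and this version
simply asks vsnprintf() for the length first and then formats into a
buffer of exactly that size.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for an allocating sprintf: measure, allocate,
 * format. Exits on failure so callers never check the return value. */
char *alloc_sprintf(const char *fmt, ...)
{
  va_list va, va2;
  int len;
  char *ret;

  assert(fmt);
  va_start(va, fmt);
  va_copy(va2, va);

  /* First pass: a NULL buffer of size 0 only reports the needed length. */
  len = vsnprintf(NULL, 0, fmt, va);
  va_end(va);
  if (len < 0 || !(ret = malloc(len + 1))) {
    perror("alloc_sprintf");
    exit(1);
  }

  /* Second pass: format into the exactly-sized buffer. */
  vsnprintf(ret, len + 1, fmt, va2);
  va_end(va2);

  return ret;
}
```

The caller does alloc_sprintf("http://%s/%s/\r\n\r\n", host, url), writes
the result out, and free()s it. No guessing at buffer sizes.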

>> You know what it means and you get just one syscall. Using printf() it is
>> very hard to guess, how many syscalls one statement will use.

And that's a "performance over simplicity" mis-optimization.  I only
care how many system calls it is when it's a hot path, and the stdio
FILE structure has a built-in buffer to batch that stuff up a bit.

>>> Keep in mind that my primary design goal is _simplicity_, then size,
>>> speed, and features. You have to trade these off against each other when
>>
>> Yes, that was my question: Which kind of simplicity. ;-)

Well, for one thing the above functions aren't in the standard C
library, and the ones I'm using are.

I've been wrapping a number of standard library calls (heck, there's an
xprintf()) but using simple rules like "anything that starts with an x
will perror_exit() if it does not succeed, so you never have to check
the return value for failure".  My largest divergence there is ditching
getopt(), but that's because my new infrastructure does all the work for
each command before it even runs, and getopt() required the command to
have a switch statement.
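
To illustrate the "starts with an x" rule, here is a rough sketch of the
pattern. The names die(), x_malloc() and x_fopen() are invented for this
example and are not the toybox functions; the point is only that the
failure check lives in one place and the caller gets either a valid
result or no return at all.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: report the failing operation with the errno
 * text, then exit with a non-zero status. */
static void die(const char *what)
{
  perror(what);
  exit(1);
}

/* malloc() that never returns NULL to the caller. */
void *x_malloc(size_t size)
{
  void *p = malloc(size);

  if (!p) die("malloc");

  return p;
}

/* fopen() that never returns NULL to the caller. */
FILE *x_fopen(const char *path, const char *mode)
{
  FILE *f;

  assert(path && mode);
  f = fopen(path, mode);
  if (!f) die(path);

  return f;
}
```

Callers just use the result: FILE *f = x_fopen(name, "r"); with no if
statement after it.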

With your calls above, presumably you have a reason for not using
"strcat()" instead.  Your buffer has a length but I don't know what it
is.  Does it automatically flush itself or is there an implicit "now
write all this out" that you didn't include, and if so is filling the
buffer an error or an auto-flush that's going to do an implicit system
call anyway?

What's handling write failures, both from the other end prematurely
hanging up and from SIGSTOP causing short writes, possibly zero
length, where you have to check errno to distinguish the short write
from EOF?  And if you don't get the short writes correct then "tar c
blah | gzip | ssh" suddenly corrupts the tarball if you ctrl-z and then
fg the pipeline...
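
For reference, the usual way to cope with short writes is a loop along
these lines. This is a sketch, not code from toybox or fnord, and
write_all() is a name made up for the example.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Write all len bytes to fd, retrying after short writes and EINTR.
 * Returns len on success, -1 on error with errno set by write(). */
ssize_t write_all(int fd, const void *buf, size_t len)
{
  const char *p = buf;
  size_t done = 0;

  assert(fd >= 0);
  while (done < len) {
    ssize_t n = write(fd, p + done, len - done);

    if (n < 0) {
      if (errno == EINTR) continue;  /* interrupted before any data: retry */
      return -1;                     /* real error, e.g. EPIPE or ENOSPC */
    }
    done += n;                       /* short write: go around for the rest */
  }

  return (ssize_t)done;
}
```

Every caller that does a bare write() and ignores the return value is
skipping all of that.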

Yes, toybox has a lot of implicit knowledge like that too, such as
toybuf being page sized (4k, although not necessarily page aligned, but
that's still an amount of data in a single transaction that caches tend
to be tuned for), and us relying on sigaction(SA_RESTART) to avoid zero
length read/write problems and having to check errno everywhere (which isn't
fully implemented yet but is the goal).  I know what it all gets us, and
have tried to document most of it.  I'm not sure what this "I don't like
printf, let me open-code every occurrence of it" gets you.  Why not
write a better printf?
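
As a sketch of the SA_RESTART part (an illustration only, since the
toybox side isn't fully implemented yet; on_signal() and
install_restarting_handler() are invented names): installing a handler
with sigaction() and the SA_RESTART flag asks the kernel to restart
interrupted calls instead of failing them with EINTR. Which calls
actually get restarted depends on the call, see signal(7).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Placeholder handler: does nothing, it only needs to exist so the
 * signal is caught instead of taking its default action. */
static void on_signal(int sig)
{
  (void)sig;
}

/* Install on_signal() for signo with SA_RESTART set.
 * Returns 0 on success, -1 on error (same as sigaction()). */
int install_restarting_handler(int signo)
{
  struct sigaction sa;

  assert(signo > 0);
  memset(&sa, 0, sizeof(sa));
  sa.sa_handler = on_signal;
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = SA_RESTART;

  return sigaction(signo, &sa, NULL);
}
```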

(P.S. if you look at echo, xprintf(), xputc(), and xflush() all exist
because "echo > /dev/full" is supposed to return with a non-zero error
code.  This is the sort of detail I'm trying to get right.)
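
The /dev/full case can be sketched like this. checked_puts() is a
hypothetical name, not one of the toybox wrappers, and it returns a
status instead of exiting so the behavior is easy to see: the write into
the stdio buffer usually succeeds, and the error only shows up when the
buffer is flushed.

```c
#include <assert.h>
#include <stdio.h>

/* Write s to out and flush it. Returns 0 on success, 1 if the write,
 * the flush, or the stream's error flag reports a failure. */
int checked_puts(FILE *out, const char *s)
{
  assert(out && s);
  if (fputs(s, out) == EOF) return 1;
  if (fflush(out) == EOF) return 1;  /* ENOSPC from /dev/full lands here */

  return ferror(out) ? 1 : 0;
}
```

A command would turn that 1 into its exit code, so "echo > /dev/full"
fails the way it's supposed to.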

>>> And I want to let you build individual commands and have _those_ be as
>>> small as possible, but in doing so I assume those will be dynamically
>>> linked.  I am not optimizing for the "multiple individual executables,
>>
>> Even if you replace some "evil" libcalls of stdio with your own (internal)
>> lib you can still make the code smaller.

You mean like the xblah() wrappers so I avoid testing return codes, or
the way xmsprintf() is doing mostly the same thing asprintf() is but
that's one of the gnu/dammit extensions that I didn't want to rely on?
(And really, I mean to replace getline() too but that brings us to the
"must write line editing with command history for both the shell and vi,
and yes SUSv4 specifies vi".)

>> I always test my binaries after
>> big changes with different clibs and statically/dynamically and screw them
>> up with strace and/or ltrace. That's not every day business but doing so
>> gives you sometimes surprising results. :-)

Notice the prebuilt strace binaries for a bunch of different hardware
targets at http://landley.net/aboriginal/bin because yes, strace is
extremely useful. I don't like to look at the gnu tool source to see how
they implemented anything (*shudder*), but I do occasionally run the
command my distro comes with under strace.

But really, "minimal number of syscalls" is speed to me, not simplicity.
Hello world is simple code, if the libc it's linked against chooses to
do a separate syscall for every character of output that's libc's
problem. If performance sucks I'll look into it, but I expect some
variation from libc to libc and if it sucks _fix_your_libc_ or use a
better one.

>>> Never heard of either, but in general:
>>> The external dependencies of BusyBox (under my tenure):
>>>   libc
>>> The external dependencies of toybox:
>>>   libc
>>
>> These libs are meant for internal usage but they are meant only as
>> examples. In my tools I replaced much stdio-stuff with own "lib" routines
>> and still have only libc as dependency. This lowers side effects when
>> people use very different c-libs.

I'm coding for Linux, and now android. On Linux, klibc is laughable,
dietlibc is broken, uClibc I'm testing against, the musl developers
presumably fix their stuff when they find something wrong with it, and
bionic I expect to have to work around but that's with some sort of
lib/makeitstop.c that implements things like "printf" according to the
darn standard.

I have a vague academic interest in MacOS X but I've already got mdev
and unshare and last time I wrote mount it had --bind and --move and
such, and used /proc/mounts instead of the obsolete /etc/mtab, all of
which are very linux specific.

>>> if (flags & (FLAG_u|FLAG_g))
>>>   printf("%d\n", (flags & FLAG_u) ? uid : gid);
>>> else printf("%d %d\n", uid, gid);
>>
>> What about the syscalls? ;-)

Presumably it will make them for me.

Syscalls under linux are way way way cheaper than under crap like
Slowaris. The whole kernel's backed by a hugepage with a permanent TLB
entry so the transition's basically free from a cache standpoint, they
upshifted from int 80 to SYSENTER to save a couple clock cycles, and
this is 10 years _after_ doing optimizations like
http://cryptnet.net/mirrors/texts/kissedagirl.html that other operating
systems just ignored...

This is the kind of performance optimization work they were doing 10
years ago:

  http://kerneltrap.org/node/384

And these days, most Linux system calls don't even cause a context
switch (I.E. cache flush and page table walk):

http://blog.tsunanet.net/2010/11/how-long-does-it-take-to-make-context.html

That has gettid() at 100 nanoseconds or less...
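
If you want to check a number like that on your own hardware, a rough
sketch follows. It times getppid() rather than gettid(), since getppid()
is plain POSIX and has to enter the kernel each time; ns_per_getppid()
is a name invented for the example, and the result will vary by machine,
kernel, and libc.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>
#include <unistd.h>

/* Average wall-clock nanoseconds per getppid() call over the given
 * number of iterations. Loop overhead is included in the figure. */
double ns_per_getppid(long iterations)
{
  struct timespec start, end;
  volatile pid_t sink;  /* keeps the calls from being optimized away */
  long i;

  assert(iterations > 0);
  clock_gettime(CLOCK_MONOTONIC, &start);
  for (i = 0; i < iterations; i++) sink = getppid();
  clock_gettime(CLOCK_MONOTONIC, &end);
  (void)sink;

  return ((end.tv_sec - start.tv_sec) * 1e9
          + (end.tv_nsec - start.tv_nsec)) / iterations;
}
```

Call it with a million iterations or so and print the result with
printf("%.1f ns\n", ...).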

Rob


