[Toybox] PM: code style, was: Re: New Subscriber

Frank Bergmann toybox at tuxad.com
Mon Feb 6 07:31:08 PST 2012


Hello,

On Mon, Feb 06, 2012 at 06:29:02AM -0600, Rob Landley wrote:
> That the real world has complex output formats, or that id's output
> looks like that?

... that id's output should look like that (according to POSIX). I know the
differences between POSIX and e.g. GNU only as described in man-page
sections 2 and 3. :-)

> I fail to see how that's an improvement.

- smaller code size
- faster execution due to reduced overhead
- often a smaller stack
- sometimes fewer syscalls

We should not start a "printf-flame-war". I just wanted to talk about some
experiences I've had (as have Fefe and some others).
IMHO printf() is actually a wonder of existence because it breaks with
Doug McIlroy's great recommendation. ;-)

[...]

Guessing the number of syscalls a libcall produces at runtime should not
be done in a "released package" but during development. Remember your links
and the article about the dumb programmer? :-) I don't know what printf()
is actually doing on c-lib abc or c-lib xyz. To get a little more
control there are libcalls like setvbuf(), which I have rarely found in
code using stdio out in the wild. Funny thing, isn't it? ;-)
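
A minimal sketch of the kind of control setvbuf() gives. The helper name
use_full_buffering is made up for illustration; only setvbuf() and _IOFBF
come from standard C. setvbuf() has to be called after the stream is opened
and before any other operation on it, and the buffer must stay valid until
the stream is closed.

```c
#include <stdio.h>

/*
 * Hypothetical helper: make `stream` fully buffered using a caller-owned
 * buffer of `size` bytes, so output is handed to the kernel in large
 * chunks instead of whenever libc's default policy decides.
 * Returns 0 on success, nonzero on failure (same as setvbuf()).
 */
int use_full_buffering(FILE *stream, char *buf, size_t size)
{
    return setvbuf(stream, buf, _IOFBF, size);
}
```

With a 4096-byte buffer, for example, many small fputs() or printf() calls
should be collected until the buffer fills or fflush() is called. How many
write() calls that really produces still depends on the libc, so it is
worth checking with strace.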

> And that's a "performance over simplicity" mis-optimization.  I only
> care how many system calls it is when it's a hot path, and the ascii
> FILE structure has a built-in buffer to batch that stuff up a bit.

Of course, taking printf as an example. But you should always keep it in
mind. The worst thing I've seen was a trace of rrd-update on a Debian
system: more than 300 syscalls due to dynamic loading and path searching.
After that came one or two fadvise/madvise calls and then the write()
itself. I wonder if this binary was ever measured. :-)

> Well, for one thing the above functions aren't in the standard C
> library, and the ones I'm using are.
[...]
> the return value for failure".  My largest divergence there is ditching
> getopt(), but that's because my new infrastructure does all the work for
> each command before it even runs, and getopt() required the command to
> have a switch statement.

This is what I was asking about: Which functions are provided by the
internal lib? Do you have any plans to use a special, license-compatible
library like libowfat (this is one I remember), or do you write your own
basic functions (as I have done for some years)?

> With your calls above, presumably you have a reason for not using
> "strcat()" instead.  Your buffer has a length but I don't know what it
> is.  Does it automatically flush itself or is there an implicit "now
> write all this out" that you didn't include, and if so is filling the
> buffer an error or an auto-flush that's going to do an implicit system
> call anyway?

Er... actually I don't know what fnord's buffer routines do. IMHO you
specify the size of the buffer (compare setvbuf) and you also get a
buffer_flush (compare fflush).
When writing something like cat you should use a buffer and libcalls that
manage it. When writing something like id you may use a small line buffer
within your main function. It should be measured.
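
To make the "small line buffer" idea concrete, here is a rough sketch. The
names struct linebuf, linebuf_put and linebuf_flush are invented for this
example and are not fnord's or libowfat's API. It assumes a fixed 256-byte
buffer and does not retry on EINTR.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical fixed-size output buffer; set len to 0 before first use. */
struct linebuf {
    char data[256];
    size_t len;
};

/* Write everything buffered so far to fd. Returns 0 on success, -1 on error. */
static int linebuf_flush(struct linebuf *lb, int fd)
{
    size_t off = 0;

    while (off < lb->len) {
        ssize_t n = write(fd, lb->data + off, lb->len - off);
        if (n <= 0)
            return -1;          /* simplified: no EINTR retry here */
        off += (size_t)n;
    }
    lb->len = 0;
    return 0;
}

/* Append string s, flushing to fd whenever the buffer is full. */
static int linebuf_put(struct linebuf *lb, int fd, const char *s)
{
    size_t n = strlen(s);

    while (n > 0) {
        size_t room = sizeof lb->data - lb->len;
        size_t chunk;

        if (room == 0) {
            if (linebuf_flush(lb, fd) < 0)
                return -1;
            room = sizeof lb->data;
        }
        chunk = n < room ? n : room;
        memcpy(lb->data + lb->len, s, chunk);
        lb->len += chunk;
        s += chunk;
        n -= chunk;
    }
    return 0;
}
```

A tool like id would call linebuf_put() for each piece of its output line
and linebuf_flush() once before exiting, so a short line ends up in a
single write().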

> What's handling write failures, both from the other end prematurely
> hanging up and from "sigstop" causing short writes, possibly zero
> length, which you have to check errno to distinguish the short write from
> EOF?  And if you don't get the short writes correct then suddenly "tar c
> blah | gzip | ssh" suddenly corrupts the tarball if you ctrl-z and then
> fg the pipeline...

I know how to handle these cases if necessary, but I don't
know how printf will react - even in glibc! Nice question, I should
investigate now. :-)
(BTW - that's one reason why I re-implemented some basic functions for
myself.)
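
For reference, the usual way to handle short writes and interrupted calls
looks roughly like this. The function name write_all is an assumption for
this sketch (it is not a libc call); a real tool would also decide what to
do about EPIPE and SIGPIPE when the reader hangs up.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: keep calling write() until all `count` bytes are
 * written. Restarts after EINTR and continues after short writes.
 * Returns count on success, -1 on error (errno set by write()).
 */
ssize_t write_all(int fd, const void *buf, size_t count)
{
    const char *p = buf;
    size_t left = count;

    while (left > 0) {
        ssize_t n = write(fd, p, left);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing: retry */
            return -1;          /* real error, e.g. EPIPE or EBADF */
        }
        p += n;                 /* short write: advance and go on */
        left -= (size_t)n;
    }
    return (ssize_t)count;
}
```

With a loop like this, a pipeline that gets stopped and continued should
not lose or duplicate data in the writer, because every byte is either
written or reported as an error.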

> have tried to document most of it.  I'm not sure what this "I don't like
> printf, let me open-code every occurrence of it" gets you.  Why not
> write a better printf?

That was one of the questions you may find between my lines. ;-)
Do you have one, or a license-compatible "foreign" implementation?
Is it worth doing?

> You mean like the xblah() wrappers so I avoid testing return codes, or
> the way xsmprintf() is doing mostly the same thing asprintf() is but
> that's one of the gnu/dammit extensions that I didn't want to rely on?

Yes, something like that. Another extension which I often use is stpcpy
(shame on me). This leads to the question of which interface standard
should be used. POSIX? 1996 or 2001? What about functions which are
"deprecated" (e.g. gethostbyname) or "rejected" (e.g. clearenv)?
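
As an illustration of why stpcpy is handy: it returns a pointer to the
terminating NUL, so copies can be chained without strlen() or strcat()
rescanning the string. The helper join_kv below is made up for this
example. stpcpy started as an extension and was standardized in
POSIX.1-2008, so the feature-test macro is requested explicitly.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* ask for POSIX.1-2008, which has stpcpy */
#endif
#include <string.h>

/*
 * Hypothetical helper: build "name=value" in dst. The caller must
 * guarantee that dst has room for both strings, '=' and the NUL.
 * Returns a pointer to the terminating NUL in dst.
 */
char *join_kv(char *dst, const char *name, const char *value)
{
    char *p = stpcpy(dst, name);    /* p points at the NUL after name */

    p = stpcpy(p, "=");
    return stpcpy(p, value);
}
```

The returned pointer minus dst is the length of the result, which is what
a following write() needs.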

> But really, "minimal number of syscalls" is speed to me, not simplicity.
> Hello world is simple code, if the libc it's linked against chooses to
> do a separate syscall for every character of output that's libc's
> problem. If performance sucks I'll look into it, but I expect some
> variation from libc to libc and if it sucks _fix_your_libc_ or use a
> better one.

OK. Some foreign lib bashing in source code comments is always nice to
read. ;-)

> I have a vague academic interest in MacOS X but I've already got mdev

It's the only OS I know where you can switch off ptrace! ;-)
Its BSD-API is sometimes frustrating.

> upshifted from int 80 to SYSENTER to save a couple clock cycles, and

Don't talk about Slowlaris. They NEED threads because only threads are
actually as lightweight there as processes are on Linux. But even with
sysenter, which is IMHO used on every Linux system today, you should still
keep in mind that syscalls are the most expensive calls.

> http://blog.tsunanet.net/2010/11/how-long-does-it-take-to-make-context.html
> 
> That has gettid() at 100 nanoseconds or less...

Wow! This is really low. But out in the wild you find crappy software
(IMHO more than 90%) which does hundreds of time() calls a second. And
sometimes when Nagios warns about rising context switches on a specific
host you find software with an strace you would never dream of - even if
you dream of Freddy Krueger all night long. ;-)

Rob, you're making me hungry. I should co/clone the current toybox stuff.
:-)

Frank

-- 
EDV Frank Bergmann                           Tel.     05221-9249753
LPIC-3 Linux Professional                    Fax      05221-9249754
Pödinghauser Str. 5                          email    iservice at tuxad.com
32051 Herford                                USt-IdNr DE237314606


