[Toybox] PM: code style, was: Re: New Subscriber
Frank Bergmann
toybox at tuxad.com
Wed Feb 8 00:00:10 PST 2012
Hi.
On Tue, Feb 07, 2012 at 07:26:22PM -0600, Rob Landley wrote:
> But in this case, I plan to continue using the printf() family because
> doing without it wouldn't actually simplify the code, I'd just wind up
> writing something just as bad.
In this case you could take more control over how printf buffers its
output. I cloned toybox and busybox and ran one command on each to compare:
$ grep -r setvbuf busybox|wc -l
9
$ grep -r setvbuf toybox-cloned-1328632507|wc -l
0
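A minimal sketch of what setvbuf offers here, not code from either project: switching a stream to full buffering lets stdio collect many printf-style calls and pass them on in larger chunks. The helper names use_full_buffering and print_lines are made up for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: make `stream` fully buffered using the caller's
 * buffer, so output is collected until the buffer fills or the stream is
 * flushed. Call it before any other operation on the stream, and keep
 * `buf` alive until the stream is closed.
 * Returns 0 on success, nonzero on failure (same as setvbuf). */
int use_full_buffering(FILE *stream, char *buf, size_t size)
{
    assert(stream != NULL && buf != NULL && size > 0);
    return setvbuf(stream, buf, _IOFBF, size);
}

/* Hypothetical helper: print `count` short lines, then flush once.
 * Returns the number of lines printed, or -1 on error. */
int print_lines(FILE *stream, int count)
{
    int i;

    for (i = 0; i < count; i++)
        if (fprintf(stream, "line %d\n", i) < 0) return -1;
    return fflush(stream) == 0 ? count : -1;
}
```

Running a program that does this under strace, with and without the setvbuf call, is the easy way to see whether it changes the number of write() calls on a given libc.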
> C and string handling are not a good mix. String handling is the thing
> C is weakest at, because string handling actually turns out to be a hard
> problem to get right at the hardware level.
We all know that C actually doesn't know anything about "strings". ;-)
When writing "bigger" software it *may* be worth implementing strings as a
class in C (er... struct) like Wietse Venema does. But that means
rewriting all the code.
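To illustrate the "string as a struct" idea, here is a rough sketch of a length-tracked string. It is not Venema's code and not modeled on any real library's API; struct str, str_append and str_free are names invented for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical length-tracked string. */
struct str {
    char *data;   /* NUL-terminated when non-NULL */
    size_t len;   /* bytes used, excluding the terminator */
    size_t cap;   /* bytes allocated */
};

/* Append `s` to `dst`, growing the allocation as needed.
 * Returns 0 on success, -1 if memory runs out (dst is left unchanged). */
int str_append(struct str *dst, const char *s)
{
    size_t add, need;

    assert(dst != NULL && s != NULL);
    add = strlen(s);
    need = dst->len + add + 1;
    if (need > dst->cap) {
        size_t cap = dst->cap ? dst->cap : 16;
        char *p;

        while (cap < need) cap *= 2;
        p = realloc(dst->data, cap);
        if (!p) return -1;
        dst->data = p;
        dst->cap = cap;
    }
    memcpy(dst->data + dst->len, s, add + 1);   /* copies the NUL too */
    dst->len += add;
    return 0;
}

/* Release the storage and reset the struct to its empty state. */
void str_free(struct str *s)
{
    assert(s != NULL);
    free(s->data);
    s->data = NULL;
    s->len = s->cap = 0;
}
```

A zero-initialized struct str is a valid empty string, so callers start with `struct str s = {0};`. The catch is the one named above: every function that used to take a char * has to be rewritten to take the struct.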
> Yeah, but after 40 years of being grandfathered in, it's still useful
> enough to stick around.
Yeah, but 99% of the software out in the wild breaks the basic rules of
KISS and YAGNI and thereby introduces (sometimes many) bugs and holes.
> I run things under strace rather a lot, but even if it boiled down to
> doing a for() loop around write(1, &char, 1) I'd tell you to fix your
> libc rather than change what I was doing.
OK, this self-written printf-implementation you are stracing is not very
well optimized. ;-)
> I'm not that interested in micromanaging something that most likely
> stays in L1 cache either way.
Hmmm... if your code fits in the L1 cache right before doing a sysenter,
then the cache will be dirty once sysenter is called, won't it?
I never measured applications touching this "problem", mostly for
these reasons:
- it is hard to measure L1 cache effects on an OS running many tasks on
few cores
- loops that already fit in the L1 cache and did not call code "outside"
ran too fast to measure
- reducing the number of syscalls brought the biggest speedup; other
changes were only ambiguously measurable
- too many syscalls that can't be reduced make the advantage of the cache
vanish (mostly showing up as big i/o waits in top on the core the
application runs on)
- in some cases further optimizations didn't make any sense (due to
e.g. network latencies)
As you wrote: You'll have to measure it (all). Until then you must keep
caches in mind.
> That's the worst you've seen?
Yes, I didn't expect it from rrdtool. But after stracing it I understand
why it actually runs slowly even on big hosts with many
rrd-updates, even though fadvise/madvise should catch these cases.
> Never run strace on gcc. Certainly not
gcc is one of the tools I never *wanted* to strace because I already
expected a nightmare (other tools are e.g. php). ;-)
> See lib/lib.h
Clone done. Should patches be submitted to the list?
My first make threw the nasty "dereferencing type-punned pointer will
break strict-aliasing rules" warning. In sort.c you use TT.lines as char*
and not char**.
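For readers who haven't met that warning: a generic sketch of the pattern, not the actual sort.c code (which is not reproduced here). The struct and its field names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical globals block; the field names are made up. */
struct sort_globals {
    char *raw;      /* declared char *, but the code stores a char ** in it */
    char **lines;   /* declared with the type it is actually used as */
};

/* The pattern that typically draws the warning looks like
 *     (*(char ***)&g->raw)[i]
 * which accesses a char * object through an lvalue of type char **. */

/* Converting the stored value (not the address of the field) avoids the
 * type-punned dereference. */
char *line_from_raw(const struct sort_globals *g, size_t i)
{
    assert(g != NULL);
    return ((char **)g->raw)[i];
}

/* With the field declared as char ** no cast is needed at all. */
char *line_from_typed(const struct sort_globals *g, size_t i)
{
    assert(g != NULL);
    return g->lines[i];
}
```

Declaring the field with the type it is used as is the simpler fix whenever the declaration can be changed.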
> command line stuff. It's only a win if you never have to do it more
> than once.
That's why I often used small internal output buffers and the nasty
stpcpy.
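A small sketch of that approach, assuming a libc that provides stpcpy (it is in POSIX 2008). The helper name format_pair and the "name: value" layout are invented for this example.

```c
#define _POSIX_C_SOURCE 200809L   /* for stpcpy() */
#include <assert.h>
#include <string.h>

/* Hypothetical helper: write "name: value\n" into `out` (capacity `cap`).
 * Returns the number of bytes written (excluding the NUL), or 0 if the
 * result would not fit. */
size_t format_pair(char *out, size_t cap, const char *name, const char *value)
{
    char *p;

    assert(out != NULL && name != NULL && value != NULL);
    /* 4 = ": " + "\n" + terminating NUL */
    if (strlen(name) + strlen(value) + 4 > cap) return 0;

    /* stpcpy returns a pointer to the NUL it wrote, so each call
     * continues where the previous one stopped without rescanning. */
    p = stpcpy(out, name);
    p = stpcpy(p, ": ");
    p = stpcpy(p, value);
    p = stpcpy(p, "\n");
    return (size_t)(p - out);
}
```

The returned length can be handed straight to a single write(), which is the point: one syscall per line instead of one per fragment.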
> Minimal system bootstrapping is theoretically four things:
BTW - I've read that pivot_root doesn't have a high priority on your
TODO list. It's very easy to implement because Linux offers a syscall. Older
glibc didn't offer a wrapper, but this is also easy to check for (and to
implement if necessary). I'd like to write it as my first toy code if no one
else is working on it.
The next thing could be mount, even though it's not that easy.
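The pivot_root part really is short. A minimal sketch for Linux, assuming the C library provides no wrapper so the call goes through syscall(); the function name my_pivot_root is made up and this is not toybox code.

```c
#define _GNU_SOURCE   /* for syscall() */
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thin wrapper around the Linux pivot_root system call.
 * Returns 0 on success, -1 with errno set on failure. */
int my_pivot_root(const char *new_root, const char *put_old)
{
    assert(new_root != NULL && put_old != NULL);
    return (int)syscall(SYS_pivot_root, new_root, put_old);
}
```

A command built on this would take the two paths from argv and report errno on failure; the kernel does the validation of the arguments.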
> Or you could do toys/cat.c using the global toybuf[4096] which is part
> of the bss and only gets its page faulted in if it's actually dirtied.
> (Modulo alignment considerations I haven't bothered about.)
I've read your docs, but now I've also made the clone and read some code. :-)
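For anyone following along, the idea in the quoted paragraph can be sketched roughly like this. It is not the toybox implementation; scratch stands in for toybuf and copy_fd is a made-up name.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Stand-in for a shared scratch buffer such as the toybuf[4096] mentioned
 * above. As an uninitialized global it lives in the bss. */
static char scratch[4096];

/* Copy everything from in_fd to out_fd through the shared buffer.
 * Returns the total bytes copied, or -1 on a read/write error. */
long copy_fd(int in_fd, int out_fd)
{
    long total = 0;
    ssize_t got;

    assert(in_fd >= 0 && out_fd >= 0);
    while ((got = read(in_fd, scratch, sizeof scratch)) != 0) {
        ssize_t done = 0;

        if (got < 0) {
            if (errno == EINTR) continue;   /* interrupted, try again */
            return -1;
        }
        while (done < got) {   /* write() may accept fewer bytes than asked */
            ssize_t put = write(out_fd, scratch + done, (size_t)(got - done));

            if (put < 0) {
                if (errno == EINTR) continue;
                return -1;
            }
            done += put;
        }
        total += got;
    }
    return total;
}
```

A cat built this way needs no malloc and no large stack buffer, and commands that never touch the buffer never dirty its page.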
> As I think I said in design.html, I'm relying on c99, posix-2008, and
> LP64. (If I wasn't clear enough there tell me and I'll go fix it.)
No, it is clear. It was just a big bunch of docs. I don't know POSIX
2008 yet, only 2001. I assume it marks much more as "deprecated".
> the first commands I wrote for toybox. (Actually I started it for
> busybox but left that project before it was finished, so never submitted
> it.)
Maybe they will backport some day. ;-)
> Never heard of it. I've got a strlcpy() but everybody does since
> strncpy() isn't guaranteed to null terminate the output.)
Better to forget this nasty thing. The man page says that it is a
GNU extension and that it may go back to the old msdos times...
> Huh, apparently it's not an extension, it's in SUSv4:
> http://pubs.opengroup.org/onlinepubs/9699919799/functions/stpcpy.html
Er... maybe this is an April Fools' Day joke.
> That's really cool. Thanks. I wonder where that's needed in lib/* or
Don't forget the name "msdos" in its history. ;-)
> Did you read http://landley.net/toybox/design.html yet? Do I need to
> fluff that out a bit?
Sorry, it was a bunch of docs. First I read the pages that were directly
linked, then the URLs you posted.
> all the same darn thing just like ANSI/ISO C is the same standard
> approved by two standards bodies...)
Yes, we should be glad there is not a DIN standard yet. ;-)
> Trust me: I know how to profile stuff, and how to understand the
I do. Before I read your opinion about dumb programmers I had already
tried to think that way.
My experience comes mainly from writing some monitoring tools, which can
sometimes cause i/o wait, and from writing a fast fgrep, where I measured
that the size of the buffer is a great killer if you want to
speed it up. :-)
> Which is why they changed it so gettimeofday() can just read an atomic
> variable out of the vsyscall page:
Yes, I know this. But even though gettimeofday is not a "young" call, there
are many, many projects which haven't noticed this.
I still use it in some of my tools, but only once, not many times, and
certainly not many times a second.
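One way to keep it to a single call is to fetch the timestamp on first use and hand out the stored value afterwards. A sketch only; get_start_time and its static cache are invented for this example, and it is not thread-safe.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical cache: the timestamp is fetched on first use only. */
static struct timeval start_time;
static int start_time_valid;

/* Return the cached timestamp, calling gettimeofday() only until one
 * call has succeeded; later calls reuse the stored value.
 * Returns NULL if the time could not be read. */
const struct timeval *get_start_time(void)
{
    if (!start_time_valid) {
        if (gettimeofday(&start_time, NULL) != 0) return NULL;
        start_time_valid = 1;
    }
    return &start_time;
}
```

This only fits tools that need "the time this run started"; anything that needs the current time on each use has to pay for the call.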
> I'm totally aware that most existing userspace software is crap:
>
> http://lwn.net/Articles/192214/
Bookmarked. And - yes! - stat calls are the next "evil" calls, which are
made far too often. As with times, this is also a problem in the tool
mentioned above (not gcc but the other one ;-) ).
Frank
--
EDV Frank Bergmann Tel. 05221-9249753
LPIC-3 Linux Professional Fax 05221-9249754
Pödinghauser Str. 5 email iservice at tuxad.com
32051 Herford USt-IdNr DE237314606