[Toybox] PM: code style, was: Re: New Subscriber

Sat Feb 11 12:28:15 PST 2012

On 02/09/2012 02:06 AM, Frank Bergmann wrote:
> Hi.
> 
> On Wed, Feb 08, 2012 at 11:59:43AM -0600, Rob Landley wrote:
>> Somebody Else's Problem. I'd rather not have "libc's stdio sucks" on my
> 
> Just in case that you maybe misunderstood me: I'm trying to say that you
> don't use the full power of stdio. Maybe you could enable at least line
> buffering and read whole lines with scanf.

Line input is something I mean to revisit.

One of the problems is the terminator isn't always \n: when you do "find
. -print0 | xargs -0" the terminator is a NUL byte.  This is actually
extremely useful: it avoids all the horrible whitespace parsing issues
where filenames can have newlines in them.  (There are only two
characters filenames can't include: NUL and / (the directory separator),
anything else is fair game in Linux, and I think in posix.)
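
A minimal sketch of delimiter-agnostic record input, assuming POSIX 2008
getdelim() is available.  The name each_record() and its callback
signature are made up for illustration, not anything in toybox:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Call fn(rec, cookie) once per delimiter-terminated record read from fp.
 * delim is '\n' for ordinary lines or 0 for "find -print0" style input.
 * fn may be NULL to just count. Returns the number of records seen. */
long each_record(FILE *fp, int delim,
                 void (*fn)(const char *rec, void *cookie), void *cookie)
{
  char *buf = NULL;
  size_t cap = 0;
  ssize_t len;
  long count = 0;

  /* getdelim() grows buf as needed and returns -1 at EOF or on error. */
  while ((len = getdelim(&buf, &cap, delim, fp)) > 0) {
    /* Strip the trailing delimiter so fn sees a plain C string. */
    if (buf[len-1] == delim) buf[len-1] = 0;
    if (fn) fn(buf, cookie);
    count++;
  }
  free(buf);

  return count;
}
```

Passing 0 as delim handles NUL-terminated input, and passing '\n' gives
ordinary line input from the same loop.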

> Maybe it can substitue reading
> one char at a time. I don't know, I'll have to read the stdio docs again.
> I also don't know current implementations in C-libs.

The downside of reading one char at a time is it's slow. One of the
biggest issues in getting good performance is having reasonable
transaction sizes, so you amortize the per-transaction overhead. (What
counts as "reasonable" is a big performance tuning issue, but page size
is a good sweet spot processors are tuned for.)
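
As a rough illustration of amortizing the per-call overhead, here is a
sketch that counts newlines using page-sized read() calls rather than one
syscall per byte.  The function name is hypothetical and 4096 is an
assumed page size (sysconf(_SC_PAGESIZE) reports the real one):

```c
#include <unistd.h>
#include <errno.h>

/* Count '\n' bytes on fd, reading up to 4096 bytes per syscall.
 * Returns -1 on read error. */
long count_newlines(int fd)
{
  char buf[4096];
  long count = 0;
  ssize_t len, i;

  for (;;) {
    len = read(fd, buf, sizeof(buf));
    if (len < 0) {
      /* Retry if a signal interrupted the read, otherwise give up. */
      if (errno == EINTR) continue;
      return -1;
    }
    if (!len) break;
    for (i = 0; i < len; i++) if (buf[i] == '\n') count++;
  }

  return count;
}
```

The same loop with a one byte buffer gives the same answer with roughly
4096 times as many syscalls on large input.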

>> The actual cache lines faulted in are another matter, but if you make a
>> lot of syscalls at least the entry point tends to stay in L1, and beyond
> 
> Ah! This is the answer. I thought that at least the entry point would be
> seldom in l1 cache.

It's as hot as hot paths get, it tends to stick around unless you're
going out of your way to avoid it.

>> CPU cycle counter.  Run it on a quiescent system and see how many cycles
>> it took.  (The phrase to google for is "linux microbenchmark" or
>> something like that.)
> 
> I know.
> On all projects running on 32 bit Intel I have this line in the central .h
> file:
> #define rdtscl(low) __asm__ __volatile__ ("rdtsc" : "=a" (low) : : "edx")
> But you still have to do many test runs to get useful results on a system
> with many tasks.

There was a marvelous thread on linux-kernel a few years ago about
microbenchmarks _shifting_ between intel processor revisions. This was
before the 64 bit stuff came out, they were looking at Pentium, Pentium
Pro, PII, PIII, P4, and Athlon.  (And the P4 was a giant 'island of
suck' outlier in pretty much everything.)

My takeaway from it was "don't microoptimize". Keep things like cache
behavior and L1 vs L2 vs DRAM in mind on a general level, know what the
_options_ are and try not to trip over anything obvious.

Here's another one of those yin/yang cycle things: processors doing
speculative execution and register renaming and such to keep multiple
execution units busy are much happier with:

  x = 42;
  if (blah) x = 37;

Than with:

  x = blah ? 37 : 42;

Because _with_ multiple register profiles they can copy the register
profile, speculatively perform the assignment _this_ clock cycle in one
copy and leave the other alone, and then either save or discard the
results when they get the results back from the test.  This lets them do
more stuff in parallel, and thus go faster.

It also eats power performing computations they ain't gonna keep, so the
processors that are worried about the power to performance ratio instead
of the price to performance ratio (best bang for the watt instead of for
the dollar) get better performance by not wasting energy on work they
don't know if they need yet.

So if you optimize too much for x86-64, you may actually slow _down_ arm.

In reality, your compiler's optimizer will happily turn "x = blah ? a :
b" into "x = b; if (blah) x = a" if that's what's best for this
architecture.  I try not to get in its way if it feels like doing that.
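
To make the equivalence concrete, here are the two spellings as
functions (the names are made up for illustration).  They return the
same value for every input, so the optimizer is free to emit whichever
form suits the target; what it actually emits depends on the compiler,
flags, and architecture:

```c
/* Assign-then-overwrite form. */
int pick_if(int blah)
{
  int x = 42;

  if (blah) x = 37;

  return x;
}

/* Ternary form; same result for every input. */
int pick_ternary(int blah)
{
  return blah ? 37 : 42;
}
```

Comparing the output of "gcc -O2 -S" for both on a couple of targets is
one way to see what the compiler decided.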

>> You have to understand what the system is doing.  You also have to
>> realize that different hardware works in different ways.
> 
> Running my software on different embedded platforms and immediately
> getting errors like segfaults (e.g. on Linux/atmel) taught me a few
> things. ;-)

Ah, alignment issues.

This really should have been called "why cross compiling sucks":

http://landley.net/writing/docs/cross-compiling.html
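
For the alignment case specifically, a sketch of the usual workaround,
assuming the crash came from dereferencing a misaligned pointer (the
helper name load_u32() is hypothetical):

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value in host byte order from an arbitrary, possibly
 * odd, address. Casting p to uint32_t * and dereferencing it can trap
 * on CPUs that require aligned loads; memcpy() has no such requirement. */
uint32_t load_u32(const void *p)
{
  uint32_t val;

  memcpy(&val, p, sizeof(val));

  return val;
}
```

Compilers generally turn the fixed-size memcpy() into a plain load on
hardware that tolerates unaligned access.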

>> The horror is indescribable.  But I tried in my blog...
> 
> I already thought of you as a fan of horror movies after the last post. ;-)

Not really.  I get enough and feel no need to see it _out_.

I _am_ a fan of movies like "aliens" and "terminator II" though, which
support my theory that the main difference between an action movie and a
horror movie is how heavily armed the protagonist is.

>> I hate that error. There is nothing WRONG with type-punning a pointer,
> 
> One of the things programmers like is to satisfy compilers. ;-)

Nah, I occasionally humor them at best.

But if the compiler is being stupid, the compiler is being stupid.

>> my life.  (I don't care about the performance change, IT'S VALID C!)
> 
> Are you sure? For K&R I'm sure that it is but IMHO c99 "requires" the
> usage of void in this case (using char** as char*).

Can you point me to where in the spec?

http://landley.net/c99-draft.html

I note that the latest variant is available here:

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf

Apparently they're still plinking away at it.  But A) if they changed
anything my compiler won't do it (I started caring about c99 around 2005,
once compiler vendors had had a chance to catch up), and B) it's a PDF,
which isn't very searchable.

>>>> command line stuff.  It's only a win if you never have to do it more
>>>> than once.
>>>
>>> That's why I often used small internal output buffers and the nasty
>>> stpcpy.
>>
>> What's nasty?  It's in POSIX 2008.
> 
> You're not the only one who was surprised because of that.
> ("nasty" because I thought it was a GNU extension of the lib with msdos in
> history.)

And bzero() is obsolete but does what I need.  I have a portability.h
header that may turn into a portability.c file that implements things in
various environments that don't have them.
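
A sketch of the kind of fallback such a file could hold, here for
stpcpy().  The name compat_stpcpy() is hypothetical, not what toybox's
portability code actually contains:

```c
#include <string.h>

/* Fallback with stpcpy() semantics: copy src (including its NUL) to
 * dest and return a pointer to the NUL just written in dest. */
char *compat_stpcpy(char *dest, const char *src)
{
  size_t len = strlen(src);

  memcpy(dest, src, len+1);

  return dest+len;
}
```

Returning the end pointer is what makes it handy for small output
buffers: each call picks up where the last one stopped, without
rescanning the string the way repeated strcat() does.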

>> The internal implementation of that syscall is disgusting, because it
>> has to examine and potentially modify the state of every process on the
>> system.
> 
> Sure but in i.e. initrd it means mostly two or three processes.

The vanilla containers control package lxc uses pivot_root() to do its
chroot because the chroot() syscall is flawed:

  http://yarchive.net/comp/linux/pivot_root.html

Which is where:

  http://landley.net/notes-2011.html#02-06-2011

Came from.  (Yes, Linus Torvalds personally had to explain how the guts
of chroot() worked when I was writing switch_root for busybox, because
apparently nobody else _knew_.)

> Of course
> it is disgusting and only useful in rare circumstances but we have a
> syscall and we are writing userspace software. :-)

Or if you're working around flaws in chroot() and don't feel like
properly _fixing_ it.  (Which is still on my todo list...)

>>   http://landley.net/notes-2011.html#02-06-2011
> 
> http://www.tuxad.com/ngtx/ngtx-current/tools/breakout.asm
> :-)
> (Of course you need the capability to break out.)
> 
>> Note that adjusting the process-local mount tree wasn't possible until
>> A) there was a process-local mount tree, B) --bind mounts had been
>> invented so you can split a mount point.
> 
> *shudder* some weeks ago I had to deal with some big servers with dozens
> of containers and even more bindmounts and kernel of the early 2.6.2x
> series with no full bindmount support.
> 
> But what I don't understand right now: Are you fighting against
> pivot_root? I just mentioned it as a starting point for me.

I'm not fighting against it, I'm saying I want to _fix_ chroot now that
we have the infrastructure for it to really do what it seems like it
would do.

>> I have plans for that one and would like to do it myself.
> 
> unmount is left. ;-) I'll peek into the list again.

I need to update http://elinux.org/Busybox_replacement but what I
_really_ need to do is take the big master todo list at
http://elinux.org/Busybox_replacement#Command_List and break it up the
way I did http://landley.net/toybox/todos/susv4.txt into "low hanging
fruit", "medium hanging fruit", and "fiddly" commands.

With more explanation of _why_ each medium/fiddly command isn't
low-hanging.  (Also, the low/medium ones are what it would take _me_ to
do them: mount and sed are easier for me because I already implemented
those before.  Each one took about three months to figure out what I was
doing...)

>> Note that the common thing about all three of those?  Available free on
>> the web.  If it's not available free on the web, IT ISN'T A STANDARD.
> 
> ... and it is expensive. ;-)
> 
>> Nope, it makes sense.  The implementation is trivial:
> 
> Of course. Before I "detected" it I'd written my own function. But like
> you I searched for a "standard function".

Eh, when I first learned C I wrote my own strlcpy() because I needed it.
This was on Turbo C for DOS, _years_ before I'd heard of BSD.  (Whether
the "l" stood for "length" or "landley", I never had to come down on one
side or the other of...)
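
For reference, a sketch of what a strlcpy() with the BSD semantics can
look like.  This is not the Turbo C version mentioned above, and the
name my_strlcpy() is made up to avoid clashing with a libc that already
provides one:

```c
#include <string.h>

/* Copy src into dest (size bytes total), always NUL terminating when
 * size is nonzero. Returns strlen(src), so a result >= size means the
 * copy was truncated. */
size_t my_strlcpy(char *dest, const char *src, size_t size)
{
  size_t len = strlen(src);

  if (size) {
    /* Copy what fits, leaving one byte for the terminator. */
    size_t n = len < size ? len : size-1;

    memcpy(dest, src, n);
    dest[n] = 0;
  }

  return len;
}
```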

>>   http://landley.net/history/mirror
> 
> Bookmarked. I'm glad to see that you did mention "Space Travel" (not Space
> Wars like Linus wrote). :-)

He was conflating "sky" and the PDP-1 thing Slug Russell wrote.  Both of
which are important in their own way...

>> Fabrice Bellard (the creator of tinycc and qemu) wrote i386 emulator,
> 
> He is already carved in stone in the history of computers. :-)
> 
>> You said that printf() violated an early unix maxim, but there's another
>> one: "When in doubt, use brute force".  Implement, _then_ optimize.
> 
> It was MAYBE breaking a rule by Doug McIlroy! ;-)
> 
> BTW - if you're interested in computer history: Do you have a (valid)
> source for this Kernighan quote? I did not find it in the Bell Labs
> documents.

That's because it was Ken Thompson, not Kernighan.

http://catb.org/~esr/writings/taoup/html/ch01s06.html#id2877917

(It was also mentioned in The Unix Philosophy by Mike Gancarz, but I was
there when Eric Raymond _opened_ the mail package containing the draft
manuscript of The Art of Unix Programming which Ken Thompson had
handwritten his review notes on. We didn't have incense or anything, but
it was definitely treated as a holy relic...)

That bit about "One of my most productive days was throwing away 1000
lines of code." was Ken's hand-written citation, in red pen, as a margin
comment to the section that now quotes it.

Rob
