[Aboriginal] Aboriginal. Wow! and Thanks!

Rob Landley rob at landley.net
Sun Jun 26 10:55:01 PDT 2011


On 06/25/2011 07:12 PM, Paul Kramer wrote:
>> Every build should compile and run so as not to screw up "git
>> bisect", but calling halfway through a re-engineering of how
>> something works a release candidate is disingenuous.  Unless you
>> force yourself to check in a month's work as One Big Lump, which is
>> bad for different reasons.
> 
> Rob -- what do you mean by  build should compile and run as not to
> screw up 'git bisect'... i've still not fully spun up on git

I wrote up a thing about this once:

  http://landley.net/writing/git-bisect-howto.html

Git (and mercurial) let you binary search between a known good commit
and a known bad commit to find where the bug was introduced.  (The same
thing can be used to track down a fix if you need to backport it.)

Basically, you go:

  git bisect bad BADCOMMIT
  git bisect good GOODCOMMIT

It then checks out a revision, you test it, and then go either "git
bisect good" or "git bisect bad" depending on the test results.  Did
that version have the bug or not?  Eventually, instead of doing a
checkout, it'll spit out a commit description.
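
Spelled out as a whole session, it looks something like this (the commit
names are placeholders, and test.sh is a made-up script that exits 0 when
the bug is absent and nonzero when it's there):

  git bisect start
  git bisect bad BADCOMMIT       # a commit known to have the bug
  git bisect good GOODCOMMIT     # an older commit known to work
  # git checks out a commit roughly halfway between; build it, test it,
  # answer "git bisect good" or "git bisect bad", and repeat until it
  # names the first bad commit.  Or let git drive the loop itself:
  git bisect run ./test.sh
  git bisect reset               # put the tree back when you're done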

The problem comes in when you get to a commit that is broken for a
reason OTHER than the bug you're looking for.  If it doesn't compile,
you can't test it.  Bisecting to find a bug is easiest in an otherwise
clean tree.  During development when you merge lots of stuff, you can
get into an area where dozens of commits in a row don't work for an
unrelated reason.

You can of course stop what you're doing (git bisect log > tempfile),
track down the fix for that bug (bisect again with the labels swapped:
say "good" for commits that still exhibit that bug and "bad" for ones
where it's gone, so it converges on the commit that fixed it), save that
commit as a patch (git show > file.patch), back up to where you were
bisecting (git bisect replay tempfile), and then apply that patch by
hand to each version you test.
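
In command form the dance looks roughly like this (FIXCOMMIT is a
placeholder for whatever that second bisect turns up):

  git bisect log > tempfile         # save how far you'd gotten
  git bisect reset                  # abandon the current bisect
  # ...bisect separately, labels swapped, to find the commit that fixed
  # the OTHER bug, then grab that fix as a patch:
  git show FIXCOMMIT > file.patch
  git bisect replay tempfile        # resume the original bisect
  # at each step: apply file.patch, build, test, then throw the patch
  # away again (e.g. "git checkout .") before answering good or bad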

But it's a flaming pain and very unreliable.  You wind up doing a lot of
hand hacking.  Or you just say "git bisect skip", meaning "I can't test
this commit, gimme another one", but the git logic for picking the next
commit is deeply stupid: instead of chopping 1/3 off a random end of the
range so the next midpoint lands somewhere sane, it hands you an ADJACENT
commit, so you wind up looping for DOZENS OF COMMITS testing every commit
in the same broken range, which is a huge waste of time...
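
For what it's worth, skip can also take a range, which helps a bit when
you already know where the unrelated breakage starts and stops (commit
names are placeholders again):

  git bisect skip                    # can't test this one, gimme another
  git bisect skip BROKEN1..BROKEN2   # declare a whole range untestable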

Bisecting is easy when it works, and really painful when it doesn't, and
when it doesn't it's because you have overlapping bugs preventing you
from testing whether or not a given commit has the one you're interested
in.  I've stacked FOUR BUGS DEEP in kernel commits sometimes because
non-x86 platforms don't get tested the way x86 does.  Did I mention that
2.6.39 shipped with mips not _booting_ in my config?  The mips
maintainer was very nice about tracking it down, but you hit stuff like
"somebody broke the stat structure by swapping a 16 bit value for a 32
bit value and screwing up the offsets of all the fields after that,
meaning ls -l doesn't work on 32 bit sparc and hasn't for 4 releases and
NOBODY NOTICED.  Why did nobody notice?  Because they're all still using
the last Debian sparc stable release, which has a kernel that's THREE
YEARS OLD...  That was fun.

One of the points of aboriginal linux is to smoketest the basic package
functionality across multiple architectures, because the people doing
real hardware all wait for everybody ELSE to debug it before upgrading,
and they cluster anywhere from 1 year to _5 years_ back depending on
platform.  At work we're using 2.6.37 on arm, which is pretty darn
cutting edge for most production environments I've seen.  (Only 2
releases back!  Yes, our USB3 bug was fixed by trying a more current
kernel, and next week I gotta find and backport the fix, or something.)

>>> I've been lucky in that I've worked on teams that have used
>>> distributed source code control tools all thru the 90's... Suns
>>> Teamware/Codemgr. Larry McVoy wrote the original in perl during
>>> the early Solaris development 1991/92
>> 
>> Sun being used as a _good_ example of something.
> 
> 
> LOL... I worked on 2 outstanding releases at Sun... 1990 Sparcstation
> 2 on Sun OS 4.1.3 and 1995 OpenGL team... the rest were nothing to
> write home about...

I graduated from college in 1995.  That was the last time I physically
used a Sun workstation.  The server in the back room was Solaris, the
workstations in the lab were SunOS, and they weren't _entirely_
compatible...

> As with any large company, there was some great talent, and some
> not-so-great. I basically think the beginning of the end was in
> 1989... when they got rid of all field engineers and out-sourced, and
> went from being a pretty good integrator of a number of vendors
> technology and adding their stuff in, to being Sun-on-Sun ... during
> the 1991-1993 years... Lots of folks don't realize that if Cray does
> not go bankrupt and sell the 64-processor SparcServer to SGI, and SGI
> does not sell to Sun...  Sun does not have a million dollar machine
> to sell to a bunch of people all thru the late 1990's....

I wrote half a document on my understanding of "the sun civil war" once.
 Being forced out of the workstation market into a server-only niche
cost sun its identity and old sun went into huge denial about what it
was.  (Without workstations, what do the developers use?  The whole
solaris-x86 debacle: what do you THINK your developer workstations are
running, a $30k production machine or a $2k desktop?  You tried to
cancel the product line 3 times.  Brilliant.)

Along the way java happened, and that tail wagged the dog for quite a
while, with Java development sheltering the growth of some VERY
different software development philosophies with goals that conflicted
directly with the old sun.  The "We give Java away for free, open source
is great" crowd (openoffice, opensolaris, eventually gpl java itself),
the "Java runs on desktops, we need a position on the desktop" crowd
(buy staroffice, produce looking glass, and a HUGE amount of nostalgia
from their history as a workstation provider), the "most Java
deployments are windows, we must kiss the black widow" crowd (brought in
a $2 billion payment from Microsoft for shilling for SCO, which gave them
a lot of political power)...

And of course when Bill Joy went nuts and they swapped him out for a
spare co-founder (Andy Bechtolsheim or whatever his name was), and HE
brought in Opterons, the monolithic "Solaris is sparc, sparc is solaris"
block cracked right down the middle, and EVERYTHING became a political
football.  Looking glass is Linux based?  No, switch it to Solaris.
Open source solaris (waaaaaah!) under a license nobody else will ever
use so it can't share code: eh, GPLv3 counts there.  GPLv2 Java at the
copyright level, but sue over patents!  Open source openoffice but
require copyright assignment so nobody outside of Sun contributes any
code to it!

Honestly, Sun was doomed when it lost the high end workstation market
(to Windows NT of all things, but really to commodity PC hardware).  The
Pentium showed x86 wasn't going to be taken down by CISC/RISC
considerations; Sparc was dead and just didn't know it at that point.
And Linux was _obviously_ going to eat proprietary Unix circa 1993 (just
a matter of time).

Sparc and solaris stopped being _interesting_ in the early 90's.  Yeah,
the dot-com boom fed enough sugar and amphetamines into their system to
keep them upright despite several missing internal organs, and they
COULD have invented a whole new business given that time, but their old
one was doomed.  (They tried inventing a new business with java, but as
with so many dot-com things, costs going to zero and revenues going near
zero is great if your company is a team of six people working out of a
strip-mall, but if you have an existing 10k employees to support you're
screwed.)

Sorry, tangent. :)

> I think sun made real good hardware/software systems...

Real good 1992 hardware is junk by 1995.  Cheap plastic white-box PCs
running Linux ate their use cases; it doesn't matter whether you're talking
sparc, alpha, itanic...  Being a high-end niche product in the face of
Moore's Law is not a stable position.  (Apple is high-end due to
industrial design, not due to performance.  Their price premium comes
from the hardware and software being _pretty_ and fun to use, not due to
outperforming anything on any empirical metric.)

> but when
> mcnealy and crew thought they could become a pure software play... it
> was one failure after another. they lost focus. in the 80's...

You've read http://www.blinkenlights.com/classiccmp/javaorigin.html right?

Sun was doomed by cheap commodity PC hardware.  When the Pentium showed
CISC can translate to RISC internally, the _smart_ RISC players went
down to the embedded space to compete on power consumption to
performance ratio rather than price to performance or absolute best
performance.  They got out of x86's _way_, and let it continue upmarket
so they could attack its soft underbelly.

If AMD hadn't forced Intel to do the celeron, and if Clayton Christensen
hadn't explained the situation to Andy Grove and gotten him to SMACK
SOME SENSE INTO HIS ENTIRE COMPANY, as documented here:

  http://www.forbes.com/forbes/1999/0125/6302088a.html

Then Intel would have gone off a cliff and would probably be in Sun's
shoes right now, instead of repeatedly ratcheting downmarket (Pentium 4
was as much of a dead end as Itanic, but Pentium M led to Core 2 Duo.  They
bought StrongARM and then sold it again to Marvell because it wasn't x86,
but they're doing Atom instead, which is what my current netbook has).  For a
company like Intel to jump downmarket is like salmon swimming upstream
in big leaps, but Christensen drove home the necessity of doing so RIGHT
before it was too late for them to do it.

Sun gleefully ran upmarket and off a cliff, a classic failure of a
sustaining technology against a disruptive attack.  The lateral thrashing
was a dying thing nailed to a wall trying to find an escape, it just
took them a LONG TIME to die.

> they
> were at that time the fastest company at going from $0 to 1
> billion... and completely lost focus after that.

Get a copy of The Innovator's Dilemma by Clayton Christensen.  Marvelous
book.  (New York Times #1 bestseller in 1997.)

> But ... teamware/codemgr  was much much better than using rcs, sccs,
> cvs, perforce (although perforce is reasonable to chip development
> teams)

Never heard of teamware/codemgr, but the above comparisons are called
"damning with faint praise".

Linux kernel development kept an archive of tarballs with patch files
between 'em rather than deal with any of that, because doing it by hand
with tarballs was an honestly superior source control system to any of
that crap.

> we actually had distributed branches on hp-ux in 1988... but we'd
> still all share the tree old-school. when i arrived at Sun in 1990,
> they had had smerge.... clone and merge sccs trunks, but we still
> shared the tree old-school... it was not until Larry wrote smosh...
> wrapped it in perl scripts and called it NSE-Lite and then they
> re-wrote in C and shipped it as a product... teamware/codemgr... Sun
> tried to create a clearcase like cm system... called NSE... it failed
> miserably
> 
>> 
>> I need a moment...
>> 
>> I have two todo items:
>> 
>> 1) Make a cron job that builds all targets for every commit and
>> runs the smoketest scripts on all of 'em.
> 
> yep. i often question myself, and the engineering community on how
> when the web came along we as a community lost our sense of
> simplicity... it's like tools folks see every problem as a web
> application...

The people with a sense of simplicity wound up doing embedded
development.  "Gimme a small system and I'll make it sing.  My commodore
64 had 38911 basic bytes free and we made it work.  You keep wanting to
write a novel instead of a short story.  I grew up doing haiku, and this
calls for a drabble: luxury."

> basically... my motto is focus on the individual contributor and how
> to scale... but our industry does not go at it like that... they go
> at it from the project and release level...

You probably want to read those "three waves" articles I keep pointing
you at.  Everything I have to say on the topic of individual vs group
contributors is predicated on the material in those...

Rob


