[Toybox] project goals (was Re: sed -e '$a\')

Rob Landley rob at landley.net
Fri Apr 15 16:12:34 PDT 2016


On 04/15/2016 04:26 AM, Andy Chu wrote:
> (answering these old mails in chronological order... I was working
> heads down on the shell for 2 weeks, hence the silence)

I need to do that. I have open reply windows from that far back just on
this machine (which was reinstalled recently).

> On Sat, Mar 26, 2016 at 1:00 AM, Rob Landley <rob at landley.net> wrote:
>>
>>
>> On 03/26/2016 01:36 AM, Andy Chu wrote:
>>>> The problem to solve here is that Debian is currently broken when using
>>>> toybox. That's a real problem that needs to be fixed, but there's more
>>>> than one way to fix it. Debian introduced this bug within the past few
>>>> years and maybe we can convince them to fix it on their end, that's one
>>>> possible fix.
>>>
>>> What's the goal of toybox with respect to Debian?  I looked through
>>> the design and roadmap pages and didn't see much about Debian
>>> specifically.
>>
>> I take it you haven't read http://landley.net/aboriginal/about.html
>> (specifically http://landley.net/aboriginal/about.html#hairball ).
> 
> I think there is a good point here, to have a universal foundation and
> root of trust for different distros, although I wonder what work has
> been done toward this goal?

In what way?

When I started poking at busybox 15 years ago, it did not hold weight. I
eventually rewrote about 1/3 of it from scratch, and it took a lot of
poking even after I left until Aboriginal Linux 1.0 did what it set out
to do: rebuild itself under itself, and build linux from scratch under
the result the same way on a half-dozen architectures. That provides a
known working reference environment, if a bit stale in places (toolchain
especially).

Then I started over with toybox, and I'm still working on that. It's
more than halfway done, not sure how much more. (It's not just commands,
it's infrastructure.) I admit my $DAYJOB the past year and a half has
been a bit more interesting than the previous few, but I'm still
grinding away on this as fast as I can.

I documented "next steps" on the aboriginal about page. Untangling
hairball builds and bootstrapping distros are more or less the same
thing seen from two perspectives, but that whole can of worms comes
_after_ I replace busybox 100% with toybox in the existing build.

But since Android is already using toybox, there's an existing userbase
that pulls in a different direction than that, and there's also "if you
don't do $X now, we'll install $Y which is much more difficult to
displace later". (It's not a _requirement_ that android use toybox
implementations of stuff, for example I wasn't fighting their shell
decision, but it would be nice.)

And on top of that there's $DAYJOB which needs nommu versions of stuff
to replace old uclinux versions, plus the general j-core todo items ala
"resurrecting the superh architecture" on lwn.net (or our newer
http://j-core.org/ELC-2016.pdf talk which should have an associated
video whenever the linux foundation gets around to posting them.)

So I'm balancing multiple competing demands here. Hooking up with
https://lwn.net/Articles/630074/ and getting them up to speed on what
I've already done with aboriginal would be high priority... if I didn't
have a half-dozen higher priority todo items right now. :)

(Alas, these days qcc doesn't even make the list, and way back when I
put a LOT of work into http://landley.net/notes-2007.html#28-12-2007 and
now it's just not on my radar, although there were reasons for that.)

> I also wonder if the Debian maintainers would be on board with it.  It
> seems like it would increase the maintainers workload to test with
> both coreutils/etc. and toybox.  It's a pretty big moving target,
> because they update their packages with every distro release.

I'm interested in getting out and pushing on that at some point, but
it's a post-toybox-1.0 goal.

> I'm quite familiar with debootstrap, which is the big shell script
> used to bootstrap a Debian/Ubuntu system.

From prebuilt binaries, with hardwired assumptions about what toolchain
and libc it's using. :(

Debian's in a lot better shape than most distros I've looked at, but its
minimal debootstrap environment is still rather large. Back when I wrote
http://landley.net/lxc/03-debian.html I triaged what the package list
was, but that was a while ago and the answer was "too big". (Hmmm, it
wasn't http://landley.net/notes-2011.html#17-05-2011 so I'll have to dig
to find it. Or just do it again...)

> You can use it to create
> just the runtime environment, or also the build environment, which can
> in turn rebuild the set of required or buildd packages.

According to http://landley.net/notes-2013.html#20-04-2013 I need
"debuild" to build from source. I wonder what that was? (3 years ago on
a different computer...)

> It's like a big 4000 line shell script, and it doesn't seem like it is
> very highly maintained (it's extremely inefficient, for one thing).
> I guess the best thing you can say about it is that it does its job.

Have you tried qemu-debootstrap? Fun for the whole family:

  https://wiki.debian.org/ArmHardFloatChroot

Last year I bootstrapped various supported debian architectures in qemu
system images. I got like 4 of them working, but several more just
weren't properly set up for qemu (assuming bootloaders and such,
requiring much hackery to get around), and it sank down the todo list...

> (FWIW I was building scientific computing programs in R with
> containers from the ground up, with debootstrap, installing build
> dependencies of R like Fortran, building R, installing R packages and
> their runtime dependencies, and then installing R scripts on top... it
> was quite involved and I have several thousand lines of shell scripts
> that start with calling debootstrap to do this.)

The problem is the base dev environment is still something like 50
packages, many of which are... dubious. Or it was last time I poked at
it, which was a year ago.

>> My original goal when I started all this nonsense circa 2002 or so was
>> Knoppix. Linux From Scratch was over 100 megs but tomsrtbt was 1.7 megs,
>> and I thought if I could save Knoppix 100 megs of space on its CD they
>> could do WONDERS with it. The goalposts have shifted a bit over the
>> years, but oddly enough knoppix was debian based too. :)
> 
> I like the Aboriginal goal of being the simplest linux system capable
> of rebuilding itself.  That is a very crisp goal, and it has some nice
> applications in terms of provenance and security.

It's got a mailing list. :)

> My questions were more about the toybox goals.  The Debian goal seems
> fuzzy, and a moving target.  Maybe there's something I'm not
> understanding.

Poking at debian is post-1.0. Lotta bridges under the water between
there and here, plans tend to change so I'm not worrying about much
detail yet.

For toybox 1.0, I have a roadmap.html with 3 main overlapping goals:

1) Satisfy "the standards" with a checklist of commands/features from
posix and LSB and staring at man pages. (Which is WAY more grey than I
like because posix is so amazingly primitive it still thinks sccs is a
pretty neat idea, and the others give ad-hockery a bad name.)

2) Replace busybox, bash, and probably make in aboriginal linux. (This
goal is concrete, but

3) Whatever the Android guys need. A literal billion existing users are,
you know, a thing.

If you ask the priority of those things, it's juggling. Which ball is
coming down now?

Then after toybox gets to 1.0, the distro that's most interesting to
bootstrap under the resulting environment is AOSP. (And I know that'll
need at _least_ a read-only git downloader.) Debian is probably the
_EASIEST_ one to bootstrap on arbitrary new targets. (Alpine doesn't
support a lot of architectures.)

(With ninja replacing make, maybe I can bump make to the
post-1.0 list, dunno... have to think about it when I get there.)

> Also, I would consider stuff like "factor" and probably "rev" to be
> cruft in toybox ... is there any evidence that people use that stuff?

factor was because I read
http://www.muppetlabs.com/~breadbox/txt/rsa.html on a long bus ride
(that's from the guy who did
http://www.muppetlabs.com/~breadbox/software/tiny/teensy.html) and it
used "factor" and I went "where did they get that... it's in coreutils?"
and way back in college I'd written a program to calculate prime numbers
to see how fast the Sun workstations were, and I went "I could probably
write that much better now in fewer lines" and I had a half hour with
nothing better to do, and the result was small enough I checked it in.

Yeah, it's questionable, I admit. But it's small and it's in coreutils.
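
For anyone who hasn't met the command: the job is small enough to sketch
in a few lines. What follows is a minimal trial-division sketch in C, not
the code that is checked into toybox. The function name prime_factors and
its fixed 64-slot output array are assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not toybox's factor.c): write the prime factors of n,
 * smallest first and with repeats, into out[]; return how many were written.
 * 64 slots always suffice, since a 64-bit value has at most 63 prime factors. */
size_t prime_factors(uint64_t n, uint64_t out[64])
{
  size_t count = 0;
  uint64_t d;

  assert(out != NULL);
  if (n < 2) return 0;  /* 0 and 1 have no prime factors */

  /* Strip out every factor of 2 so the loop below can step by 2. */
  while (!(n & 1)) {
    out[count++] = 2;
    n >>= 1;
  }

  /* Try odd divisors up to sqrt(n); "d <= n / d" avoids overflowing d * d. */
  for (d = 3; d <= n / d; d += 2) {
    while (n % d == 0) {
      out[count++] = d;
      n /= d;
    }
  }

  /* Whatever is left above 1 is itself prime. */
  if (n > 1) out[count++] = n;

  return count;
}
```

Calling prime_factors(12, buf) fills buf with 2, 2, 3 and returns 3. A
command-line wrapper would just parse each argument and print the array.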

Meanwhile rev is in util-linux and busybox, and Elie De Brauwer sent in
an implementation so I merged it.
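
The core of rev is similarly tiny: reverse the characters of each input
line. Here is a rough sketch in C, assuming a hypothetical helper named
reverse_line that works on one NUL-terminated line at a time. It is not
the merged implementation, and it swaps bytes, so it ignores multibyte
characters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not toybox's rev.c): reverse one line in place,
 * leaving a trailing newline (if any) where it is. Bytes are swapped, so
 * multibyte UTF-8 sequences are not kept intact. */
void reverse_line(char *line)
{
  size_t len, i;

  assert(line != NULL);
  len = strlen(line);

  /* Keep the newline at the end instead of moving it to the front. */
  if (len && line[len - 1] == '\n') len--;

  /* Swap bytes from both ends, walking toward the middle. */
  for (i = 0; i < len / 2; i++) {
    char tmp = line[i];

    line[i] = line[len - 1 - i];
    line[len - 1 - i] = tmp;
  }
}
```

Feeding it each line read from the input and writing the result back out
gives the basic behavior.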

A couple years ago I bought one of those laminated "how to linux" sheets
from the rack near the registers at Fry's to see what THEIR command list
was (if you have about 6 pages to teach people basic Linux, what do you
include?) and lo and behold, they had rev on there. Not sure why. Maybe
because people love including it in tutorials:

http://www.thegeekstuff.com/2009/10/file-manipulation-examples-using-tac-rev-paste-and-join-unix-commands/
https://www.youtube.com/watch?v=kRsJ4ovQncM

> I mean sure you can compile it out, but it just seems like a
> distraction.

There's a lot of judgement calls in here. If you look at the "requests"
section at the end of the roadmap, people need the weirdest things. And
these were the ones where I didn't immediately say "no, that's
definitely out of scope for the project", so that's already a somewhat
filtered list. :)

> In other words, Aboriginal has a crisp goal, but toybox seems to have
> 10 different goals.  The goals here are not very crisp -- it seems
> like a bunch of research on similar prior systems:
> 
> http://landley.net/toybox/roadmap.html

Aboriginal is intentionally minimalist with hard lines: how small/simple
can I get a system that rebuilds itself under itself from source code.
The POINT of the project is to get an inventory of what a Linux build
system actually needs to function.

And even there, "less" and "top" and "vi" don't strictly need to be
included (build system vs development environment). And you could get by
without distcc (just an order of magnitude slower since qemu can't take
advantage of SMP on the host, and making it do so gets into
sub-instruction level interrupt granularity issues that turn out to
really suck in a dynamic translation context).

Toybox has a slightly squishier goal, because "what commands should be
in the standard linux command line" is not an easily answered question.
You'd think posix would be some help but they don't even have "init" or
"mount", and they DO have sccs, uux, and qdel.

I mostly err on the side of not including stuff: even busybox has about
100 more commands than we expect to implement (363 vs 173 current
defconfig, with ~60 files in pending and maybe another 30 beyond that in
the roadmap), and Elliott hasn't included everything we do have in his
config for android.

But the line is "what doesn't belong in toybox", not "what doesn't
belong on a linux system". And that, alas, makes it a tautology.

What I haven't done is make "defconfig" mean something other than "These
are the commands that work out of the box on most environments". The
allyesconfig build enables things like CONFIG_DEBUG you may not want.
Beyond that, I'm not making judgement calls about who needs what
commands. You can do that yourself with menuconfig. If I bothered to
implement and test it, it's in defconfig.

>>> Are you only talking about build time, and excluding runtime?  i.e.
>>> using toybox to bootstrap the builds of other distros?
>>
>> They're separate use cases. I want one to work and I want the other to
>> be a reasonable option.
> 
> So you want to rebuild all distros with Aboriginal/toybox, and having
> things work at runtime is a secondary goal?

No, I want to be able to replace coreutils and so on with toybox on an
ongoing basis.

I want a future where putting "GNU" next to "Linux" is like putting
"Homeopathic" next to "medicine", a warning that the people involved are
fundamentally going about it wrong.

> But I suppose Android's runtime is the main goal for that OS?  Because
> for now the runtime is going to be used way more than building... I
> don't think there are many people with compilers installed on their
> phones.

Yet.

Did you see my 2013 youtube talk? (Linked from the left edge of the
toybox web page, along with the outline of said talk?) I went over this
in some detail...

>> The amount of in jokes in my code is actually reasonably correlated to
>> the amount of sleep deprivation I was under at the time. (Like now, it's
>> coming up on 3am here.) The "struct {forever[];} strawberry fields[];"
>> nonsense in ps is a milder example. I edited most of that back _out_
>> before checking it in. Also that song actually has very little in the
>> way of lyrics so I COULDN'T pull a lot of variable names from it if I
>> wanted to.
> 
> FWIW this was preventing me from understanding the sed code when I was
> in there... :-/

Alright, I'll fix it...

>> A) it sucked in python as an environmental dependency, and the move from
>> python 2 to python 3 is a bit like the move from gpl 2 to gpl 3,
>> fragmenting the community into incompatible warring camps and making a
>> lot of old-timers walk. (It doesn't seem to be turning off newbies
>> nearly as much, though, so maybe python will survive this if they NEVER
>> do a python 4.0.)
> 
> Personally I think the best thing about Python 3 is that it froze
> Python 2.7 in stone.  Python 2.7 will still be with us 20 years from now,
> and I'll probably still use it because it works great for a lot of
> things (I haven't used Python 3 yet really).

The Numato tool for flashing the cheap $50 FPGA boards we found for the
j-core project is written in python 3. I keep meaning to rewrite it in
python 2, but my todo list runneth over...

In my documentation I tell people to install Python 3, and it turns out
that the Ubuntu 14.04 python3-serial package is buggy so you have to
repeat the thing multiple times. (Nobody uses python 3 for much, it seems.)

>  It's like a de-facto
> standard for Python, since PyPy and a half dozen other implementations
> all target CPython 2.7.
> 
>> I was a longtime mercurial supporter, but bowed to the inevitable
>> eventually.
> 
> Yeah same here.  I briefly worked on Google Code, which did Mercurial
> hosting,  later added Git, and was recently shut down.

It's too bad this isn't still updated, it's so old Google Glass's grave
is dug but empty:

http://www.slate.com/articles/technology/map_of_the_week/2013/03/google_reader_joins_graveyard_of_dead_google_products.html

> Git is
> frustrating, but I get why it's popular -- it really is the best tool
> for collaboration.

Repo needs it therefore aosp needs it therefore the Gnu-free build
environment I'm trying to assemble needs it.

> It enabled me to juggle a lot of patches I sent
> you in parallel, something I never did with Mercurial.  I believe in
> code reviews, and when you have code reviews you will have overlapping
> patches which need to be updated asynchronously... that is not an easy
> problem to solve, and Git is the best at it.

Least bad, anyway. :)

> Andy

Rob
