[Toybox] Toybox roadmap.

Rob Landley rob at landley.net
Sun Jan 22 16:37:11 PST 2012


The switch to the BSD license involved removing 5 commands whose authors
I couldn't contact for permission to relicense (basename, dirname,
touch, mkfifo, and tty).  Since then, we've reimplemented tty, dirname,
and basename, and added wc, link, nohup, unlink, cal, truncate, unshare,
and env.

I've also expanded the http://landley.net/toybox/code.html#lib_args
section of the documentation.

The rest of this post is about the new wiki page.

The new wiki page is http://elinux.org/Busybox_replacement which isn't
necessarily exclusively about toybox, but describes a demand toybox
should be able to supply, so I've been filling that out in hopes of
attracting the corresponding userbase.

I'm vaguely aiming at a 1.0 release of Toybox this year, and in my
copious free time (of which I have none), I've spent the past month or
so triaging the todo list, to figure out what commands should be in it.
I put them on the above wiki, so you might want to pull that up and
read along.  This message explains the reasoning in a bit more detail.

I've got four existing use cases: aboriginal/LFS, android toolbox,
busybox, and susv4.  Let's start with Android:

--- Android toolbox commands

1) Why Android is important.

I blogged about this at http://landley.net/notes-2011.html#26-06-2011
but the tl;dr version is "mainframe -> minicomputer -> PC ->
smartphone". The smartphone is already kicking the PC up into the server
space the same way minicomputer terminals made people stop waiting
around for their mainframe printouts, and the way having a PC on your
desk took the minicomputer terminals away and turned the minicomputer
into a "file and print server".  This time around the computer in your
_pocket_ renders
the computer on your _desk_ redundant, and they're calling being kicked
up into the server space "the cloud".

Android took over the smart phone because vanilla Linux can't: the open
source development model cannot cope with aesthetic issues.  Any time
"shut up and show me teh code" is not the correct response to the
problem at hand, our development model melts down into one of three
distinct failure modes: 1) endless discussion that never resolves into a
course of action, 2) fork the project to death implementing every
possible approach and never being able to merge them back together, 3)
delegate the problem to nobody by making it so configurable it takes
weeks to learn what your options are and then the fact it still has no
sane defaults is now somehow the end user's fault.

Open source can't do user interfaces for the same reason wikipedia can't
write a story with a plot.  Too many cooks spoil the soup; a committee
can make it extremely nutritious, but it won't taste very good.

I blogged about this at http://landley.net/notes-2010.html#13-08-2010
and elsewhere.

2) Why replacing android toolbox is a giant flaming opportunity.

Android toolbox is crying out to be replaced.  Both "bionic" (the C
library) and "toolbox" (their command line) are minimal stubs providing
just enough functionality to boot "dalvik" (their Java runtime), and
then everything else happens in Java.  People doing embedded Linux are
switching over to android because it's got the smart phone market locked
up (and thus taking over the world).

I spent years enhancing busybox until you could base a development
environment on it, so the easy way to get a posix environment is to
install busybox.

Problem: busybox is GPL, and android has an official policy "no GPL in
userspace".  The kernel got grandfathered in, but mostly GPLv2 got
tarred with the same brush as GPLv3; the jar-jar binks of licenses
undermined the original a bit like the second and third matrix movies
made the first seem not so clever after all.

I blathered about the state of the GPL in more detail at
http://lwn.net/Articles/475901/

--- Aboriginal Linux

Aboriginal Linux provides a big boost to Toybox because:

1) Android must inevitably become self-hosting.

A platform that isn't self-hosting isn't mature yet.  It's still
dependent on other platforms, the way PC code was originally
cross-compiled from minicomputers.  Once the PC grew its own compilers
and started to seriously use them, the minicomputer drastically receded
in importance.

(Unix itself became a real OS once it stopped needing to be
cross-compiled from a GE mainframe.  All operating systems mature this
way; those that don't never last very long.)

On a hardware level, Android is essentially self-hosting now.  Even
early Android hardware such as the "nexus one" had a gigahertz
processor, half a gigabyte of ram, up to 32 gigs of SD flash storage,
and USB.  USB allows the user to hook up a USB docking station (such as
the Toshiba Dynadock), providing two full-size monitors, mouse and
keyboard, sound, gigabit ethernet, and external hard drives, all while
charging the battery.  Newer phones offer more memory and SMP
capability, and the new ARMv8 instruction set is 64 bit.

On the software side, Android needs native build tools, including a
compiler and native POSIX command line.

Aboriginal Linux is intentionally the smallest, simplest self-hosting
Linux system, which is then regression tested by building Linux From
Scratch under the result (proving you can bootstrap your way up from
there to arbitrary amounts of complexity on the new target).  It
eliminates the need for cross compiling, which Android must outgrow to
get widespread native development from third parties.  (The current
locked-down Java environment has signs of a transitional environment, a
bit like the built-in BASIC on the original IBM PC.)

2) Ready-made real world regression test helped make BusyBox what it is
today.

Aboriginal Linux is the reason I got involved with BusyBox in the first
place.  The development effort I put into busybox was directed at making
it a better component of Aboriginal Linux (previously called Firmware
Linux).

My original goal was to replace the Gnu/Dammit project's tools in use
cases like "Knoppix" Live CDs and such. Even when storage or memory
aren't obvious limiting factors, _complexity_ is. (Especially for
anybody who's ever done a security audit, ported code to a new platform,
had to keep old code working in the presence of system upgrades...)  In
order to prevent these tools from resurfacing, the system had to be able
to rebuild itself from source code without them.  Hence targeting a
self-hosting development environment.

Aboriginal Linux currently builds a fully functional development
environment around busybox.  This provides a big automated test
environment piping real world data through these commands, proving
Aboriginal Linux retains the ability to rebuild itself under itself, and
to build Linux From Scratch under the resulting environment, without any
of the gnu/dammit command line crap (other than the compiler toolchain,
which I'd still like to replace).

When I was working on busybox, I would drop individual busybox commands
into my system build, and fix up the resulting breakage.  I can
similarly drop toybox commands into my current build, allowing the new
project to advance as rapidly as the previous project, by benefiting
from the same development agenda and ready-made test suite.

--- Busybox

"That thing out there's become a killer! It's _my_ fault, and I'm
_sorry_." - The Second Doctor, "The Three Doctors".

Replacing BusyBox is a big opportunity for two reasons:

First, the legal/licensing issues.  The busybox lawsuits I started back
in 2006 snowballed out of control, and by 2008 I couldn't stop them
(there were plenty of other copyright holders they could represent to
keep going).  It turned the SFLC into a self-financing legal machine
that the FSF hijacked to randomly attack our allies, and although this
split into SFLC and "software conservancy", that just means there's now
TWO possible sources of GPL lawsuits.  (That's just in the US; over in
Europe there's FSF Europe and gpl-violations.org.)

This means busybox is perceived as one of the most legally dangerous
pieces of software you can install, so the "no GPL in userspace" thing
for Android ESPECIALLY means busybox.

The second reason to replace busybox is that in the past five years,
busybox itself has snowballed into a giant hairball of complexity, now
including around 350 commands (over twice as many as the whole of
POSIX).  One of the main advantages of busybox used to be that the code
was simple and readable, but now the code is a forest of #ifdefs (often
testing the ENABLE macros, which defeats the original purpose of having
them), it has an "applets" directory but the actual main() function is
buried in libbb, declaring a new command involves repeating the main()
line twice (once with magic MAIN_EXTERNALLY_VISIBLE macro, once with a
magic UNUSED_PARAM macro), for some reason it has an arch/i386 directory
(but no other targets), the Makefiles are copied from the Linux kernel
and spend much of their time implementing the ability to build "modules"
(whatever that means in this context), don't ask me why
coreutils/length.c.disabled wasn't just deleted...
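
For readers who haven't seen the pattern, here is a minimal sketch of
the difference between testing a config symbol with the preprocessor
and testing it in ordinary C.  The macro name ENABLE_FEATURE_VERBOSE
and both functions are hypothetical, made up for illustration rather
than taken from busybox or toybox, and the sketch assumes the build
system always defines the macro as 0 or 1.

```c
#include <stdio.h>

/* Hypothetical config switch (assumption): a real build would generate
 * this as 0 or 1 for every option, so it is always defined. */
#ifndef ENABLE_FEATURE_VERBOSE
#define ENABLE_FEATURE_VERBOSE 0
#endif

/* Preprocessor style: when the option is off, the compiler never sees
 * the disabled code, so errors in it go unnoticed until someone
 * enables the option. */
int copy_count_ifdef(int files)
{
#if ENABLE_FEATURE_VERBOSE
    printf("copying %d files\n", files);
#endif
    return files;
}

/* Plain C style: the branch is always parsed and type-checked.  Since
 * the condition is a compile-time constant, an optimizing compiler can
 * discard the dead branch. */
int copy_count_if(int files)
{
    if (ENABLE_FEATURE_VERBOSE) printf("copying %d files\n", files);
    return files;
}
```

Both functions behave the same at run time; the difference is how much
of the source the compiler gets to check in each configuration.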

Back around 2005, busybox was a decent teaching tool introducing new
coders to real C programming.  Now it's a mess, and getting messier.  It
seems to have no criteria by which it can _exclude_ features.  It's an
embedded project unable to say "no", or clearly state what its
boundaries are.

--- Single Unix Specification version 4

"Standards should document, not legislate." - Me.

Standards compliance is nice.  The best standards are the kind that
describe reality, rather than attempting to impose a new one.  The
"describe reality" kind of standards tend to be approved by more than
one standards body, such as ANSI and ISO both approving C.  That's why the
IEEE POSIX committee's 2008 standard, the Single Unix Specification
version 4, and the Open Group Base Specification edition 7 are all the
same standard from three sources.  The "utilities" section is devoted to
the unix command line:

  http://opengroup.org/onlinepubs/9699919799/idx/utilities.html

Unfortunately, these standards describe a subset of reality, lacking any
mention of init, login, mount...  (And they provide ipcrm and ipcs, but
not ipcmk.  What?)  They also contain a large number of commands that
are inappropriate for toybox to implement.

We start by removing generally obsolete commands (compress ed ex pr
uncompress uucp uustat uux), commands for the pre-CVS "SCCS" source
control system (admin delta get prs rmdel sact sccs unget val what),
fortran (asa fort77), and batch processing support (batch qalter qdel
qhold qmove qmsg qrerun qrls qselect qsig qstat qsub).
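
This first pass amounts to subtracting a fixed set of names from the
standard's utility list.  A rough sketch of that filter follows; the
function names is_dropped() and triage() are hypothetical, and the
table only covers the four groups named in the paragraph above, not
the later judgement calls.

```c
#include <stddef.h>
#include <string.h>

/* Commands removed in the first pass, grouped by the reason given. */
static const char *const dropped[] = {
    /* generally obsolete */
    "compress", "ed", "ex", "pr", "uncompress", "uucp", "uustat", "uux",
    /* SCCS source control */
    "admin", "delta", "get", "prs", "rmdel", "sact", "sccs", "unget",
    "val", "what",
    /* Fortran */
    "asa", "fort77",
    /* batch processing */
    "batch", "qalter", "qdel", "qhold", "qmove", "qmsg", "qrerun",
    "qrls", "qselect", "qsig", "qstat", "qsub",
};

/* Return 1 if name is on the dropped list, 0 otherwise. */
int is_dropped(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(dropped) / sizeof(*dropped); i++)
        if (!strcmp(name, dropped[i])) return 1;
    return 0;
}

/* Copy the names that survive the pass from in[] to out[], preserving
 * order.  out[] must have room for n entries.  Returns the number of
 * names kept. */
size_t triage(const char *const *in, size_t n, const char **out)
{
    size_t i, kept = 0;

    for (i = 0; i < n; i++)
        if (!is_dropped(in[i])) out[kept++] = in[i];
    return kept;
}
```

Feeding it a candidate list such as "cat ed qsub wc fort77" would keep
only cat and wc.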

Some commands are for a compiler toolchain (ar c99 cflow ctags cxref
gencat iconv lex m4 make nm strings strip tsort yacc), which is outside
of toybox's mandate.  (Supporting a build environment is not the same
as providing one.)  Some of these may be revisited later, but not for
toybox 1.0.

Some commands are part of a command shell, and cannot be implemented as
separate executables (alias bg cd command fc fg getopts hash jobs kill
read type ulimit umask unalias wait).  These may be revisited as part of
a built-in toybox shell, but are not part of $PATH.

A few other commands are judgement calls, providing internationalization
support (iconv locale localedef), System V inter-process communication
(ipcrm ipcs), and cross-tty communication from the minicomputer days
(talk mesg write).  The "pax" utility was replaced by tar, "mailx" is an
obsolete email client, and "lp" submits files for printing to... what
exactly?  (cups?)

Removing all of that leaves:

  at awk basename bc cal cat chgrp chmod chown cksum cmp comm cp crontab
  csplit cut date dd df diff dirname du echo env expand expr false file
  find fold fuser getconf grep head id join kill link ln logger logname
  ls man mkdir mkfifo more mv newgrp nice nl nohup od paste patch
  pathchk printf ps pwd renice rm rmdir sed sh sleep sort split stty
  tabs tail tee test time touch tput tr true tty uname unexpand uniq
  unlink uudecode uuencode vi wc who xargs zcat

