[Aboriginal] Replacing bash with mksh

Rob Landley rob at landley.net
Mon Jun 10 20:44:37 PDT 2013


On 06/10/2013 02:33:06 AM, idunham at lavabit.com wrote:
> On Sat, Jun 08, 2013 at 02:00:49AM -0500, Rob Landley wrote:
> > On 06/04/2013 09:29:47 PM, idunham at lavabit.com wrote:
> > >> Ideally I want to get toysh implemented ASAP, and might take some
> > >> code from the public domain version of mksh to help that. In
> > >> practice, neither of my last two day jobs has been particularly
> > >> amenable to having as much free time to program as I'd like, and
> > >> scheduled stuff like keeping up with kernel releases (3 targets
> > >> currently broken in 3.10-rc4) takes priority. Plus I should really
> > >> finish the musl switchover...
> > >> Rob
> > >
> > >That would be OpenBSD pdksh, for which I'd strongly recommend
> > >using one of these two Linux ports:
> > >https://github.com/bobertlo/openbsd-pdksh
> > >http://slackbuilds.org/repository/14.0/system/ksh-openbsd/
> > >IIRC, I heard that the latter includes fixes for a number of
> > >things that "are subtly broken in OpenBSD".
> >
> > For public domain stuff, I'm happy to have the PD version open in
> > one window while I write another one in a second window, and use it
> > as a template. But we've got our own library of reusable code that
> > nothing else has, and you've seen how much
> >
> > >I forget exactly how to go about applying the patches from the
> > >latter source, but it didn't take that long.
> > >The libbsd dependency is for one or two functions that are in musl.
> > >
> > >That said, I'll warn you: there are well over 20 C files to go
> > >through.
> > >And there are a few builtins that we have, like mknod.
> >
> > I noticed this. :)
> >
> > I'm not opening the shell can of worms until I get the current set
> > of loose ends tied off a bit more first. I got "stat" out of
> > pending, I'm working to get ifconfig out of pending but the limiting
> > factor at the moment is I don't understand what command lines it's
> > parsing and there's no ifconfig.test. (set_sockaddr() is called by
> > both set_address() and set_ipv6_addr(), and I _think_ 3/4 of
> > set_sockaddr() is ipv6-only stuff but... I've used "ifconfig eth0
> > 192.168.1.42/24" back in 2002 and this ifconfig implementation only
> > has / parsing in the ipv6 path, and it seems to be doing something
> > else...? I'm not familiar with ipv6 enough to understand what this
> > is doing, so I'm trying to learn these funky corner cases I've never
> > used so I can understand, untangle, and test...)
> >
> > I've got a p9d server in progress, and I've dug up my mount
> > implementation because I've had _three_ mount implementations
> > submitted to me now so I need to get that in...
> 
> So there are now four versions of mount floating around, none of which
> are publicly available?

I think a couple got posted to the mailing list? (People keep emailing
me off list. Dunno why. The most recent is attached, the previous one
was... October I think?)

Plus klibc has one, android toolbox has one...

Last time I sat down to write mount, it used loopback stuff first so I  
had to write a proper losetup (which I did, although that had grown a  
lot of new features since last I'd checked). Then I started researching  
all the new mount flags, and I have a stub that can enumerate them and  
print out help text for them (also attached). The "actually does  
something useful" variant trails off in the middle and does not  
currently compile...

Also, keep in mind that I spent several months digging into the guts of
NFS back in 2010 and would like to use the new string interface instead
of the old binary interface:

   http://landley.livejournal.com/52663.html

Basically all the "smbmount" and "nfsmount" and such front-ends are  
doing three things:

1) prompting you for a password from /dev/tty and adding "pass=" to the  
options string.

2) DNS lookups so you can say \\hostname or similar instead of an IP  
address.

3) converting the command line options a bit so you don't actually have  
to say "-t nfs" or "-t cifs" when you use that front-end.

I.E. if you know what you're doing you can do an NFS or samba mount  
from a mount command that doesn't know anything special about that  
filesystem. (But if the command DOES know about it, it can intercept  
what you actually said and screw it up.)

Having large amounts of extra code for this is silly...
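
As a rough illustration of how little those three steps amount to, here
is a minimal C sketch. It is not toybox code: resolve_ipv4() and
build_opts() are hypothetical helper names, and the "ip=" option is an
assumption about what the kernel's cifs driver accepts (only "pass="
comes from the list above). The password is taken as an argument here
instead of being read from /dev/tty.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>

// Step 2: turn a hostname into a dotted-quad IPv4 string.
// Returns 0 on success, -1 on failure. (Hypothetical helper.)
static int resolve_ipv4(const char *host, char *out, size_t len)
{
  struct addrinfo hints, *res;
  int rc = -1;

  memset(&hints, 0, sizeof(hints));
  hints.ai_family = AF_INET;
  hints.ai_socktype = SOCK_STREAM;
  if (getaddrinfo(host, NULL, &hints, &res)) return -1;
  if (inet_ntop(AF_INET, &((struct sockaddr_in *)res->ai_addr)->sin_addr,
                out, (socklen_t)len)) rc = 0;
  freeaddrinfo(res);

  return rc;
}

// Steps 1 and 3: append the address and password to whatever options the
// user supplied, producing the string passed as mount(2)'s data argument.
// "ip=" is an assumed option name. Returns 0 on success, -1 if buf is
// too small. (Hypothetical helper.)
static int build_opts(char *buf, size_t len, const char *user_opts,
                      const char *ip, const char *pass)
{
  int n = snprintf(buf, len, "%s%sip=%s,pass=%s", user_opts,
                   *user_opts ? "," : "", ip, pass);

  return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With the options string built, the remaining work would be a single
call along the lines of mount("//server/share", "/mnt", "cifs", 0, opts),
which needs root and is left out of the sketch.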

> > Plus I still have a todo list of commands to triage that predate
> > "pending" (vmstat, login, du, vconfig, mountpoint, free, chroot,
> > cut, touch, modinfo, expand) and some of them have shrapnel in
> > lib/lib.c (for example "cut" added get_int_value() which is
> > something lib/args.c already does...)
> >
> modinfo seems to be good for uncompressed modules afaict.

Needing to triage doesn't mean they don't work. The ifconfig command  
worked before I started triaging it, it was just twice the size of the  
current one (and I'm not done yet).

I like to do polishing passes. If nothing else, I like to have the  
design of toybox commands clear in my head so I can support them. Some  
of the commands I want to review may not wind up having any actual  
_changes_, but right now I haven't read them closely enough and thought  
through the interactions to say they _don't_ need changes.

Tangent warning: feel free to skip this bit:

Something in my old prototype and fan club talk (on  
http://landley.net/talks but it's not my finest hour/recording) was  
that reading code is harder than writing code. When writing code you  
have a complete model in your head and what's on the screen trails the  
model in your head. When _reading_ code the model in your head trails  
what's on the screen, _and_ the order of execution is neither linear  
nor deterministic with all sorts of potential jumps to other functions  
in whole other files so you need five tabs open to follow the plot.

This is part of the reason there's so much "not invented here" stuff in  
programming: even when it's perfectly good code, if you wrote it you  
know what it does, and the amount of reading and pondering necessary to  
get that level of familiarity with code you _didn't_ write is always a  
lot higher than you expect.

(This is one of the differentiators between ok coders and really good  
ones: being _aware_ that code isn't _bad_ just because you didn't write  
it. It's harder for you personally to understand, but that's true of  
everybody. Is "does it smell like me" actually an improvement, or is  
the rewrite just causing pointless churn?)

In toybox's case I do have the luxury of doing a bit of smells-like-me  
rewriting because I expect to have to support this going forward.  
(Privilege of being a project maintainer.) To be brutally honest,  
commits like these:

   http://landley.net/hg/toybox/rev/765
   http://landley.net/hg/toybox/rev/767

are as much "ok, now I'm comfortable with it" as actually improving the  
code. (And because I can use source control to check if I'm the last  
committer to files as a way of seeing what I've reviewed and what I  
haven't.)

But when I'm done with those sorts of patches I go "is this worth  
checking in" and sometimes I discard it instead. And doing that sort of  
thing to a project you're _not_ the maintainer of is just rude. There  
are a lot of times I open up a second window, transcribe my own version  
of whatever I'm reading, and then _delete_ that new copy when I'm done  
because now I understand the old one and I'd just be peeing on it so it  
smells like me rather than actually improving anything.

(I'm not saying all polishing is cosmetic, although it is an easy trap  
to fall into. But this is why "I need to find time to review this"  
doesn't necessarily mean there's something known to be wrong with it.  
And also why "I've got five minutes, let me read through cut.c" is not  
the same thing as a proper review: a proper review can balloon to 4  
hours even if all I did was something "indent" could have done...)

> I'd write a test for it if I could figure out how to do one that
> works with all Linux hosts...but the corner cases it makes sense to
> check may not be present on many systems.

I don't understand. Care to explain?

> (Coincidentally, toybox modinfo does the right
> thing for "modinfo oss-usb" where busybox fails; oss_usb.ko is the
> file name, it's part of the out-of-tree OSS4 code...)

If I recall, modinfo doesn't parse the ELF tables but just does a
string search of the binary. I'm not sure I'd call that "the right
thing", it seems extraordinarily brittle to me.
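
To make the brittleness concrete, here is a minimal C sketch of the
string-search approach. It is an illustration only, not the actual
toybox modinfo code, and find_tag() is a hypothetical name. It accepts
any NUL-terminated "key=value" string anywhere in the file image, so a
matching string in some unrelated data section would be reported just
as happily as one in the module's real info section.

```c
#include <stddef.h>
#include <string.h>

// Scan a raw file image for a NUL-terminated "key=value" string and return
// a pointer to the value, or NULL if there is none. Nothing here checks
// which ELF section the match came from: the first hit wins.
static const char *find_tag(const char *buf, size_t len, const char *key)
{
  size_t klen = strlen(key), i;

  for (i = 0; i + klen + 1 <= len; i++) {
    // Only match at the start of a string: offset 0 or right after a NUL.
    if (i && buf[i-1]) continue;
    if (memcmp(buf+i, key, klen) || buf[i+klen] != '=') continue;
    // Require the value to be NUL terminated inside the buffer.
    if (memchr(buf+i+klen+1, 0, len-(i+klen+1))) return buf+i+klen+1;
  }

  return NULL;
}
```

Parsing the section headers first and searching only the relevant
section would cost more code but would rule out that kind of false
match.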

> Examples of test cases:
> modinfo a-module -> a_module.ko
> modinfo a_module -> a-module.ko
> modinfo radeon
> (mile long list of firmware that changes with each kernel version
> follows)

I'm always happy to fluff out the test suite, if you'd like to send a  
patch. :)
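
The first two test cases quoted above come down to treating '-' and '_'
as the same character when matching a requested name against a file
name. A minimal C sketch of that comparison, with module_name_match()
as a hypothetical helper rather than anything taken from toybox:

```c
// Return 1 if the requested module name matches a file name such as
// "a_module.ko", treating '-' and '_' as interchangeable, else 0.
// The file name may end where the request ends or continue with an
// extension starting at '.'.
static int module_name_match(const char *want, const char *file)
{
  for (; *want; want++, file++) {
    char a = *want == '-' ? '_' : *want;
    char b = *file == '-' ? '_' : *file;

    if (a != b) return 0;
  }

  return !*file || *file == '.';
}
```

Under this sketch "oss-usb" matches "oss_usb.ko" while "radeon" does
not match "radeonfb.ko", and neither check depends on which modules the
host happens to have installed.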

> > I grabbed the original massively stale version from 1999, on the
> > theory there's probably less of it to read. "wc -l *.c" says the
> > total is 27180 lines, which is a couple solid days triaging right
> > there.
> 
> 23460 for ksh-openbsd, patches applied.
> So someone has achieved negative code growth.

Yay?

Is it still public domain? I removed the "and you must copy this  
license text into all modified versions so even though you can  
relicense it GPL you _also_ have to include incompatible license terms  
and then explain why they don't apply" clause from toybox's license,  
which is essentially public domain now but _looks_ like BSD (and I  
describe it as such) so as not to freak out the corporate types who go  
"Jim Butterfield's public domain programs may have been on the  
Commodore Bonus Disk shipped with 1541 disk drives in 1982 but don't  
act like that's a precedent! This is weird untried unproven... it  
doesn't give us any way to get paid to advise you!"

Sorry, tangent. (Bit of a headache. Late for lunch.)

> > >> Have been reading about it... I am thinking of replacing my user
> > >shell
> > >> from bash to mksh..need some more time to read up on it
> > >
> > >No real issues for most things; I've used mksh as login shell on my
> > >netbook for a few years, due to the bloat of modern bash with all
> > >extensions loaded.
> > >Even shell expressions (like a.{c,sh}), work the same.
> >
> > So it's already better than the Defective Annoying SHell, then. :)
> >
> > Which fork are you using?
> 
> Debian/Ubuntu package, maintained by Thorsten Glaser (an upstream
> developer.)
> But it's also true of at least the Debian Squeeze pdksh package.
> Of course, ostensibly "some non-portable" features get disabled when
> you use the name "sh"...haven't tested which ones.

That's true of bash, though. :)

> (BTW, dash is Debian's fork from 2002 of an ancient port of NetBSD's
> Almquist shell.  AFAIK it is technically conformant to POSIX, and even
> supports a few bash-like features...but nowhere near all the ones
> people use.)

I looked up the pedigree of that monster once upon a time.

My objection is that running bash was the reason Linus Torvalds added a
system call layer to his terminal program. (Uploading/downloading files
from the VAX at his university was the reason he added a filesystem
layer.) This made bash the default shell of Linux before the 0.01
release, which it continued to be until Ubuntu unilaterally changed it.
When Ubuntu did so, they broke the kernel build. Their stated reasons
for doing so were not sufficient justification for doing it, _and_ they
failed at their stated goal (and had to introduce upstart as another
way of addressing the problem) without admitting their mistake and
reverting it.

So it's not dash that's at fault per se. It's that Ubuntu was ignorant
of history, caused massive collateral damage, made a bad move by their
own reasoning, failed at their stated goal, and refused to revert their
mistake after further developments rendered their original reason moot.

Rob
-------------- next part --------------
A non-text attachment was scrubbed...
Name: other-mount.c
Type: text/x-csrc
Size: 36296 bytes
Desc: not available
URL: <http://lists.landley.net/pipermail/aboriginal-landley.net/attachments/20130610/4e810978/attachment-0006.c>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mount.c
Type: text/x-csrc
Size: 3131 bytes
Desc: not available
URL: <http://lists.landley.net/pipermail/aboriginal-landley.net/attachments/20130610/4e810978/attachment-0007.c>

