[Toybox] loopfiles vs dirtree...

Rob Landley rob at landley.net
Fri Aug 16 12:53:59 PDT 2013


A "thinking out loud" post, feel free to ignore.

While adding -r to grep I hit a design issue. There's an idiom that  
looks like:

   for (s=toys.optargs; *s; s++) {
     struct dirtree *new = dirtree_add_node(0, *s, hl);
     if (new) dirtree_handle_callback(new, do_chgrp);
     else toys.exitval = 1;
   }

Which is using dirtree instead of loopfiles. This is repeated in enough  
commands it probably belongs in lib, but I don't want to add _another_  
method of iterating over a file list to lib, so I'm wondering if maybe  
I should just switch all the loopfiles instances over to dirtree.
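
For the sake of argument, here is roughly what pulling that loop into one helper could look like. This is a sketch, not toybox code: dirtree_loopargs() is a made-up name, struct dirtree is cut down to a stat buffer plus a name, and fake_add_node() is a stand-in that only stats the path. It does not recurse, which the real dirtree_handle_callback() would take care of. count_node() is just an example callback.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/* Cut-down stand-in for toybox's struct dirtree (assumption: the real
 * structure carries more, this keeps only what the sketch touches). */
struct dirtree {
  struct stat st;
  char name[];
};

/* Stand-in for dirtree_add_node(): stat the path, NULL on failure. */
struct dirtree *fake_add_node(char *name, int symfollow)
{
  struct dirtree *dt = malloc(sizeof(*dt) + strlen(name) + 1);
  int rc;

  if (!dt) return 0;
  strcpy(dt->name, name);
  if (symfollow) rc = stat(name, &dt->st);
  else rc = lstat(name, &dt->st);
  if (rc) {
    free(dt);
    return 0;
  }

  return dt;
}

/* Hypothetical lib helper wrapping the repeated loop. Returns nonzero if
 * any argument could not be looked up, so the caller can set its exit
 * value. No recursion here: the callback sees each argument once. */
int dirtree_loopargs(char **args, int symfollow,
  int (*callback)(struct dirtree *node))
{
  int failed = 0;

  assert(args && callback);
  for (; *args; args++) {
    struct dirtree *new = fake_add_node(*args, symfollow);

    if (new) {
      callback(new);
      free(new);
    } else failed = 1;
  }

  return failed;
}

/* Example callback: count how many nodes were visited. */
int seen;
int count_node(struct dirtree *node)
{
  (void)node;
  seen++;

  return 0;
}
```

A command would then call dirtree_loopargs(toys.optargs, hl, do_chgrp) and copy a nonzero return into toys.exitval, instead of open-coding the loop.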

For things that iterate over a list of files on the command line and can  
recurse (cp, ls, mv, working on grep, upcoming includes tar and  
cpio...), starting with dirtree makes sense. (Hence the repeated inline  
loop above.) But not everything that iterates over files on the command  
line does recursion. The downside is we'd suck in noticeably more code  
when doing standalone command builds that _don't_ need dirtree, things  
like cat and md5sum that really don't care. (And from a runtime  
perspective, dirtree mallocs a largeish structure and has an extra stat  
syscall.)
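
For contrast, a loopfiles-style iterator boils down to something like the sketch below: open each argument, hand the file descriptor to a callback, close it. No stat, no malloc. This is a simplified stand-in written for this post, not the actual lib function. The name loopfiles_sketch(), the choice to treat "-" as stdin, and the count_bytes() callback are assumptions for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Simplified stand-in for a loopfiles-style iterator (hypothetical name).
 * Opens each argument read-only and passes the fd to the callback.
 * Returns nonzero if any file could not be opened. */
int loopfiles_sketch(char **argv, void (*function)(int fd, char *name))
{
  int failed = 0;

  assert(argv && function);
  for (; *argv; argv++) {
    /* Assumption: "-" means stdin, as many commands treat it. */
    int fd = strcmp(*argv, "-") ? open(*argv, O_RDONLY) : 0;

    if (fd < 0) {
      failed = 1;
      continue;
    }
    function(fd, *argv);
    if (fd) close(fd);
  }

  return failed;
}

/* Example callback: add up the bytes read, the sort of thing a command
 * like wc or cksum does without ever caring about directory trees. */
long total_bytes;
void count_bytes(int fd, char *name)
{
  char buf[4096];
  ssize_t len;

  (void)name;
  while ((len = read(fd, buf, sizeof(buf))) > 0) total_bytes += len;
}
```

The callback only ever sees an open fd and a name, which is all that cat and md5sum need.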

I think BusyBox went overboard on configurability with a dozen config  
options for things like "tar", but the ability to select individual  
commands is useful, as is the ability to build them standalone.  
(There's a tradeoff between sharing common infrastructure and reducing  
dependencies for standalone builds that doesn't have a right answer, I  
just try not to make either case suck more than necessary.)
Not a huge deal, but...

Let's see, current loopfiles users: md5sum, bzcat, catv, dos2unix,  
readahead, rev, tac, truncate, cat, cksum, cmp, expand, head, od, sort,  
split, tail, tee, wc. None of those currently use dirtree and only one  
(split) uses stat.

Current dirtree users: losetup, lsusb, modinfo, switch_root, taskset,  
chgrp, chmod, cp, du, ls, rm. But only the last 6 of those iterate  
over command line arguments; the first 5 don't.

Hmmm... I guess adding a second iterator function to lib is the best  
way to go. Even then, cp needs its own, du needs... to be cleaned up,  
ls is ALMOST regular enough but it's doing nomalloc, and rm has a test  
in there to avoid rm -f reporting an error if it's not there. It's  
really just chgrp, chmod, and now grep that would be sharing it...

Not sure it's worth putting it in the library yet. Maybe after I clean  
up du.

Rob

