[Toybox] poking at find.

Thu Jul 10 03:45:21 PDT 2014

Sorry for the radio silence, I've been poking at find.c and wound up
starting over from scratch.

Find needs to work on more than one directory. Doing "find .." is valid.
The -name filter is a glob (shell pattern matching, what fnmatch() does),
not a regex. (The current one's comments say regex, although that's
actually a todo and what's implemented is a raw string match.) Each
filter/action currently occurs in 3 places (define, parse, use) that have
to match up, so adding the remaining dozen or so POSIX-required filters
would be fiddly...
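
As a rough illustration of the glob-versus-regex point, here is a minimal sketch of a -name style check built on POSIX fnmatch(). The helper name name_matches() is made up for this example and is not taken from find.c; it simply matches the last path component against a shell pattern.

```c
#include <fnmatch.h>
#include <string.h>

// Hypothetical helper (not from toybox): return 1 if the last component
// of path matches the shell pattern, 0 otherwise.
int name_matches(const char *pattern, const char *path)
{
  const char *base = strrchr(path, '/');

  // Compare against the part after the final slash, or the whole string
  // when there is no slash at all.
  base = base ? base + 1 : path;

  // fnmatch() returns 0 on a match
  return !fnmatch(pattern, base, 0);
}
```

A raw string match would only accept "find.c" for the pattern "find.c", while this accepts "*.c" or "f?nd.[ch]" as well. Paths with a trailing slash are not handled here.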

But backing up to a couple more fundamental issues: traversal has subtle
requirements in the standard, which dirtree actually has the
infrastructure to get right. If you traverse ".", "./", and ".//", all
three of them have explicit requirements: you should get "./blah",
"./blah", and ".//blah" respectively. (No really.) So I need to add that
to the test suite...
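
The three cases above come down to one rule: keep the starting path exactly as typed, and add a "/" before the child name only if the path doesn't already end in one. A minimal sketch of that rule follows; join_path() is a hypothetical helper for illustration, not how dirtree builds its paths.

```c
#include <stdio.h>
#include <string.h>

// Hypothetical helper: append name to dir, adding a "/" only when dir
// does not already end in one. Returns buf, or NULL if it doesn't fit.
char *join_path(char *buf, size_t size, const char *dir, const char *name)
{
  size_t len = strlen(dir);
  const char *sep = (len && dir[len-1] == '/') ? "" : "/";
  int used = snprintf(buf, size, "%s%s%s", dir, sep, name);

  return (used < 0 || (size_t)used >= size) ? NULL : buf;
}
```

Note that the existing slashes are never collapsed, which is why ".//" stays ".//blah" rather than becoming "./blah".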

And then there's -HL, which are sadly still a todo in the "cp" command,
and to make that work I need something like:

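static int do_find(struct dirtree *new)
{
  // -L follows symlinks at every level of the traversal
  return DIRTREE_RECURSE|((toys.optflags&FLAG_L) ? DIRTREE_SYMFOLLOW : 0);
}

void find_main(void)
{
  int i, len;

  // The paths are the leading arguments: the expression starts at the
  // first argument that begins with "-" or is "!" or "(".
  for (len = 0; toys.optargs[len]; len++) {
    char *s = toys.optargs[len];

    if (*s == '-' || !strcmp(s, "!") || !strcmp(s, "(")) break;
  }

  // Loop through paths
  for (i = 0; i < len; i++) {
    struct dirtree *new;

    // Either -H or -L follows a symlink named on the command line
    new = dirtree_add_node(0, toys.optargs[i],
      toys.optflags&(FLAG_H|FLAG_L));
    if (new) dirtree_handle_callback(new, do_find);
  }
}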

Again, with checks in the test suite.
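
For a standalone view of the -H/-L rule that the snippet above leans on dirtree for, here is a small sketch using plain stat()/lstat(). The OPT_H/OPT_L bits and both function names are made up for this example (they are not toybox's FLAG_H/FLAG_L), and falling back to the link itself when the target is missing is an assumption about the desired behavior.

```c
#include <sys/stat.h>

// Hypothetical option bits for this sketch only
#define OPT_H 1
#define OPT_L 2

// -L follows every symlink; -H follows only paths named on the command
// line, which is depth 0 of the traversal.
int follows_symlinks(int opts, int depth)
{
  return (opts & OPT_L) || ((opts & OPT_H) && !depth);
}

// Fill in st for path according to the -H/-L rule. Returns 0 on success.
int find_stat(const char *path, int opts, int depth, struct stat *st)
{
  // Assumption: a dangling symlink falls back to describing the link itself
  if (follows_symlinks(opts, depth) && !stat(path, st)) return 0;

  return lstat(path, st);
}
```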

The other thing I'm looking at is the ( ) -o ( ) logic (-a is a NOP,
it's implied). As far as I can tell this doesn't require evaluation
reordering because there are no options with different precedence, so
you don't need to make trees and stuff. A simple if (!strcmp(s,
"-blah")) { } else if (!strcmp(s, "-blah")) staircase should handle it;
the tokenization pass is an optimization to speed stuff up but probably
premature optimization of something that's not actually a bottleneck.
(Might be slightly easier to implement the parentheses handling as a
recursive function rather than pushing true/false values onto a stack,
but I'm still working through that.)
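
One possible shape for the recursive version, as a sketch only: the struct, the function names, and the "-true"/"-false" stand-in tests are all invented for this example and are not the find.c design. It evaluates in a single left-to-right pass with no tree, using an "active" flag so that short-circuited tests are consumed but not run. It does give the implied -a its own level, since POSIX says -a binds tighter than -o.

```c
#include <string.h>

// Hypothetical evaluator state: an argv-style token list and a cursor
struct expr {
  char **tok;
  int pos, len;
};

static int parse_or(struct expr *e, int active);

// primary := "(" or ")" | "!" primary | test
static int parse_primary(struct expr *e, int active)
{
  char *s;

  if (e->pos >= e->len) return 0;
  s = e->tok[e->pos++];

  if (!strcmp(s, "(")) {
    int result = parse_or(e, active);

    // Consume the matching ")" if present
    if (e->pos < e->len && !strcmp(e->tok[e->pos], ")")) e->pos++;
    return result;
  }
  if (!strcmp(s, "!")) return !parse_primary(e, active);

  // Stand-in tests. A real find would examine the current file here, and
  // only when active, so skipped tests and actions have no side effects.
  if (!active) return 0;
  return !strcmp(s, "-true");
}

// and := primary { ["-a"] primary }
static int parse_and(struct expr *e, int active)
{
  int result = parse_primary(e, active);

  while (e->pos < e->len) {
    char *s = e->tok[e->pos];

    if (!strcmp(s, "-o") || !strcmp(s, ")")) break;
    if (!strcmp(s, "-a")) e->pos++;
    // The right side is always consumed, but only runs if all was true so far
    result = parse_primary(e, active && result) && result;
  }
  return result;
}

// or := and { "-o" and }
static int parse_or(struct expr *e, int active)
{
  int result = parse_and(e, active);

  while (e->pos < e->len && !strcmp(e->tok[e->pos], "-o")) {
    e->pos++;
    // The right side only runs if nothing was true so far
    result = parse_and(e, active && !result) || result;
  }
  return result;
}

// Evaluate a whole expression; returns 1 for true, 0 for false
int eval_expr(char **tok, int len)
{
  struct expr e = {tok, 0, len};

  return parse_or(&e, 1);
}
```

The value returned from an inactive branch is never used by an active caller, so it doesn't matter what the skipped side "evaluates" to. Error handling for unbalanced parentheses is left out.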

Anyway, the old one wasn't designed around a reading of the POSIX spec
for this function and the new one is.

A workflow issue I've hit before (with my half-finished dd rewrite) is
that if I do decide to rewrite a pending command instead of iteratively
cleaning up what's there, I don't have a good place to _put_ the new one
until it's finished. I don't want to delete the old "sort of works but
the infrastructure is wrong" version and replace it with "this is a clear
regression" code at my first couple unfinished stopping points. So I
leave it out of tree until it's almost done and tested, and all anyone
else sees is silence followed by a sudden new unexplained command...

Not sure how to handle that. (The alternative of checking in stopping
points leads people to think that sed.c actually does something and
mdev.c is on par with the busybox version...)

Eh, solution is to finish everything. Working on it... :)

Rob
