[Toybox] Towards find cleanup

Wed Apr 10 21:24:52 PDT 2013

On 04/10/2013 01:55:58 PM, Tim Bird wrote:
> > - dump_node is not used anywhere
> It used to be used in debug code that was recently removed
> by Rob.  It's a bit of a pain to examine the expression
> tree in a debugger,

Simplifying the expression tree is something I need to look at.

> so an ascii dump routine is handy.
> Because it is unused, it evaporates because of compiler
> optimizations.  I don't know if Rob left it around intentionally
> or not, but I hate to see it go while 'find' is still under
> development.  (Once we're all happy with the find implementation,
> of course, you can feel free to remove the lifeboats, just like
> the Titanic. :-)
> 
> In general, I don't know if Rob has a policy for having debug
> code available for use, but not currently used.

I try not to _need_ it.

I admit that there's a bit of debug scaffolding in patch.c and  
lib/args.c, but that's a sign of failure on my part. I periodically go  
back and look at this to see if I can think of a way to simplify that  
code. (Both were hard enough for me to implement that there's still  
some scar tissue, and both have pending todo items giving me an excuse  
to defer further cleanup until I revisit them. Patch needs to support  
git mv syntax, for example...)

Sometimes I just have to say "I'm not smart enough to come up with less  
horrible way to do this right now" and move on, but I periodically  
revisit those to see if I get lucky.

> > - exec_buf is unused
> I think I put this there in anticipation of the work required
> to properly support exec.  Right now, the code doesn't
> scan the args and replace '{}' with the file or directory
> name being evaluated.  I think I planned to use exec_buf
> for this substitution, rather than allocate something on
> every evaluation.  But that's a detail that whoever finishes
> the exec functionality can work out.

This is the sort of thing toybuf[] is there for: a roughly page sized  
chunk of memory you don't have to allocate.

I've subdivided toybuf for various uses, such as ls.c function  
listfiles() which is more opaque black magic than I like but it's doing  
a fiddly thing in a compact way that's easy to read is _hard_.

Then again arbitrary size limitations aren't good. I need to clean  
those out of toysh when I restart work on that. It's only acceptable in  
ls because the things I'm using it for have existing limits anyway: the  
maximum filename length is limited to 255 in the vfs (rounded up to  
260, a multiple of 4), and then the remainder of the 4k, divided by  
sizeof(long), is going to be at least 500, which would be a screen  
width of at _least_ 1000 characters, and that's just uncomfortable for  
humans (the norm is still in the ballpark of 80, that's an order of  
magnitude more, you can't figure out what line you're on when you jump  
back to the start when it's anywhere near that big)... so the fixed  
size buffer is overkill in that case. :)

> > - Replace SUCCESS with 1 and simplify code accordingly
> > - a==0 -> !a
> I'm OK with most of these, when 0 is being treated as
> failure and !0 as success (all the string operations).

That's how tests work in C. :)

> There is at least one place where I'm actually checking
> that something is equal to the value 0 (that is, it's
> a value comparison and not a boolean).  I know that's a
> bit picky, and they compile the same, but mentally I
> prefer to use numbers for number comparisons and ! for
> boolan not.

*shrug*

> > -char exec_buf[1024];
> > -
> See note above.  It might have been intended for arg substitution
> for -exec.

Both times you use toybuf it's to store a copy of a string you have a  
constant version of. (Environment space never gets freed, and "." is a  
string constant.) so you could just have a pointer in your GLOBALS() or  
some such that gets initialized or is otherwise NULL.

[wanders away to go fix that... ok back now.]

> > -			printf("unknown node type (%d) ", node->op);
> > -			break;
> > -	}
> > -}
> > -
> See above.  It's going to be harder to debug filter expressions
> without the ability to dump the filter nodes in ascii.

The fix to that is to simplify filter expressions. Working up to that...

> >  	pid = fork();
> > -	if (pid==0) {
> > +	if (!pid) {
> I prefer pid==0 here.  !pid has the same effect in the compiler,
> but I'm not doing a boolean compare, I'm checking for the special  
> value
> 0 from fork.  IMHO it's easier to read with the 0 present.

I lean towards "0 is 0" myself.

(And I redid this whole area in the most recent commit.)

> >  	if (filter->op==OP_OR) {
> > -		result = evaluate(filter->next, node, fnext);
> > -		result2 = evaluate(*fnext, node, fnext);
> > -		if (result) {
> > -			return result;
> > -		} else {
> > -			if (result2) {
> > -				return result2;
> > -			} else {
> > -				return 0;
> > -			}
> > -		}
> > +		return evaluate(filter->next, node, fnext) ||  
> evaluate(*fnext, node, fnext);
> This will have different side effects than the original code.
> The original code evaluates both sides, and then returns the 'or'  
> operation
> on the results.  The new one will (of course) do short-circuit  
> evaluation of
> the '||',  meaning that sometimes the second node won't be evaluated.
> I believe that at one time, I actually had it coded that way, but for  
> some
> reason or other I went back to forcing the evaluation of both nodes.
> Unfortunately, I can't remember the reason why I went back.  I would  
> have expected
> regular find to do short-circuit evaluation of expressions with 'or'  
> operations,
> so I'd be happy if this were correct.  I just have a nagging feeling  
> that
> there was some reason I couldn't do it this way.

This is why I try to fill out the test suite as I go along. Because  
otherwise I wind up in exactly this sort of situation...

> Do you know if regular find does short-circuit evaluation?
> What would regular find do with
>   'find . -exec test1_with_side_effects {} \; -o -exec  
> test2_with_side_effects {} \;'
> ?

   $ find . -exec echo one "{}" \; -o exec echo two "{}" \;
   find: paths must precede expression: exec

Um... huh?

> >  	result = evaluate(filter_root, node, &junk);
> > -	if (result & SUCCESS & !have_action) {
> > +	if (result & !have_action) {
> I have no explanation for my weird code.  I think it was
> left over from when SUCCESS was something besides 1, and I had
> to do an actual value compare here.  But the old code is
> clearly just broken now.  Thanks for the fix.

There's never a need to apologize for it. I have thinkos all the time  
(3am sleep deprived coding, fun for the whole family). It's better than  
not having it, and we're fixing it. :)

> > @@ -570,7 +430,7 @@
> >
> >  	/* FIXTHIS - parse actions, if present */
> >
> > -	if (toybuf[0]==0) {
> > +	if (!toybuf[0]) {
> I prefer the comparison with 0 here.  The 0 value
> at the begging of the toybuf is not a boolean (in my mind anyway)
> but a special value.  But that's just me.  It's all just
> mental model, and if the '!' is easier to read for others
> I'm fine with this change.

I do !*toybuf but that's just me. It's largely the same to the  
compiler. :)

Rob