[Toybox] find(1) -name vs -wholename

Rob Landley rob at landley.net
Mon Mar 4 18:17:24 PST 2024


On 3/4/24 18:03, enh wrote:
> On Mon, Mar 4, 2024 at 3:31 PM Rob Landley <rob at landley.net> wrote:
>>
>> On 3/4/24 12:19, enh via Toybox wrote:
>> > obviously the patch is trivial, but i can't think of an existing
>> > toybox tool that has one of these "you're holding it wrong" errors,
>> > but this is one that i do find useful:
>>
>> I thought there was one in tar but couldn't find it. Gzip has "need -f to read TTY".
>>
>> I'm not conceptually against "this CAN'T work" errors. (Except this isn't an
>> error, it prints to stderr and then exits with 0. Seems a bit indecisive...)
> 
> /facepalm
> 
> (probably no-one noticed yet because it's most likely to be hit
> interactively. still seems like a bug though!)

Well, a warning isn't exactly an error...?

It's design-level indecision. Is this a problem or not?

>> > where i'm left wondering why
>> > it can't just do the right thing... since `/` is illegal in a POSIX
>> > name, what other interpretation could there be? but, still, better
>> > than nothing.)
>>
>> I'd be happy to do the right thing instead? Fairly minor code change either way.

Thinking about it more, the "right thing" might be for -name to match the
trailing whole entries, so if you "find toybox -name pending/git.c" it could
come up with "toybox/toys/pending/git.c".

Or in my case, my ~/toybox work directory has... 21 toybox repo directories
under it (basically instead of branches+stash), so "find . -name toys/*/git.c"
could find all the instances of that file using a more shell-like expansion
syntax, even if there are subdirectories in the way (such as, real example,
~/toybox/android/toybox/toys/pending/git.c).

This is getting us away from "minor code change", though. But I think I already
have code for this in... tar.c maybe? All that --anchored stuff calling
do_filter() calling fnmatch. Not that hard to do it again with a model at hand. :)
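
For comparison, the trailing-match behavior described above can be approximated today with -path and a leading '*/'. This is only a sketch of the idea, not the proposed -name change: the scratch tree below is made up to mirror the example, and it assumes a find with POSIX -path (GNU find treats -wholename as a synonym).

```shell
# Hypothetical scratch tree with two copies of toys/pending/git.c.
tmp=$(mktemp -d)
mkdir -p "$tmp/toybox/toys/pending" "$tmp/android/toybox/toys/pending"
touch "$tmp/toybox/toys/pending/git.c" \
      "$tmp/android/toybox/toys/pending/git.c"
cd "$tmp" || exit 1

# -name only ever sees the last path component, so a pattern with a
# slash in it matches nothing (GNU find also warns on stderr).
name_matches=$(find . -name 'toys/*/git.c' 2>/dev/null)

# A leading '*/' lets the rest of the pattern match trailing
# components, however deep the directories in the way are.
tail_matches=$(find . -path '*/toys/*/git.c' | LC_ALL=C sort)

printf 'name: [%s]\npath:\n%s\n' "$name_matches" "$tail_matches"

cd - >/dev/null && rm -rf "$tmp"
```

One difference from shell-like expansion: in -path the '*' can cross '/', so 'toys/*/git.c' would also match a git.c several directories below toys.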

> yeah, i'm not sure why coreutils doesn't do that --- perhaps to avoid
> the question of whether `-name bits/syscall.h` means `-wholename
> .*bits/syscall.h` or `-wholename .*/bits/syscall.h`?
> 
> (`-path` with a trailing `/` is a similarly unhelpful sharp corner.)

My brain is FRIED by packing to move, and I would need test cases with proposed
output to follow the distinctions you're making there.
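
One possible test case for that distinction, written with glob '*' rather than the '.*' spelling in the quoted text, and using -path since GNU find treats -wholename as a synonym. The directory names (usr/bits, usr/morebits) are made up for illustration.

```shell
# Hypothetical tree: one directory named "bits" and one whose name
# merely ends in "bits".
tmp=$(mktemp -d)
mkdir -p "$tmp/usr/bits" "$tmp/usr/morebits"
touch "$tmp/usr/bits/syscall.h" "$tmp/usr/morebits/syscall.h"
cd "$tmp" || exit 1

# Without the slash, '*' may end in the middle of a directory name,
# so both files match.
loose=$(find . -path '*bits/syscall.h' | LC_ALL=C sort)

# With the slash, "bits" has to be a whole path component, so only
# usr/bits/syscall.h matches.
strict=$(find . -path '*/bits/syscall.h' | LC_ALL=C sort)

printf 'loose:\n%s\nstrict:\n%s\n' "$loose" "$strict"

cd - >/dev/null && rm -rf "$tmp"
```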

I'm also a little confused about what "-wholename" is for given that the shell
already does path expansion? I guess it's so you don't get your wildcards back
as a result when there's no match? Hmmm... ah, I see, * can eat slashes here,
and the shell won't do that.
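
A small sketch of that difference, assuming a POSIX-style shell with default globbing (no nullglob or failglob) and a find that supports -path. The scratch tree is made up for illustration.

```shell
# Hypothetical scratch tree: toys/pending/ip.c
tmp=$(mktemp -d)
mkdir -p "$tmp/toys/pending"
touch "$tmp/toys/pending/ip.c"
cd "$tmp" || exit 1

# Shell glob: '*' stops at '/', so nothing matches and the
# unexpanded pattern comes back as the "result".
shell_result=$(echo toys/*.c)

# find -path: '*' is allowed to cross '/', so the nested file matches.
find_result=$(find toys -path 'toys/*.c')

printf 'shell: %s\nfind:  %s\n' "$shell_result" "$find_result"

cd - >/dev/null && rm -rf "$tmp"
```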

And... what does -path do again? In the toybox directory, none of these produce
a result with the host find or toybox find:

  find . -path pending
  find . -path toys/pending
  find . -path toys/pending/ip.c

It's been some years since I implemented this stuff. What do the tests do...

  $ mkdir dir
  $ touch dir/file
  $ find . -wholename 'dir*e'
  find: ‘./dirtest/subdir’: Permission denied
  $ find dir -wholename 'dir*e'
  dir/file

Ah, the ./ at the start is preventing both -wholename and -path from matching,
because of course.
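
The same effect in a scratch directory, as a sketch assuming a find with POSIX -path (the directory and file names just mirror the test above):

```shell
# Scratch tree: dir/file
tmp=$(mktemp -d)
mkdir "$tmp/dir"
touch "$tmp/dir/file"
cd "$tmp" || exit 1

# Starting at "." every candidate path begins with "./", so the
# unanchored pattern never matches the whole path.
no_prefix=$(find . -path 'dir*e')

# Spelling out the "./" (or absorbing it with a leading '*') matches.
with_prefix=$(find . -path './dir*e')
with_star=$(find . -path '*dir*e')

# Starting at "dir" the candidate paths begin with "dir" instead.
from_dir=$(find dir -path 'dir*e')

printf 'no_prefix=[%s]\nwith_prefix=[%s]\nwith_star=[%s]\nfrom_dir=[%s]\n' \
  "$no_prefix" "$with_prefix" "$with_star" "$from_dir"

cd - >/dev/null && rm -rf "$tmp"
```

The pattern is compared against the path exactly as find builds it from the starting point, which is why the three -path commands above print nothing when run from ".".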

>> We could even ping the coreutils guys about that, since they recently agreed to
>> add -x when I grumped at them. (I'm moving house! It's very stressful!) Speaking
>> of, I just remembered to ping busybox list about that... Alas, still no cut -DF
>> in coreutils, last I checked...
> 
> tbh, since starting to read the coreutils list i'm _less_ convinced
> that anyone really thinks about anything,

Coreutils is gnu.

> and especially not about interactions between things.

It's very gnu.

Your expectations weren't low enough. There's a reason I started poking at
busybox back in 2002. The gnu project was announced in 1983, and gnu/hurd
remains essentially unusable today.

> (i saw the -x,--swap thread but didn't
> have the energy to point out the -x,--exchange would have been quite a
> bit less unclear...)

I know, but I only cared about A) the short option, B) having mv finally able to
call that rename() functionality the linux kernel added ten years ago. (Which
came up again recently with a patch implementing atomic exchange in the VFAT
driver.)

Letting the --longopt smell like that particular gatekeeper is fine with me, I
never voluntarily use them, and it's sort of the opposite of an ablative duck:

  https://bwiggs.com/notebook/queens-duck/

He got to keep --swap. He _wanted_ it to be called --swap. Which meant he wanted
the feature to go in, because otherwise it couldn't be called --swap.

>> Rob

Rob

P.S. Oddly enough, while Linux beat gnu because gnu sucked, Linux beat BSD using
the standard disruptive technology playbook Clayton Christensen described in the
Innovator's Dilemma in 1997.

The problem FreeBSD had back in 1991 was it was big iron tech ported down to PCs
and didn't fit comfortably in the smaller space. A bit like IBM's OS/2, which
was full of System Object Model implementations of Common Object Request Broker
Architecture, in triplicate. Disruptive technologies start at the bottom and
expand upmarket, not down.

Linux started life on a cheap white box PC a college student bought with
birthday money and the proceeds from selling his Sinclair QL (on which he had
previously written his own multitasking operating system, so Linux was his
SECOND GO at it), and Linux was a cheap white box OS also assembled from
eclectic parts. Back when Alan Cox left the Amiga for PC unix, he picked Linux
instead of FreeBSD because (at the time) Linux didn't require an expensive FPU
coprocessor and FreeBSD wouldn't boot without one. Linux never assumed you could
afford optional components, because Linus couldn't. Linux exploded in 1993 when
NSF changed the NSFNET AUP to allow commercial use of the internet, because an old
386 in a closet could be a web server no matter how crappy the cast-off parts
you slapped together were, while BSD had minimum system requirements meaning you
needed a budget to buy new hardware. By the time anybody was willing to spend
MONEY on this web thing, Linux had been running the company's web page for 18
months.

Linux was a standalone source package with a collection of random detritus for
userspace, hand-rolled or ported from minix, bsd, gnu, it used a shell script
for "echo"... Seriously, get a feel for what 1991 and 1992 were like from this:

https://landley.net/history/mirror/linux/1991.html
https://landley.net/history/mirror/linux/1992.html

Meanwhile FreeBSD was an integrated whole with all of userspace checked into the
same CVS repository as the kernel, maintained by the same team, built as a
single seamless unit. This was a DOWNSIDE. A project like busybox or uclibc
trying to emerge under BSD was like trying to install netscape on Windows back
when Internet Explorer was part of the base OS. Linux was a whitebox PC
assembled out of parts, BSD was like a Macintosh: all Official Components were
expected to always be there on every system, and replacing anything with strange
third party aftermarket knockoffs was not supported. (I mean, technically you
could, but WHY? It was weird.)

