[Toybox] find(1) -name vs -wholename

enh enh at google.com
Tue Mar 5 08:25:36 PST 2024


On Mon, Mar 4, 2024 at 6:09 PM Rob Landley <rob at landley.net> wrote:
>
> On 3/4/24 18:03, enh wrote:
> > On Mon, Mar 4, 2024 at 3:31 PM Rob Landley <rob at landley.net> wrote:
> >>
> >> On 3/4/24 12:19, enh via Toybox wrote:
> >> > obviously the patch is trivial, but i can't think of an existing
> >> > toybox tool that has one of these "you're holding it wrong" errors,
> >> > but this is one that i do find useful:
> >>
> >> I thought there was one in tar but couldn't find it. Gzip has "need -f to read TTY".
> >>
> >> I'm not conceptually against "this CAN'T work" errors. (Except this isn't an
> >> error, it prints to stderr and then exits with 0. Seems a bit indecisive...)
> >
> > /facepalm
> >
> > (probably no-one noticed yet because it's most likely to be hit
> > interactively. still seems like a bug though!)
>
> Well, a warning isn't exactly an error...?
>
> It's design-level indecision. Is this a problem or not?
>
> >> > where i'm left wondering why
> >> > it can't just do the right thing... since `/` is illegal in a POSIX
> >> > name, what other interpretation could there be? but, still, better
> >> > than nothing.)
> >>
> >> I'd be happy to do the right thing instead? Fairly minor code change either way.
>
> Thinking about it more, the "right thing" might be for -name to match the
> trailing whole entries, so if you "find toybox -name pending/git.c" it could
> come up with "toybox/toys/pending/git.c".

yeah, that's what i always assume it does until it doesn't work. (i've
never really seen the use for -path given its various limitations, and
although there's value to -wholename, that's a terrible name for "the
regex version of -name, but on the whole path".)
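
For concreteness, here is a minimal sketch of the "match the trailing whole entries" idea quoted above. It is an illustration only, not toybox code: the helper name `name_matches_tail` is hypothetical, and it assumes each pattern component is matched against exactly one path component, so `*` never crosses a `/`. Python's `fnmatchcase` stands in for fnmatch(3).

```python
from fnmatch import fnmatchcase


def name_matches_tail(path, pattern):
    """Hypothetical -name semantics: compare the pattern's components
    against the same number of trailing components of the path."""
    want = pattern.split("/")
    have = path.split("/")
    if len(want) > len(have):
        return False  # pattern has more components than the path does
    tail = have[-len(want):]
    # Each pattern component must match the corresponding path component.
    return all(fnmatchcase(h, w) for h, w in zip(tail, want))


if __name__ == "__main__":
    print(name_matches_tail("toybox/toys/pending/git.c", "pending/git.c"))
    print(name_matches_tail("./android/toybox/toys/pending/git.c", "toys/*/git.c"))
    print(name_matches_tail("usr/include/foobits/syscall.h", "bits/syscall.h"))
```

Under these assumptions a pattern without a slash behaves like today's -name, and `bits/syscall.h` only matches on a component boundary, so a directory named `foobits` would not match.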

> Or in my case, my ~/toybox work directory has... 21 toybox repo directories
> under it (basically instead of branches+stash), so "find . -name toys/*/git.c"
> could find all the instances of that file using a more shell-like expansion
> syntax, even if there are subdirectories in the way (such as, real example,
> ~/toybox/android/toybox/toys/pending/git.c).
>
> This is getting us away from "minor code change", though.

yeah, that's why i was wondering whether we should just do the warning :-(

> But I think I already
> have code for this in... tar.c maybe? All that --anchored stuff calling
> do_filter() calling fnmatch. Not that hard to do it again with a model at hand. :)
>
> > yeah, i'm not sure why coreutils doesn't do that --- perhaps to avoid
> > the question of whether `-name bits/syscall.h` means `-wholename
> > .*bits/syscall.h` or `-wholename .*/bits/syscall.h`?
> >
> > (`-path` with a trailing `/` is a similarly unhelpful sharp corner.)
>
> My brain is FRIED by packing to move, and I would need test cases with proposed
> output to follow the distinctions you're making there.
>
> I'm also a little confused about what "-wholename" is for given that the shell
> already does path expansion? I guess it's so you don't get your wildcards back
> as a result when there's no match? Hmmm... ah, I see, * can eat slashes here,
> and the shell won't do that.
>
> And... what does -path do again? In the toybox directory, none of these produce
> a result with the host find or toybox find:
>
>   find . -path pending
>   find . -path toys/pending
>   find . -path toys/pending/ip.c
>
> It's been some years since I implemented this stuff. What do the tests do...
>
>   $ mkdir dir
>   $ touch dir/file
>   $ find . -wholename 'dir*e'
>   find: ‘./dirtest/subdir’: Permission denied
>   $ find dir -wholename 'dir*e'
>   dir/file
>
> Ah, the ./ at the start is preventing both -wholename and -path from matching,
> because of course.
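
The behavior in the transcript above can be modeled roughly as follows. This is a sketch under assumptions, not the actual find implementation: it treats -path/-wholename as one glob match against the whole path string as find would print it (start directory prefix included), with `*` allowed to match `/`. The helper name `path_test` is hypothetical, and Python's `fnmatchcase` stands in for fnmatch(3) called without FNM_PATHNAME.

```python
from fnmatch import fnmatchcase


def path_test(printed_path, pattern):
    """Hypothetical model of -path/-wholename: the pattern must match the
    entire printed path, and '*' is free to match '/' characters."""
    return fnmatchcase(printed_path, pattern)


if __name__ == "__main__":
    for printed, pattern in [
        ("./dir/file", "dir*e"),    # "find ." prints a leading "./"
        ("dir/file", "dir*e"),      # "find dir" prints no such prefix
        ("./dir/file", "./dir*e"),  # spelling out the prefix
        ("./toys/pending/ip.c", "toys/pending/ip.c"),
        ("./toys/pending/ip.c", "*/toys/pending/ip.c"),
    ]:
        print(printed, pattern, path_test(printed, pattern))
```

In this model, either spelling out the `./` prefix or starting the pattern with `*` gets a match when searching from `.`.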
>
> >> We could even ping the coreutils guys about that, since they recently agreed to
> >> add -x when I grumped at them. (I'm moving house! It's very stressful!) Speaking
> >> of, I just remembered to ping busybox list about that... Alas, still no cut -DF
> >> in coreutils, last I checked...
> >
> > tbh, since starting to read the coreutils list i'm _less_ convinced
> > that anyone really thinks about anything,
>
> Coreutils is gnu.
>
> > and especially not about interactions between things.
>
> It's very gnu.
>
> Your expectations weren't low enough. There's a reason I started poking at
> busybox back in 2002. The gnu project was announced in 1983, and gnu/hurd
> remains essentially unusable today.
>
> > (i saw the -x,--swap thread but didn't
> > have the energy to point out the -x,--exchange would have been quite a
> > bit less unclear...)
>
> I know, but I only cared about A) the short option, B) having mv finally able to
> call that rename() functionality the linux kernel added ten years ago. (Which
> came up again recently with a patch implementing atomic exchange in the VFAT
> driver.)
>
> Letting the --longopt smell like that particular gatekeeper is fine with me, I
> never voluntarily use them, and it's sort of the opposite of an ablative duck:
>
>   https://bwiggs.com/notebook/queens-duck/
>
> He got to keep --swap. He _wanted_ it to be called --swap. Which meant he wanted
> the feature to go in, because otherwise it couldn't be called --swap.
>
> >> Rob
>
> Rob
>
> P.S. Oddly enough, while Linux beat gnu because gnu sucked, Linux beat BSD using
> the standard disruptive technology playbook Clayton Christensen described in the
> Innovator's Dilemma in 1997.
>
> The problem FreeBSD had back in 1991 was it was big iron tech ported down to PCs
> and didn't fit comfortably in the smaller space. A bit like IBM's OS/2, which
> was full of System Object Model implementations of Common Object Request Broker
> Architecture, in triplicate. Disruptive technologies start at the bottom and
> expand upmarket, not down.
>
> Linux started life on a cheap white box PC a college student bought with
> birthday money and the proceeds from selling his sinclair QL (on which he had
> previously written his own multitasking operating system, so Linux was his
> SECOND GO at it), and Linux was a cheap white box OS also assembled from
> eclectic parts. Back when Alan Cox left the Amiga for PC unix, he picked Linux
> instead of FreeBSD because (at the time) Linux didn't require an expensive FPU
> coprocessor and FreeBSD wouldn't boot without one. Linux never assumed you could
> afford optional components, because Linus couldn't. Linux exploded in 1993 when
> NSF changed the IETF AUP to allow commercial use of the internet because an old
> 386 in a closet could be a web server no matter how crappy the cast-off parts
> you slapped together were, while BSD had minimum system requirements meaning you
> needed a budget to buy new hardware. By the time anybody was willing to spend
> MONEY on this web thing, Linux had been running the company's web page for 18
> months.
>
> Linux was a standalone source package with a collection of random detritus for
> userspace, hand-rolled or ported from minix, bsd, gnu, it used a shell script
> for "echo"... Seriously, get a feel for what 1991 and 1992 were like from this:
>
> https://landley.net/history/mirror/linux/1991.html
> https://landley.net/history/mirror/linux/1992.html
>
> Meanwhile FreeBSD was an integrated whole with all of userspace checked into the
> same CVS repository as the kernel, maintained by the same team, built as a
> single seamless unit. This was a DOWNSIDE. A project like busybox or uclibc
> trying to emerge under BSD was like trying to install netscape on Windows back
> when Internet Explorer was part of the base OS. Linux was a whitebox PC
> assembled out of parts, BSD was like a Macintosh: all Official Components were
> expected to always be there on every system, and replacing anything with strange
> third party aftermarket knockoffs was not supported. (I mean, technically you
> could, but WHY? It was weird.)

