[Toybox] readdir() and deleted files
Rob Landley
rob at landley.net
Fri Nov 8 03:02:23 PST 2024
On 11/7/24 11:47, enh wrote:
> looking at
> https://lore.kernel.org/lkml/20241107005831.15434-1-elsk@google.com/T/#u i
> note that there _is_ a difference between glibc readdir() and bionic/musl
> readdir() --- glibc explicitly doesn't report any entry where d_ino is 0.
Is 0 a valid inode number in some filesystem?
They were talking about find, and toybox find does DIRTREE_STATLESS and
then does its own stat discarding ENOENT (line 224 of find.c) to avoid
the race hole, but possibly dirtree itself should be handling that
internally?
> i couldn't find any documentation on this, so i don't know whether this is
> some undocumented thing that linux expects libc to do or what.
If glibc is doing it, 80% chance it's
https://www.psychologytoday.com/us/blog/thinking-makes-it-so/201402/the-pot-roast-principle
at best, but let's see... Nope, looks like it got reverted in 2022:
https://sourceware.org/git/?p=glibc.git;a=commit;h=766b73768b29
Because 0 is a valid inode in xfs. Go figure.
> but i
> thought you might be interested, since two of your five libcs aren't doing
> it, and you may or may not want to do it in your wrapper?
Right now find.c is manually compensating for stat failure due to
deletion while traversing, because it came up. And ls.c is also doing
it, and there's a use in sh.c in pending. Meanwhile ps is instead using
DIRTREE_SHUTUP.
> (i'm _assuming_
> the bsd/macOS kernels just do this themselves rather than leaving it as
> userspace's problem, since they tend to lean that way. though i'm not
> really sure what exact state "returned from getdents64(2) but with d_ino 0"
> actually _means_, so it's possible there's a legit use to reporting this to
> userspace?)
I don't entirely understand what the objection is, but I haven't exactly
had the calmest week of regular sleep schedule...
If you're suggesting that dirtree should handle both the SHUTUP and
STATLESS cases internally/automatically... quite possibly? The question
would be why it isn't already.
In the case of STATLESS I think it was because "readdir can see it but
stat can't access it" is some permission flag combination we have to
deal with and probably have a test for in the testsuite. (A directory
that's +r but -x or vice versa.) Both commands need to find and display
things it can't tell you anything _about_ other than that it exists (and
the type info returned by readdir itself about whether it's a directory
or symlink or whatnot).
In the case of SHUTUP, that tells dirtree_add_node() not to report any
errno to stderr. It's already doing a fallback where it tries a second
stat with AT_SYMLINK_NOFOLLOW if it got ENOENT the first time, and in
fact STATLESS doesn't tell it not to TRY to stat, it just says return
the node with memset(0) stat even if you couldn't stat it (with the type
info from readdir patched in to the appropriate field). If you JUST give
it SHUTUP then it discards the node when it couldn't stat it (returning
NULL) and dirtree_recurse() iterates right past that (the continue; on
lib/dirtree.c line 184).
So it looks to me like the currently behavior's probably right-ish?
Rob
More information about the Toybox
mailing list