[Toybox] readdir() and deleted files

Rob Landley rob at landley.net
Fri Nov 8 03:02:23 PST 2024


On 11/7/24 11:47, enh wrote:
> looking at
> https://lore.kernel.org/lkml/20241107005831.15434-1-elsk@google.com/T/#u i
> note that there _is_ a difference between glibc readdir() and bionic/musl
> readdir() --- glibc explicitly doesn't report any entry where d_ino is 0.

Is 0 a valid inode number in some filesystem?

They were talking about find, and toybox find does DIRTREE_STATLESS and 
then does its own stat discarding ENOENT (line 224 of find.c) to avoid 
the race hole, but possibly dirtree itself should be handling that 
internally?

> i couldn't find any documentation on this, so i don't know whether this is
> some undocumented thing that linux expects libc to do or what.

If glibc is doing it, 80% chance it's 
https://www.psychologytoday.com/us/blog/thinking-makes-it-so/201402/the-pot-roast-principle 
at best, but let's see... Nope, looks like it got reverted in 2022:

https://sourceware.org/git/?p=glibc.git;a=commit;h=766b73768b29

Because 0 is a valid inode in xfs. Go figure.

> but i
> thought you might be interested, since two of your five libcs aren't doing
> it, and you may or may not want to do it in your wrapper?

Right now find.c is manually compensating for stat failure due to 
deletion while traversing, because it came up. And ls.c is also doing 
it, and there's a use in sh.c in pending. Meanwhile ps is instead using 
DIRTREE_SHUTUP.

> (i'm _assuming_
> the bsd/macOS kernels just do this themselves rather than leaving it as
> userspace's problem, since they tend to lean that way. though i'm not
> really sure what exact state "returned from getdents64(2) but with d_ino 0"
> actually _means_, so it's possible there's a legit use to reporting this to
> userspace?)

I don't entirely understand what the objection is, but I haven't exactly 
had the calmest week of regular sleep schedule...

If you're suggesting that dirtree should handle both the SHUTUP and 
STATLESS cases internally/automatically... quite possibly? The question 
would be why it isn't already.

In the case of STATLESS I think it was because "readdir can see it but 
stat can't access it" is some permission flag combination we have to 
deal with and probably have a test for in the testsuite. (A directory 
that's +r but -x or vice versa.) Both commands need to find and display 
things it can't tell you anything _about_ other than that it exists (and 
the type info returned by readdir itself about whether it's a directory 
or symlink or whatnot).

In the case of SHUTUP, that tells dirtree_add_node() not to report any 
errno to stderr. It's already doing a fallback where it tries a second 
stat with AT_SYMLINK_NOFOLLOW if it got ENOENT the first time, and in 
fact STATLESS doesn't tell it not to TRY to stat, it just says return 
the node with memset(0) stat even if you couldn't stat it (with the type 
info from readdir patched in to the appropriate field). If you JUST give 
it SHUTUP then it discards the node when it couldn't stat it (returning 
NULL) and dirtree_recurse() iterates right past that (the continue; on 
lib/dirtree.c line 184).

So it looks to me like the currently behavior's probably right-ish?

Rob


More information about the Toybox mailing list