[Toybox] readdir() and deleted files
Rob Landley
rob at landley.net
Fri Nov 8 14:44:19 PST 2024
On 11/8/24 07:45, enh wrote:
> certainly my knee-jerk reaction to the bug was "duh, don't modify the
> directory you're traversing!" but allegedly they can't reproduce with
> glibc/gnu find, which seemed weird.
Alas, I don't get to say that, and out in the wild modifying traversed
directories happens ALL THE TIME.
In _really_ bad cases it's a security exploit, because the naive way to
handle infinite depth rm -rf (required by posix, and by reality with
"while mkdir a; do mv b a; mv a b; done" doing an infinite depth inode
forkbomb without ever TRAVERSING more than one level) is to cd subdir
and then cd .. on the way out counting your depth, and if root's doing
that to a world writeable directory which somebody else does a "mv" of a
directory partway down but above where it's currently travrsing, and
they cd .. out into somewhere unexpected and keep going...)
I haven't implemented it yet (dirtree is still keeping a filehandle open
at each level so you're generally limited to 1000 directories deep
before openat() fails), but my PLAN for handling infinite recursion
depth is to openat("..") and compare the dev/ino saved in the parent
dirtree node's stat field, and if they don't match either abort or drill
down from the top again as far as we can back towards where we were, and
continue from there.
Both are nontrivial, in part because I haven't got a way of signalling
preference (WHEN should it abort vs continue?), and resync could screw
up callbacks that don't get COMEAGAIN and similar they were expecting,
and because if it's something like "find" traversing then I want to
avoid duplicates when I re-traverse the parent directory, so I have to
scan and record all entries in each parent directory before descending
into any of its subdirectories (that way I avoid doing readdir() again
at all, just stat and confirm it's the same dev/ino), which can get
memory expensive...
And then, of course, I've long wanted to do "works like an archiver"
versions of mke2fs/mkfatfs/mkisofs/mksquashfs where I can go something
like "mkfatfs subdir | gzip" which means I can neither mmap nor seek on
the output but have to produce it all in sequence, which requires
reading ahead all the data before writing any of it. (Which dirtree
should now have all the plumbing for with the breadth first stuff.) That
sort of thing is inherently two passes: collecting metadata in pass 1,
and reading/writing data in pass 2, and there's no guarantee nothing's
changed between them so... cope? Pad with zeroes I guess? (And you have
to scan file contents for sparseness in the "metadata" pass because that
can affect later size offsets...)
> i'm happy to keep calling this user error, though, and your example
> certainly convinces me not to touch libc!
That's why I was checking BOTH dev/ino for in-band failure detection on
the stat structure. Which I _think_ is reliable. (And if there's a
system out there where it's not true: A) at least it's way more of a
lottery ticket, B) pilot error on their part, it works for Linux.)
Rob
More information about the Toybox
mailing list