[Toybox] [PATCH] Support diff --no-dereference

enh enh at google.com
Mon Aug 26 14:30:20 PDT 2024


...but ignoring the bikeshed of "should we do something about all the
_other_ readlink callers" --- ping on this specific patch :-)
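(For readers joining the thread: the patch reads both candidate paths' symlink targets and compares them. A standalone sketch of that comparison follows; the names link_target() and links_differ() are illustrative, not from the patch, and a real diff --no-dereference would also have to handle the mixed symlink/non-symlink case rather than just returning 0 for it.)

```c
#include <string.h>
#include <unistd.h>

// Read a symlink's target into buf, NUL-terminating it.
// Returns target length, or -1 if path isn't a readable symlink.
static ssize_t link_target(const char *path, char *buf, size_t size)
{
  ssize_t len = readlink(path, buf, size - 1);

  if (len >= 0) buf[len] = 0;

  return len;
}

// 1 if both paths are symlinks with different targets (so diff should
// report them), 0 otherwise (including "not both symlinks", which a real
// implementation would handle separately, e.g. by falling back to
// content comparison).
static int links_differ(const char *a, const char *b)
{
  char bufa[4096], bufb[4096];
  ssize_t la = link_target(a, bufa, sizeof(bufa));
  ssize_t lb = link_target(b, bufb, sizeof(bufb));

  if (la < 0 || lb < 0) return 0;

  return la != lb || memcmp(bufa, bufb, la) != 0;
}
```

The fixed 4096-byte buffers here have exactly the truncation exposure discussed below, which is why the thread talks about xreadlink() instead.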

On Fri, Aug 23, 2024 at 3:15 PM Rob Landley <rob at landley.net> wrote:
>
>
>
> On 8/23/24 06:33, enh wrote:
> > On Fri, Aug 23, 2024 at 3:32 AM Rob Landley <rob at landley.net> wrote:
> >>
> >> On 8/22/24 08:23, enh via Toybox wrote:
> >> > +  int s = sizeof(toybuf)/2;
> >> > +
> >> > +  TT.is_symlink = 1;
> >> > +  for (i = 0; i < 2; i++) {
> >> > +    TT.link[i].name = toybuf + i * s;
> >> > +    TT.link[i].len = readlink0(files[i], TT.link[i].name, s);
> >> >
> >> > should probably use xreadlink() and free() instead? toybuf is big
> >> > enough to be safe for _one_ path on linux, but not two.
> >>
> >> Eh, no arbitrary buffer is "enough". PATH_MAX is slowly getting phased out, and
> >> even posix-2008 requires arbitrary depth for "rm" because:
> >>
> >> $ for ((i=0;i<10000;i++)); do mv a b; mkdir a; mv b a/a; done
> >
> > i thought in practice though linux couldn't cope with longer paths?
>
> It has to, because you can retroactively create them with "mv" and "mount", and
> if it hasn't loaded in the directory contents to populate the dentry cache of
> all descendants before doing the mv, it doesn't know how deep the rabbit hole goes.
>
> Now _symlinks_ are janky. The kernel has a silly "this path contained X many
> symlinks going down it, and even though this is a finite non-recursive process
> we error out at an arbitrary limit anyway", but that's not related to directory
> depth.
>
> There's a vfs limit of 255 per directory name, but I don't think there's a limit
> on how many directories deep you can go, barring inode exhaustion, or maybe the
> dentry cache running out of memory might behave badly...
>
> > if not, every call to readlink0()/readlinkat0() looks problematic.
>
> They're not perfect, no. It's a bit like the ps.c code that truncates how long a
> command line it'll show: nobody's complained. :)
>
> I'm usually more concerned with properly supporting spaces and newlines in path
> names than length limits, because pathological filesystem depth really doesn't
> come up much. If somebody figures out how to make it an attack vector, there's a
> couple other places that bump up the todo list then. (I don't think I've
> actually IMPLEMENTED the dirtree traversal that avoids filehandle exhaustion
> yet. Talked about it here several times, but haven't DONE it yet...)
>
> > plus there's one lone call to readlink() in main.c that's using libbuf
> > rather than toybuf (although "you put your toybox binary symlinks more
> > than 4096 bytes deep" sounds like "please stop shooting at your feet"
> > to me!).
>
> libbuf and toybuf are the same size. The rule is toybuf is available for
> commands and libbuf is for library code, main.c is library code. (Library code
> modifying toybuf behind a command's back is verboten.)
>
> >> Anyway, a 256 byte buffer would handle 99% of the cases, and this is likely to
> >> handle all the ones we'd ever actually hit. If we want a proper fix what we need
> >> is to teach readlink0() to dynamically allocate a buffer, and THAT has the
> >> problem that the kernel API returns "error" for buffer not big enough without
> >> telling us how big a buffer we NEED, so you have to iteratively guess...
> >
> > that's what xreadlink() already does :-)
>
> Which is _conceptually_ funky because xmalloc() calls xexit() when the
> allocation fails, but readlink() can fail due to file not found and friends, and
> the LOGICAL thing to do is have xreadlink() return NULL when it couldn't
> readlink but fail in the "should never happen and we can't recover if it does"
> malloc() failure case, and a single "x" prefix doesn't distinguish between
> different failure cases...
>
> But complicating the naming convention doesn't make stuff immediately obvious to
> the reader because they won't remember what the complicated convention means,
> so... lump it? I forget what I did, which is the problem in a nutshell.
>
> Rob
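(For reference, the "iteratively guess" approach discussed above, which is what toybox's xreadlink() does, can be sketched as follows. readlink_alloc() is a hypothetical name for illustration, not toybox's actual helper: since readlink() truncates silently instead of reporting the size it needed, the only signal that the buffer may have been too small is a return value that filled it, so you grow and retry.)

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

// Returns a malloc()ed NUL-terminated symlink target, or NULL if
// readlink() failed (ENOENT, EINVAL for non-symlinks, etc).
// Caller free()s the result.
char *readlink_alloc(const char *path)
{
  size_t size = 64;
  ssize_t len;
  char *buf = 0;

  for (;;) {
    char *newbuf = realloc(buf, size);

    if (!newbuf) { free(buf); return 0; }  // or exit, as xmalloc() would
    buf = newbuf;
    len = readlink(path, buf, size);
    if (len < 0) { free(buf); return 0; }  // actual readlink failure
    if ((size_t)len < size) break;         // fit, with room for the NUL
    size *= 2;                             // may have truncated: retry bigger
  }
  buf[len] = 0;

  return buf;
}
```

This is also where the naming tension Rob describes shows up: the malloc-failure path is "can't recover" while the readlink-failure path is an ordinary runtime condition the caller should handle, and a single "x" prefix can't express both.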


More information about the Toybox mailing list