[Toybox] [PATCH 1/1] add pathchk

Rob Landley rob at landley.net
Sat May 27 12:24:19 PDT 2017


On 05/26/2017 11:21 AM, Ilya Kuzmich wrote:
> Signed-off-by: Ilya Kuzmich <ilya.kuzmich at gmail.com>
> ---
>  tests/pathchk.test   | 85 ++++++++++++++++++++++++++++++++++++++++++++++
>  toys/posix/pathchk.c | 95 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 180 insertions(+)
>  create mode 100644 tests/pathchk.test
>  create mode 100644 toys/posix/pathchk.c

Do you actually have a use case for this, or are you implementing it
because nobody has yet and it's in posix?

This seems like a competent implementation of this command, and I see it
is in the roadmap, but I'd like to know _why_ you want it. I've mentally
been lumping it in with the posix "sum" and "compress" commands as a
thing that's still in posix yet clearly obsolete, which there _might_
still be users for, but I'd wait for one to show up before implementing
it. It's not the outright veto things like sccs and qselect get for
being zombies from the 1970's, but I've run an awful lot of scripts and
package builds that never used this command.

Busybox never implemented this command. (Yes ubuntu installs it, but
they install "sum" too. "compress" and "pax" aren't there for historical
reasons: patents on the first, and a Linux vs Solaris identity thing for
the second.)

UTF8 is a big deal these days. Filenames starting with - are why all
toybox commands support -- and aren't really stranger than filenames
with spaces in them. The value of _POSIX_PATH_MAX is 256, a hilariously
low value left over from the PDP-11. (The PATH_MAX in linux/limits.h is
4096, but that's a legacy value; glibc doesn't define one at all anymore.
You can always create a deeper path with "cd longpath; mv ~/dirname ."
behind the OS's back (unless the OS wants to traverse all of ~/dirname's
children every time it does a mv) and even posix says rm -rf has to just
cope with it.)
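
To make the gap between the POSIX minimum and what a running system
advertises concrete, here is a minimal sketch in C. The helper names
advertised_path_max() and print_limits() are hypothetical (they are not
from the patch); pathconf(), _PC_PATH_MAX, _PC_NAME_MAX and
_POSIX_PATH_MAX are the standard POSIX interfaces. Treat the returned
numbers as advisory, since deeper paths can still exist on disk:

```c
#define _POSIX_C_SOURCE 200809L  // expose the POSIX names under strict -std=
#include <limits.h>
#include <stdio.h>
#include <unistd.h>

// Hypothetical helper: ask what the OS advertises as the longest relative
// path under dir. pathconf() returns -1 on error or when there is no
// fixed limit.
static long advertised_path_max(const char *dir)
{
  return pathconf(dir, _PC_PATH_MAX);
}

// Hypothetical helper: print the POSIX guaranteed minimum next to the
// values the filesystem holding dir reports at runtime.
static void print_limits(const char *dir)
{
  printf("_POSIX_PATH_MAX (guaranteed minimum): %d\n", _POSIX_PATH_MAX);
  printf("pathconf(\"%s\", _PC_PATH_MAX): %ld\n", dir,
         advertised_path_max(dir));
  printf("pathconf(\"%s\", _PC_NAME_MAX): %ld\n", dir,
         pathconf(dir, _PC_NAME_MAX));
}
```

Calling print_limits("/") from a main() shows the values for the root
filesystem; the runtime results vary by filesystem and libc.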

On Linux, the only bytes the VFS rejects in a filename are NUL and
forward slash; anything else is explicitly allowed, and yes that
includes invalid utf8 sequences:

  http://yarchive.net/comp/linux/utf8.html

If you're trying to figure out what a specific filesystem supports in
the directory du jour, some of them are case insensitive so you get
magic aliasing and what's valid depends on what ELSE is in the
directory. (You can't mkdir "blah" if "Blah" exists and belongs to
another user with mode 700.) Some of the older filesystems have
mount-time selectable locale encodings they translate behind your back.

In terms of having the OS do this work for you, "readlink -m" and
"readlink -f" exist, both of which have -q and the ability to return
error without writing anything to the filesystem. That won't catch the
last path component being longer than 255 chars (the Linux VFS limit on
path components) but I could add a check for it there.
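
A check like that could look roughly like the following sketch in C.
The name components_fit() is hypothetical and this is not the code from
the patch or from toybox's readlink; it only assumes the 255 byte
per-component limit mentioned above, passed in as a parameter:

```c
#include <stddef.h>
#include <string.h>

// Hypothetical helper: return 0 if every slash-separated component of
// path is at most limit bytes long, or -1 if any component is longer.
// Empty components (leading, trailing, or repeated slashes) are fine.
static int components_fit(const char *path, size_t limit)
{
  while (*path) {
    size_t len = strcspn(path, "/");  // bytes up to the next slash or NUL

    if (len > limit) return -1;
    path += len;
    path += strspn(path, "/");        // skip any run of slashes
  }

  return 0;
}
```

For example, components_fit("/usr//bin/", 255) returns 0, and any path
with a 256 byte component returns -1. Lengths are counted in bytes, not
characters, so multibyte utf8 names hit the limit sooner.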

I moved hostid to examples because it's _not_ in posix, but this one
is, so _if_ it goes in, toys/posix is the place for it. But if I do
merge it I'm strongly tempted to make it "default n"...

Anyway, what's your motivation for submitting this command?

Rob


