[Toybox] Bug at rm toy.

Sun Sep 14 23:24:14 PDT 2014

On 09/14/14 06:49, luckboy at vp.pl wrote:
> W dniu 14.09.2014 o 05:04, Rob Landley pisze:
>> On 09/11/14 13:55, luckboy at vp.pl wrote:
>>> I found some bug at the rm toy. The rm toy can't remove link to
>>> non-existent file with the -f option. Patch fixes this bug:
>>> https://github.com/luckboy/toyroot/blob/master/patch/toybox-0.4.9-fixed-rm-bug.patch
>>>
>> Thanks. I checked in a fix, does this work for you?
>>
>> http://landley.net/hg/toybox/rev/b748127e092e
>>
>> Rob
>> _______________________________________________
>> Toybox mailing list
>> Toybox at lists.landley.net
>> http://lists.landley.net/listinfo.cgi/toybox-landley.net
>>
> 
> I didn't check in this fix but I think that replacement of access() is
> bad idea.
> The unlink function can remove directory in some systems for superuser.

unlink() shouldn't remove directories, rmdir() should remove
directories. See the EISDIR error in:

http://man7.org/linux/man-pages/man2/unlink.2.html

Posix allows unlink to remove directories in implementation-defined
behavior:

http://pubs.opengroup.org/onlinepubs/9699919799/functions/unlink.html

But Linux doesn't do it. The man page in my nebook's ubuntu install says
this behavior was introduced during the Linux 2.1 development cycle, so
it's been like that for 16 years:

  EISDIR pathname refers to a directory. (This is the non-POSIX  value
         returned by Linux since 2.1.132.)

(2.1.132 was december 1998.)

I confirmed this to still be the case even for root:

$ sudo rm -f ./sub2
rm: cannot remove `./sub2': Is a directory

Note that it would be silly for unlink() to remove directories _and_ for
rmdir() to be specified as a separate system call. And yet posix
specifies rmdir(), and carefully requires its use when dealing with
directories in things like rm. So this is a case of them carefully
stepping around obsolete historical implementations in the spec by
allowing this old behavior to be technically compliant.

Can you point to an existing real-world system where unlink removes
directories?

> It will be better if these check whether the file isn't symbolic link.
> If this file is symbolic
> link, these remove this symbolic link (no reference), otherwise these
> invoke the dirtree_read function.

I try to avoid unnecessary special cases. For example: you can --bind
mount a file. It's possible the file it's bind mounted to lived in a
filesystem on a block device that got hot-unplugged and is thus an empty
zombie mount. There's a long thread on linux-kernel earlier this year
about automatically unmounting things on unlink in certain cases. How
does this affect your test case? It's possible unlink() would do the
right thing (automatically removing a stale --bind mount that access()
would consider "does not exist") where a special case for symlink
wouldn't. Is this the only case where that might happen? I have idea.
I'm trying for generic behavior that will do the right thing (or at
least behave in a simple, understandable way) regardless of how the
system changes in the future.

> On the other hand, why did you check use the access function at the
> rm_main function?

Because "rm -f does_not_exist" should not be an error, and the dirtree
code will produce an internal warning message if you tell it to recurse
on something that doesn't exist, so we can't fix that in the callback
because the error is produced before we get to the callback. Thus we
have to avoid it up front.

> Is it related to race at the rm toy? If you must use access(), you can
> use my patch or check whether the file is symbolic link after access().

The race was that if the file went away between the access() check and
the call to dirtree, then dirtree would issue a warning about the file
not having been there (and set the error return code so rm would exit
with an error) even though rm -f doesnotexist should not be an error.

Using unlink actually narrows that race: it can't happen for normal
files anymore. (It can still happen for directories.) That was one
argument in favor for using unlink. Another was that it doesn't increase
the code size.

The code isn't perfect, but it's the best I could come up with based on
my understanding of how Linux works. If you can point me at a system
where this is wrong...

Note: the failure mode you're talking about is that rm -f emptydir would
remove the directory without the -r option. (It would presumably only
remove _empty_ directories, or we're back into non-posix compliance.)
While this is technically noncompliant with rm spec 2(a), it doesn't
strike me as a huge failure mode either...

Rob

 1410762254.0