[Toybox] Would someone please explain what bash is doing here?

James McMechan james.w.mcmechan at gmail.com
Wed Mar 11 19:41:01 PDT 2020


On Sun, Mar 8, 2020, 12:29 Rob Landley <rob at landley.net> wrote:

> On 3/8/20 11:44 AM, Chet Ramey wrote:
> > On 3/8/20 10:53 AM, Rob Landley wrote:
> >>
> >> I read through the posix shell bits long enough ago it was probably
> >> SUSv3 rather than v4, but at the moment I'm taking bash as my standard
> >> and just doing whatever that does.
> >
> > Well, I appreciate that, but there just might be one or two places (or,
> > depending on who you talk to, one or two hundred) where bash diverges
> > from the standard. That might be because of bugs, or backwards
> > compatibility, or the standard having made a dumb decision.
>
> Sure. There are a couple places where "bash does a thing" and what I
> decide to do is different, most recent one I hit was:
>
>   $ for i
>   > in one two three
>   > do echo $i;
>   > done
>   one
>   two
>   three
>   $ for i; in one two three; do echo $i; done
>   bash: syntax error near unexpected token `in'
>
> But in general, if the bash userbase hasn't noticed/minded a posix
> discrepancy (or outright bug) over the past 20 years, I'm not sure why I
> should care?
>
> > And you can sometimes get into trouble for following the standard *too*
> > closely;
>
> Linux has "echo -e", posix does not. Guess which one toybox implements?
>
> Meanwhile, all toybox commands support "--", including toybox echo,
> regardless of what the host debian one does consistency won out there.
> Although I compromised by doing the xargs-style behavior where option
> parsing ends with the first non-option argument so "echo -- hello" prints
> hello but "echo hello --" prints "hello --". And yes when Elliot found out
> you can do "ls hello -l" and the whole can of worms about "rm *" expanding
> to -r and such, he suggested all toybox commands should do that, but I
> stayed with the more-standard behavior of letting you "ls subdir -l". That
> discussions on the toybox list somewhere. We have threads about this sort
> of corner case all the time, ala:
>
>
> http://lists.landley.net/pipermail/toybox-landley.net/2017-March/008888.html


Well, some of that is down to the glob() function, or maybe wordexp(). My
thought was that when glob() matches a file named "-rf" it should expand it
to "./-rf", to stop people from being "too clever by half". Likewise a
plain ".*" should not expand to either "." or "..". I don't recall a glob
implementation in toybox, but it is usually part of the C library or the
shell.

A few years back David Wheeler proposed limiting the characters allowed in
filenames (https://lwn.net/Articles/686789/) in an attempt to fix the issue
by whitelisting valid first, middle, and ending characters. I thought that
fixing glob() so it is less surprising was a better answer.
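
As a rough illustration of the glob() behaviour proposed above, here is a
minimal POSIX shell sketch. The function name safe_names is hypothetical; it
is not part of toybox, bash, or any C library, and it only post-processes
names that the shell has already expanded rather than changing glob() itself.

```shell
# Hypothetical helper: filter already-expanded names the way the proposed
# glob() change would. The body runs in a subshell so "name" stays local.
safe_names() (
  for name in "$@"; do
    case "$name" in
      .|..) ;;                           # never hand back "." or ".."
      -*)   printf '%s\n' "./$name" ;;   # "-rf" becomes "./-rf"
      *)    printf '%s\n' "$name" ;;     # everything else is unchanged
    esac
  done
)
```

Calling "safe_names * .*" in a directory would list the entries one per
line with "." and ".." dropped and dash-leading names prefixed. Because it
prints one name per line, names containing newlines would still confuse a
caller, which is one reason the fix arguably belongs inside glob().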

I have had stupid characters in file names, and if you have not used the
"./-" construct they are hard to get rid of.

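For anyone who has not met the "./-" construct, the following sketch shows
it next to the "--" alternative. The function name demo_dash_cleanup is made
up for illustration, and it assumes a mktemp that supports -d, which is
common but not required by POSIX.

```shell
# Illustrative only: create two awkwardly named files in a scratch
# directory, remove them, and list what is left. Runs in a subshell so the
# cd and any early exit do not affect the caller.
demo_dash_cleanup() (
  dir=$(mktemp -d) || exit 1
  cd "$dir" || exit 1
  touch ./-rf ./--help plain.txt   # "./" keeps touch from seeing options
  rm ./-rf                         # a path, so rm cannot parse it as flags
  rm -- --help                     # alternative: "--" ends option parsing
  ls -A                            # lists the one remaining file
  cd / && rm -r "$dir"             # clean up the scratch directory
)
```

Both forms remove the file; the "./" form also works with commands that do
not understand "--".
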
Tab completion of arguments should use the same or similar logic: having
"rm -<Tab>" -> "rm ./-" when there is a file "./-" would help.
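
The completion idea can be sketched the same way. This hypothetical function
(dash_candidates is an invented name) only computes the candidates a
completer might offer for a word typed so far; wiring it into a real shell's
completion machinery is left out.

```shell
# Hypothetical sketch: given the word typed so far (for example "-"), print
# matching files in the current directory in "./name" form, one per line.
dash_candidates() (
  prefix=$1
  for f in ./"$prefix"*; do
    # An unmatched pattern stays literal, so check that the file exists.
    if [ -e "$f" ]; then
      printf '%s\n' "$f"
    fi
  done
)
```

With files "-x" and "-y" present, "dash_candidates -" prints "./-x" and
"./-y", so the completed command line can never be read as options.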


>
> Ah, here was that specific thread:
>
>
> http://lists.landley.net/pipermail/toybox-landley.net/2018-October/009796.html
>
> As you can see posix _is_ referenced, but it's not the last word.
>
> > cf. the issues with bash-5.0 treating an unquoted backslash as
> > subject to being removed by pathname expansion. The heated, lengthy
> > discussion that ensued eventually concluded that the plain text of the
> > standard -- which all agreed was what bash-5.0 implemented -- did not
> > reflect shell implementations or the original intent of the standard
> > developers, and that bash-4.4 implements the right way to do it.
>
> I haven't implemented pathname expansion yet. IFS corner cases took a
> couple weeks longer than I expected, and I'm still slogging my way through
> the 8 gazillion ${stuff} variants.
>
> > That was not the first occurrence of that phenomenon.
>
> I'm still subscribed to the posix list, I just don't read it as closely as
> I used to and basically never reply.
>
> >> I should do another pass reading posix afterwards, but after
> >> https://landley.net/notes-2016.html#11-03-2016 I've been much less
> >> interested in interacting with the posix committee due to the risk of
> >> another Schilling, and have pretty much backed up to
> >> https://pubs.opengroup.org/onlinepubs/9699919799.2008edition/ in much
> >> the same way Debian backed up to LSB 4.1 ala
> >> https://lwn.net/Articles/658809/
> >
> > I gently recommend that you use the 2018 version of the standard; the
> > group did a lot of good work in those intervening years. That's the
> > version I shoot for.
>
> I used SUSv2. I upgraded to SUSv3. I upgraded to SUSv4. I'd happily
> evaluate SUSv5, but there isn't one because posix stopped having releases.
> Instead they randomly replace the existing data at the same URL, so if I
> point people at that website I have no idea what'll be there when they
> look.
>
> Why toybox does NOT do that kind of "continuous integration" without
> releases is one of the few toybox FAQ entries I actually got written up
> and posted to the website:
>
>   https://landley.net/toybox/faq.html#releases
>
> (I was going to say "finished" there, but another argument in favor of
> releases I didn't mention there is the "heartbeat" role. Just
> re-certifying that what we have is still current and still maintained is
> valuable, even if the changes are just a couple typos. Bumping the release
> schedule down to something less frequent makes sense for a less-active
> project, but SUSv4 came out a full 10 years ago and what you have up is
> still "issue 7" at the same URL, despite having replaced it at least
> twice.)
>
> Previous posix releases had different URLs. If the posix list decided
> https://pubs.opengroup.org/onlinepubs/9699919799/utilities/c99.html is too
> old and instead of going back to the "cc" everybody actually uses, instead
> renumbered it to standard du jour (c18 apparently), nobody would even know
> c99 had _been_ in SUSv4 unless they knew the magic stable URL. (Thank you
> for having one, by the way.) And no, that's not an academic concern: in
> the case of "tar" and "cpio", being able to pull up the old standard you
> dropped as a frame of reference for things was nice, and both commands
> cite the relevant old posix spec in the comments at the top. (Nobody uses
> "pax", and cpio -H newc is the basis for RPM and initramfs. We've been
> discussing teaching it about xattrs on the kernel list on and off for
> years now.)
>
> Let me know if posix ever cuts a new release.
>
> > I understand about Jorg. I'd like to be able to tell you to just ignore
> > him and listen to other voices, but I get that it's emotionally taxing
> > and his voice is loud enough to drown out others.
>
> Most projects have... certain individuals. There's talks on that too
> (https://www.youtube.com/watch?v=Q52kFL8zVoM) and I'm told Linus Torvalds
> himself spent a couple months in therapy last year.
>
> But in this case, in a public thread, nobody else spoke up with a different
> view. His voice was the ONLY voice. So I stopped listening.
>
> >> I still _sort_ of care about newer posix, but I got {bracket,expansion}
> >> working last year
> >
> > The group has discussed brace expansion. It's more or less a valid
> > extension not described by the standard.
>
> The failure mode of posix is the absence of stuff, to the point you can't
> boot a posix-only system (no init, no mount, I always assumed microsoft
> and IBM's need to pass FIPS 151-2 back in the day led to signing large
> checks to open holes big enough to drive NT and OS/360 through).
>
> I view it as a frame of reference to diverge from, and that's fine. It's
> still more useful than LSB. (Possibly less so than man7.org. Yes he has
> releases, they're at
> https://mirrors.edge.kernel.org/pub/linux/docs/man-pages/Archive/)
>
> >> and last month taught my $IFS splitting to understand utf8 characters
> >
> > ? I don't think there's anything in POSIX that restricts IFS to
> > single-byte characters, since everywhere it refers to a "character"
> > it's supposed to be understood that a character can consist of multiple
> > bytes. The standard defines the term that way.
>
> The bash man page defines "IFS whitespace" as different from unicode
> whitespace. (Space, tab, and newline only. Mine will in theory take the
> non-blank oggham whitespace, although I haven't added that to
> tests/sh.test yet. :)
>
> No idea what posix says about it, the last time I read the whole posix
> shell section end to end was... my blog says 2007. (I've triaged the
> command line utilities at length a lot more recently, for
> http://landley.net/toybox/roadmap.html#susv4 and
> https://landley.net/toybox/status.html . Including checking the 2013
> version to see if anything interesting seemed to have changed, in the
> before-Jorg times.)
>
> I am scrutinizing All The Behavioral Corner Cases in the world, but then I
> always do when I write a new command, just like
> https://landley.net/notes-2012.html#15-05-2012 and
> https://landley.net/notes-2012.html#13-04-2012 from forever ago.
>
> Posix is in there, but what the linux command line in my host distro does
> is at least as important. Half the time I look at posix it's because I'm
> trying to figure out what I might be able to get away with _excluding_.
> When I _first_ started thinking about doing a proper shell (back when I
> was maintaining busybox and it had 4 shells), I started by printing the
> bash man page into a three ring binder:
>
>   https://landley.net/notes-2006.html#24-08-2006
>
> >> (and have a TODO item that if IFS is an array it should understand
> >> strings), and I honestly don't expect to live long enough for either
> >> NOT to be a divergence from Posix.
> >
> > I don't see POSIX ever standardizing arrays, and no conforming
> > application will ever expect IFS to be an array, so as long as you DTRT
> > when IFS is a string variable, you should be free to do whatever you
> > like.
>
> I often try to document "deviations from posix" at the top of each
> command. I have a todo item as part of the eventual 1.0 release cleanup to
> go through and update all those sections as part of updating the test
> suite for Full Coverage. But that's a year's worth of work all by itself,
> and I don't get to work on toybox full time...
>
> Rob