[Toybox] Would someone please explain what bash is doing here?

Chet Ramey chet.ramey at case.edu
Sat Mar 7 13:38:41 PST 2020


On 3/6/20 9:05 PM, Rob Landley wrote:

>>> you could try chet ramey or the bash mailing list... he joins in a lot
>>> of the shell discussions on the POSIX mailing list.
>>
>> I'm reluctant to trigger a change in bash's behavior that it's been doing
>> consistently since at least 2002, but I am curious WHY it's doing this.
>>
>> It's start _or_ end of the list that the empty argument drops out, by the way:
> 
> Nevermind, I think I've figured most of it out. The leading/trailing part is
> because:
> 
> bash -c 'IFS=xy; for i in axyxb$@$@cyd; do IFS=z; echo =$i=; done' \
>   one "" abc dxf ghi
> 
> The echo expansion is ALSO getting split.
> 
> I'm currently trying to figure out why:
> 
> $ bash -c 'IFS=xy; for i in axyxb$@; do echo =$i=; done' one f "" abc
> =a   bf=
> ==
> =abc=
> 
> is producing THREE spaces, but mine is doing:
> 
> =a  bf=
> ==
> =abc=
> 
> It turns into echo "=a" "" "bf=" and when I do that from the command line I get
> a space before and a space after the NULL argument? But with bash there are
> three? Is it turning into two NULL entries? Ah, it SHOULD do that because there
> are two consecutive non-whitespace IFS separators that don't bind to an existing
> string (like the first one does). So why _isn't_ mine doing that...

Your analysis is pretty much spot on.

The first thing, as you already noted, is that the argument to `echo' is
also being split on $IFS. If you quote that, you see that what you get out
of the for loop is

=axyxbf=
==
=abc=

That's because the $@ expands to three arguments, concatenated to what
comes before and after, as described in

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_05_02

which can be further split, under certain circumstances. POSIX isn't
exactly precise about which expansions the `word list' following `in'
undergoes ("the list...shall be expanded"), but those don't include word
splitting, so this isn't one of those circumstances.
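
As a minimal sketch, the quoted variant described above can be run
like this. It is the same loop as in the quoted message, except that
the argument to echo is double-quoted so only the for loop's word list
is in play. The `out' variable is a name introduced here purely for
illustration, and the sketch assumes a bash binary is on the PATH.

```shell
# Same command as in the thread, with "=$i=" quoted so that echo's
# argument is not split on IFS a second time.
out=$(bash -c 'IFS=xy; for i in axyxb$@; do echo "=$i="; done' one f "" abc)

# The message reports three lines from bash: =axyxbf=, == and =abc=
printf '%s\n' "$out"
```

The first line shows `axyxb' joined to the first positional parameter
without being split, even though it contains IFS characters.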

There is some difference here between implementations, which is why POSIX
says that "empty fields may be discarded" (ksh93 discards them, but nobody
else does).

However, your example doesn't quote the argument to `echo', so the
expanded value of $i gets split again. The second and third lines aren't
interesting, so I won't go through them. In the first case, i=axyxbf and
so the word to be split is "=axyxbf=". Straightforward so far.

The characters in IFS are field delimiters, so they terminate fields. That
means you end up with "=a" "" "" "bf=": the first `x' terminates "=a", and
the `y' and the second `x' each terminate an empty field. Since `x' and `y'
aren't IFS whitespace, word splitting doesn't swallow runs of multiple IFS
characters, and each occurrence of `x' or `y' delimits a separate field.
This is as described in

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_05
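
The difference between non-whitespace and whitespace IFS characters
can be seen in a small sketch. `split_fields' is a hypothetical helper
written for this illustration, not something from bash or toybox: it
sets IFS to its first argument, splits its second argument, and prints
each resulting field in brackets.

```shell
# Hypothetical helper: split $2 using $1 as IFS, print one field per line.
# The body runs in a subshell, so the caller's IFS and options are untouched.
split_fields() (
    IFS=$1
    set -f          # no pathname expansion on the split result
    set -- $2       # unquoted on purpose: this is where field splitting happens
    printf '[%s]\n' "$@"
)

# Non-whitespace IFS: each x or y delimits a field, so the run "xyx"
# yields two empty fields between "=a" and "bf=".
split_fields xy '=axyxbf='

# IFS whitespace: a run of spaces collapses into a single delimiter.
split_fields ' ' '=a   bf='
```

The first call prints four fields ([=a], [], [], [bf=]); the second
prints only two ([=a] and [bf=]).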

So `echo' gets "=a" "" "" "bf=", and displays each argument, separating
them with spaces. QED.
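
The spacing in the two outputs from the quoted message can be
reproduced directly by passing the arguments to echo by hand. The
variable names `three' and `two' are introduced here only for
illustration.

```shell
# Two empty arguments between "=a" and "bf=": three separating spaces,
# matching the bash output in the thread.
three=$(echo "=a" "" "" "bf=")

# Only one empty argument: two separating spaces, matching the output
# Rob reported from his own implementation.
two=$(echo "=a" "" "bf=")

printf '%s\n' "$three" "$two"
```

So a two-space result suggests that one of the two empty fields was
lost during splitting.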

Chet

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet at case.edu    http://tiswww.cwru.edu/~chet/


