[Toybox] Would someone please explain what bash is doing here?

Chet Ramey chet.ramey at case.edu
Sun May 24 14:26:40 PDT 2020


On 5/23/20 11:19 PM, Rob Landley wrote:

>> Correct. Each element of a pipeline is executed in a subshell.
> 
> Hmmm.
> 
>>> But:
>>>
>>>   $ echo hello | read i; echo $i
>>>
>>> The read isn't saved because it's happening in a subshell context (so it sets an
>>> i that is discarded)?
>>
>> Correct. Since the read is executed in a subshell, it can't affect its
>> parent's environment.
> 
> It's actually easier for me _not_ to do that (because nommu support), but oh well.

You don't have to execute it in a subshell, but bash does. POSIX says
either way is acceptable.
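
A minimal sketch of both behaviors, assuming bash run non-interactively
(so job control is off, which `lastpipe' needs in order to take effect).
The variable names are only for illustration:

```shell
# Requires bash. Default: read runs in a subshell, so i stays unset here.
unset i
echo hello | read i
without_lastpipe=${i-unset}

# With lastpipe set and job control off, the last pipeline element
# runs in the current shell, so the assignment survives.
shopt -s lastpipe
echo hello | read i
with_lastpipe=${i-unset}

echo "without lastpipe: $without_lastpipe"
echo "with lastpipe: $with_lastpipe"
```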


> 
>>> And then:
>>>
>>>   $ while true; do continue | readlink /proc/self; done
>>>   28555
>>>   28557
>>>   28559
>>>   28561
>>>
>>> Is advancing the pid by 2 each time, because the _continue_ is in its own process?
>>
>> Each element of a pipeline is run in a subshell. That's how you can set its
>> process group and get it to respond to job control signals sent to the
>> terminal's process group.
> 
>   $ read i
>   ^Z^Z^Z^Z^Z^Z^Z^Z^Z
> 
> Why are pipelines different?

You have to have a child process to do job control, otherwise there's
nothing to suspend or move between the foreground and background. The
pipeline is the base object of job control: it's all child processes,
each child process is a member of the same process group, and that process
group owns the terminal. (Don't get hung up on terminology here; a simple
command and a compound command run in the background are also pipelines
according to the shell grammar.)

Bash could notice when it's getting a SIGTSTP while running a builtin,
fork, suspend the child, and make it into a job, but that way lies madness.

> 
>> POSIX says you can run any element of a pipeline
>> in the current shell context, but in practice nobody does that for any one
>> but the last, and bash only does it if `lastpipe' is set.
>>
>> It's truly a huge PITA to run the last element of the pipeline in the
>> current shell context when job control is enabled, keeping track of
>> process groups, handling signals like SIGTSTP, and forking at the right
>> time so you can suspend yourself if you need to. I've never been tempted.
>> I don't know how much trouble it was for Korn, but the zsh guys literally
>> fought bugs in that code for years.
> 
> Do they have a regression test suite? I'd love to harvest test cases...
> 
> Yes they do, and it has a README. Hmmm...
> 
>>>   $ while true; do continue | cat; echo hello; done
>>>   hello
>>>   hello
>>>   hello
>>>
>>>   $ while true; do break | cat; echo hello; done
>>>   hello
>>>   hello
>>>   hello
>>>
>>> continue and break are silently NOP in a pipe?
>>
>> What are they supposed to do? They can't affect the parent. All they can
>> do is complain, which would be annoying.
> 
> This does:
> 
>   for i in a b c d e & do echo $i; done
> 
> *shrug* I just expected it to be consistent.

Why is that consistent? Those are totally different scenarios. That `for'
loop is an honest-to-god syntax error, not an issue with parent-child
scoping.
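
The two cases can be told apart with a quick check. This is a sketch that
assumes a `bash' binary on PATH; the variable names are made up for the
example:

```shell
# Requires bash. A '&' in the word list after 'in' is a parse error,
# so the command string is rejected before anything runs.
if bash -n -c 'for i in a b c & do echo $i; done' 2>/dev/null; then
  for_parses=yes
else
  for_parses=no
fi

# 'break' as a pipeline element runs in a subshell, so it cannot end
# the parent's loop and every iteration still prints.
loop_output=$(bash -c 'for i in 1 2 3; do break | cat; echo "$i"; done')

echo "for loop parses: $for_parses"
echo "$loop_output"
```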

> 
>>> Also, just confirming: $$ only shows the PID of the top level bash process, and
>>> there's no variable that shows the PID of (subshells) even though the point of a
>>> subshell is to spawn a new process?
>>
>> There is $BASHPID.

> It's another one of those magic variables that's assigned to by resolving it,
> and then keeps its last value.

It gets its value from getpid(). If the pid changes, $BASHPID changes, but
you do have to reference it.
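
A small sketch of the difference, using a command substitution as the
subshell (variable names are illustrative only):

```shell
# Requires bash. $$ keeps the top-level shell's pid; BASHPID follows
# the process that is actually expanding it.
top_pid=$$
read -r sub_dollar sub_bashpid <<<"$(echo "$$ $BASHPID")"

echo "top-level pid: $top_pid"
echo "in subshell:   \$\$=$sub_dollar BASHPID=$sub_bashpid"
```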


>>> P.S. this is old, but:
>>>
>>>   $ for i in a b c & do echo $i; done
>>>   bash: syntax error near unexpected token `&'
>>>
>>> But break & is fine? What does that even _mean_?
>>
>> Come on. You at least have to implement the difference between a
>> `wordlist', which is a list of shell WORDs, and a command list, which
>> is terminated by the *operator* `&'.
> 
> ...no?

OK. I mean, it's the shell grammar, but you are obviously free to
implement whatever makes sense to you.


> By the way, did I already ask why {var}<file only works on block context and not
> on a command?
> 
>   $ export abc=potato
>   landley at driftwood:~/toybox/toybox$ env {abc}</dev/null | grep abc
>   abc=potato

What does `block context' mean? I mean, the idea behind the bash
implementation of {var} redirection is that you ask the shell to pick
an unused file descriptor for you and return it in the variable. That
variable isn't exported by default. The redirection persists beyond the
command -- bash doesn't require you to use `exec' with it -- because you
have a handle to the file descriptor and can manage it yourself.

It doesn't get exported here, or persist beyond env's exit, because the
redirection is done in the context of the child process after it's already
got the environment.

> 
> I mean, it's CHECKING the file:
> 
>   $ env {abc}</missing | grep abc
>   bash: /missing: No such file or directory
> 
> But it's closing the filehandle without doing anything with it?
> 
>   $ env {abc}>/dev/null
>   # abc=potato in here but it's a long list
>   $ ls /proc/self/fd
>   0  1  2  3

Redirections are performed in the child process environment, after it
gets the environment. The file descriptor assigned to abc (which is
essentially a no-op because env only deals with the exported environment)
disappears when env exits.

Maybe it would make sense to remake the environment after expanding that
redirection, but bash doesn't do that.


>> Unquoted `&' is always an operator, it is never a WORD, and so it can't
>> appear in a list of WORDs, which is what follows `in'.
> 
> Every command is terminated with either end of line or one of:
> 
>       // Flow control characters that end pipeline segments
>       s = end + anystart(end, (char *[]){";;&", ";;", ";&", ";", "||",
>         "|&", "|", "&&", "&", "(", ")", 0});
> 
> The word parsing logic returns either the end of next word within the string, or
> a null * to mean "unterminated quote" (which includes \ at the end of line). The
> loop calling parse_word() in parse_line() figures out how to assemble those
> words into blocks (and can return to its own caller asking for additional
> continuations because of unfinished blocks and here documents and trailing flow
> control characters).

So you're saying that `&' can either be a WORD (in the grammar sense) or an
operator depending on context? If that's the case, it's going to cause you
problems down the line.


> Anyway, from _my_ perspective "continue | thingy" and "if true | then" seem
> equally weird, but I guess not to yacc/bison. One is a syntax error, the other a
> silent NOP.

It makes more sense if you look at it from the perspective that `continue'
is a shell builtin, not a reserved word, and is meaningless to the grammar.
A builtin, regardless of what it has to do when it executes, is ok wherever
a simple command can appear. It doesn't have anything to do with bison.
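
The builtin versus reserved word distinction is visible from the shell
itself. A quick sketch (variable names are illustrative):

```shell
# Requires bash. 'type -t' reports how the shell classifies a name.
kind_continue=$(type -t continue)
kind_for=$(type -t for)

echo "continue: $kind_continue"
echo "for: $kind_for"
```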

Chet

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet at case.edu    http://tiswww.cwru.edu/~chet/


