[Toybox] Would someone please explain what bash is doing here?

Rob Landley rob at landley.net
Sat May 23 20:19:52 PDT 2020


On 5/23/20 5:51 PM, Chet Ramey wrote:
> On 5/23/20 1:11 PM, Rob Landley wrote:
>> Starting to open the job control can of worms, and:
>>
>>   $ while true; do readlink /proc/self | cat - $$; done
>>   24658
>>   cat: 20032: No such file or directory
>>   24660
>>   cat: 20032: No such file or directory
>>   24662
>>
>> Is calling readlink and cat each time through the loop (true is a builtin), so
>> the pid advances by 2 and the pipeline is NOT a subshell. 
> 
> Correct. Each element of a pipeline is executed in a subshell.

Hmmm.

>> But:
>>
>>   $ echo hello | read i; echo $i
>>
>> The read isn't saved because it's happening in a subshell context (so it sets an
>> i that is discarded)?
> 
> Correct. Since the read is executed in a subshell, it can't affect its
> parent's environment.

It's actually easier for me _not_ to do that (because nommu support), but oh well.

>> And then:
>>
>>   $ while true; do continue | readlink /proc/self; done
>>   28555
>>   28557
>>   28559
>>   28561
>>
>> Is advancing the pid by 2 each time, because the _continue_ is in its own process?
> 
> Each element of a pipeline is run in a subshell. That's how you can set its
> process group and get it to respond to job control signals sent to the
> terminal's process group.

  $ read i
  ^Z^Z^Z^Z^Z^Z^Z^Z^Z

Why are pipelines different?

> POSIX says you can run any element of a pipeline
> in the current shell context, but in practice nobody does that for any one
> but the last, and bash only does it if `lastpipe' is set.
> 
> It's truly a huge PITA to run the last element of the pipeline in the
> current shell context when job control is enabled, keeping track of
> process groups, handling signals like SIGTSTP, and forking at the right
> time so you can suspend yourself if you need to. I've never been tempted.
> I don't know how much trouble it was for Korn, but the zsh guys literally
> fought bugs in that code for years.
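
For reference, the `lastpipe' behavior described above can be sketched as follows. This is a minimal illustration, assuming bash 4.2 or later and a non-interactive script (job control off); the variable names default_result and lastpipe_result are made up for the example.

```shell
#!/bin/bash
# Default behavior: read runs in a subshell, so the assignment to i is lost.
unset i
echo hello | read i
default_result="${i:-<unset>}"

# With lastpipe enabled and job control off (set +m, the default in scripts),
# the last element of the pipeline runs in the current shell.
shopt -s lastpipe
set +m
unset i
echo hello | read i
lastpipe_result="${i:-<unset>}"

echo "default:  $default_result"
echo "lastpipe: $lastpipe_result"
```

In an interactive shell with job control enabled the second case behaves like the first, which matches the point about job control above.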

Do they have a regression test suite? I'd love to harvest test cases...

Yes they do, and it has a README. Hmmm...

>>   $ while true; do continue | cat; echo hello; done
>>   hello
>>   hello
>>   hello
>>
>>   $ while true; do break | cat; echo hello; done
>>   hello
>>   hello
>>   hello
>>
>> continue and break are silently NOP in a pipe?
> 
> What are they supposed to do? They can't affect the parent. All they can
> do is complain, which would be annoying.

This does:

  for i in a b c d e & do echo $i; done

*shrug* I just expected it to be consistent.
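
A small sketch of the silent NOP, assuming bash: the break runs in the pipeline's subshell, so the loop in the parent keeps going until its own condition fails. The counter variable is only there to make the effect visible.

```shell
#!/bin/bash
count=0
while [ "$count" -lt 3 ]; do
  count=$((count + 1))
  # break executes in a subshell here, so it cannot leave the parent's loop.
  break | cat
done
echo "iterations: $count"
```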

>> Also, just confirming: $$ only shows the PID of the top level bash process, and
>> there's no variable that shows the PID of (subshells) even though the point of a
>> subshell is to spawn a new process?
> 
> There is $BASHPID.

Huh, I grepped for declare -p output with a pid in range...

  $ declare -p | grep BASHPID
  declare -ir BASHPID

Ah, that would explain why.

  $ echo $BASHPID
  25545
  $ (declare -p | grep BASHPID)
  declare -ir BASHPID="25545"

It's another one of those magic variables that's assigned to by resolving it,
and then keeps its last value.

  $ declare -p SECONDS
  declare -i SECONDS="115"
  $ declare -p SECONDS
  declare -i SECONDS="117"
  $ declare -p SECONDS
  declare -i SECONDS="118"

Anyway, good to know. Thanks.
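
A minimal sketch of the difference between $$ and $BASHPID, assuming bash 4 or later. It uses a command substitution as the subshell; the variable names are made up for illustration.

```shell
#!/bin/bash
top_pid=$$
top_bashpid=$BASHPID

# Command substitution runs in a subshell: $$ still reports the top level
# shell, while $BASHPID reports the subshell's own process ID.
read -r sub_dollar sub_bashpid <<< "$(echo "$$ $BASHPID")"

echo "top level: \$\$=$top_pid BASHPID=$top_bashpid"
echo "subshell:  \$\$=$sub_dollar BASHPID=$sub_bashpid"
```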

>> P.S. this is old, but:
>>
>>   $ for i in a b c & do echo $i; done
>>   bash: syntax error near unexpected token `&'
>>
>> But break & is fine? What does that even _mean_?
> 
> Come on. You at least have to implement the difference between a
> `wordlist', which is a list of shell WORDs, and a command list, which
> is terminated by the *operator* `&'.

...no?

I read lines from input. (Haven't implemented interactive command history
editing yet, it's a todo.)

There's parse_word() which finds the end of the current word, and figures out
when you need to ask for continuations due to unterminated quoting, which
includes $() and friends. And yes it handles "$("echo $('ls') )")" and so on.

Then there's a parse_line() which figures out when to ask for line continuations
due to flow control. (if/fi do/while which includes () and {}, and also trailing
flow control ala && || and also HERE documents... The function returns
hit/pass/bust to the caller in sh_main(), which does the $PS1 prompting and
feeds it more lines, or calls run_function().)

When it's got a complete thought, it calls run_function() on the parsed block
structure returned by parse_line(), and run_function() traverses the flow
control and calls run_command() which calls expand_redir() to get an argc/argv[]
pair with all the variables expanded and all the redirections performed (with an
unredir list you traverse to put them _back_, the original filehandles are duped
up above 10 where {blah}<abc and friends meddle anyway, and that's so nommu
doesn't handle any file access error cases after vfork()...)

I should write up a walkthrough of all this when it's done, but it's still in
flux a bit as I hit each new "no, that doesn't work" and reshuffle stuff.

By the way, did I already ask why {var}<file only works on block context and not
on a command?

  $ export abc=potato
  $ env {abc}</dev/null | grep abc
  abc=potato

I mean, it's CHECKING the file:

  $ env {abc}</missing | grep abc
  bash: /missing: No such file or directory

But it's closing the filehandle without doing anything with it?

  $ env {abc}>/dev/null
  # abc=potato in here but it's a long list
  $ ls /proc/self/fd
  0  1  2  3

(I just made mine work in both contexts, I think? It was easier...)

Huh:

  $ exec {abc}>/dev/null
  $ echo $abc
  10
  $ ls /proc/self/fd
  0  1  10  2  3

Ok, THAT works. The redirect neither sets the variable for a command, nor keeps
the redirect after the command... I guess exec and block end redirect logic are
a similar codepath?
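
The exec case that does work can be sketched like this, assuming bash 4.1 or later. The variable name logfd and the temporary file are hypothetical, chosen just for the example; it only covers the exec form, not the per-command form questioned above.

```shell
#!/bin/bash
tmp=$(mktemp)

# exec with {name}> opens the file on a free descriptor (10 or higher)
# and stores the descriptor number in the named variable.
exec {logfd}>"$tmp"
echo "allocated fd: $logfd"

# Write through the descriptor, then close it explicitly.
echo "written via fd" >&"$logfd"
exec {logfd}>&-

contents=$(cat "$tmp")
rm -f "$tmp"
echo "$contents"
```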

> Unquoted `&' is always an operator, it is never a WORD, and so it can't
> appear in a list of WORDs, which is what follows `in'.

Every command is terminated with either end of line or one of:

      // Flow control characters that end pipeline segments
      s = end + anystart(end, (char *[]){";;&", ";;", ";&", ";", "||",
        "|&", "|", "&&", "&", "(", ")", 0});

The word parsing logic returns either the end of next word within the string, or
a null * to mean "unterminated quote" (which includes \ at the end of line). The
loop calling parse_word() in parse_line() figures out how to assemble those
words into blocks (and can return to its own caller asking for additional
continuations because of unfinished blocks and here documents and trailing flow
control characters).
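
The ordering of that terminator list matters: longer operators have to be tried before their prefixes, or ";;&" would be cut off at ";;". Here is a rough sketch of that first-match idea in bash. match_operator is a made-up helper for illustration, and it assumes anystart() picks the first listed string that prefixes the input; it is not the toybox code.

```shell
#!/bin/bash
# Print the first operator in the list that is a prefix of $1.
# The list is ordered so longer operators come before their own prefixes.
match_operator() {
  local s=$1 op
  for op in ';;&' ';;' ';&' ';' '||' '|&' '|' '&&' '&' '(' ')'; do
    if [[ $s == "$op"* ]]; then
      printf '%s\n' "$op"
      return 0
    fi
  done
  return 1   # not a flow control character, so it is part of a word
}

match_operator ';;& rest'
match_operator '&& next'
match_operator 'word' || echo "no operator"
```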

parse_line() has an "expect" stack which is the word (at start of statement)
that terminates the current block. The parsing also knows that ( and ) start a new
line, I.E. they're commands _and_ flow control characters, and yes it has to
check for (( and )) but getting this right:

  ((echo hello) | cat)
  $((echo hello) | cat)

took some doing and it has to retroactively break the (( ...

(I cheated slightly: I use a 4k buffer to store the parentheses stack, and if
that overflows with 4096 nested parentheses it's "syntax error: tilt".)

Anyway, the parse_line plumbing records the flow control character that
terminated the line in arg->v[arg->c] (with newline or semicolon being saved as
NULL there), but some places (such as the end of the in list) can't have a
non-null terminator because that's a syntax error. So those places check for that.

parse_line() checks for a bunch of syntax errors: << with no label,
"function(" without ")" or next word isn't a { ... and yes _word_:

  $ function(){echo potato;}
  bash: syntax error near unexpected token `('

What else... ;; outside case, flow control without a statement, and a bunch of
"for" cases (for on its own line, for i X where the X isn't in, ((, or do, more
than one line after "in" without a do...) and so on.

Afterwards run_function() mostly assumes the syntax of whatever it's dealing
with is correct and doesn't re-check it, but "continue" and "break" are special.
(They're normal type 0 commands but they modify flow control local to
run_function() instead of going through run_command()...)

Anyway, from _my_ perspective "continue | thingy" and "if true | then" seem
equally weird, but I guess not to yacc/bison. One is a syntax error, the other a
silent NOP.

> Chet

Rob


