[Toybox] Would someone please explain what bash is doing here?

Sun May 10 10:13:33 PDT 2020

On 5/8/20 4:17 PM, Rob Landley wrote:
> On 5/6/20 2:32 PM, Chet Ramey wrote:
>> On 5/6/20 2:08 PM, Rob Landley wrote:
>>
>>>> You're blogging these bash corner cases, too?
>>>
>>> I was. I was recently asked to stop.
>>
>> Who asked you to stop?
> 
> Somebody on patreon. (I also stopped patreoning.)

I read the comment. It's too bad that donations come with those kinds of
strings attached. It seems like the commenter is uncomfortable seeing
political comments he doesn't agree with -- nobody complains about
political comments that are aligned with their own views.

> I'm just trying to figure out what the behavior _is_.
> 

>   $ echo \
>   > $LINENO
>   2
> 
>   $ echo $LINENO \
>   $LINENO
>   1 1

Let's look at these two. This is one of the consequences of using a parser
generator, which builds commands from the bottom up (This Is The House That
yacc Built).

You set $LINENO at execution time based on a saved line number in the
command struct, and that line number gets set when the parser knows that
it's parsing a simple command and begins to construct a struct describing
it for later execution.

In all cases, the shell reads a line at a time from wherever it's reading
input. In the first case, it reads

"echo \"

and starts handing tokens to the parser. After getting `echo', the parser
doesn't have enough input to determine whether or not the word begins a
simple command or a function definition, and goes back for more input. The
lexer sees there are no tokens left on its current input line, notes that
the line ends in a backslash, and reads another line, incrementing the line
number. It throws away the newline because the previous line ended in a
backslash, and returns $LINENO. The parser finally has enough input to
reduce to a simple command, and builds one, with the line number set to 2.

In the second case, the lexer reads the complete line "echo $LINENO \"
and starts handing back the tokens. The parser has enough tokens to reduce
to a simple command before it goes back for more input to complete it,
and the lexer processes the backslash-newline. The line number is set to 1,
or at least not incremented, when the parser begins to build the simple
command struct.

There's some variation in this area, by the way, but everyone agrees on
the basics: the line number gets incremented when you process the
backslash-newline. If you use a recursive-descent parser you have a little
more flexibility with this case and several others.
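
The two cases can also be reproduced outside an interactive prompt. What
follows is a minimal sketch, not part of the original exchange: it assumes
bash is on PATH, and the function name lineno_demo and the temporary
script are made up for illustration. Since the cases sit on lines 1-2 and
3-4 of the script, the absolute numbers will differ from the interactive
session. Going by the explanation above, the first echo should report the
line after the backslash-newline and the second should report one number
twice.

```shell
# Hypothetical helper: put both cases in a temporary script so that bash
# reads them one line at a time, then run that script.
lineno_demo() {
    tmp=$(mktemp) || return 1
    cat > "$tmp" <<'EOF'
echo \
$LINENO
echo $LINENO \
$LINENO
EOF
    bash "$tmp"
    rm -f "$tmp"
}

lineno_demo
```

Running the same temporary script under another shell is a quick way to
see the variation mentioned above.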

>>>>> I currently have no IDEA what "sh --help" should look like when I'm done, 
>>>>
>>>> I'm pretty sure bash --help complies with whatever GNU coding standards
>>>> cover that option.
>>>
>>> Currently 2/3 of bash --help lists the longopts, one per line, without saying
>>> what they do. So yeah, that sounds like the GNU coding standards.

Oh, please. It doesn't describe what each single-character option does,
either. That's a job for a man page or texinfo manual.

> something like:
> 
> ---
> 
> Usage: bash [-ilrsDabefhkmnptuvxBCHP] [-c COMMAND] [-O SHOPT] [SCRIPT_FILE] ...
> 
> Long options:
> 	--debug --debugger --dump-po-strings --dump-strings --help --init-file
> 	--login --noediting --noprofile --norc --posix --rcfile --restricted
> 	--verbose --version
> 
> For -O SHOPT list 'bash -c "help set"', for more information 'bash -c help'
> or visit https://www.gnu.org/software/bash or run "man 1 bash".

That's certainly an acceptable way to present it.

> 
> ---
> 
> Except you've got some parsing subtlety in there I don't, namely:
> 
>   $ bash -hc 'echo $0' --norc
>   --norc
> 
>   $ bash -h --norc -c 'echo $0'
>   bash: --: invalid option

"Bash also  interprets  a number of multi-character options.  These op-
 tions must appear on the command line before the  single-character  op-
 tions to be recognized."

Bash has always behaved this way, back to the pre-release alpha and beta
versions, and I've never been inclined to change it.
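
As a rough illustration of that ordering rule (a sketch only, assuming
bash is on PATH; option_order_demo is a made-up name), the same options
are passed in both orders below. With the long option first, the command
string should run and print its $0; with -h first, bash should refuse to
start, as in the quoted transcript.

```shell
option_order_demo() {
    # Long option first, then single-character options: accepted.
    # "arg0" becomes $0 for the command string.
    bash --norc -hc 'echo "$0"' arg0

    # Single-character option first: the later --norc is no longer
    # recognized as a long option, so bash should exit with an error.
    if bash -h --norc -c 'echo "$0"' >/dev/null 2>&1; then
        echo "accepted"
    else
        echo "rejected"
    fi
}

option_order_demo
```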

> And some of this is just never going to parse the same way:
> 
>   $ bash -cs 'echo $0'
>   bash

This is ambiguous, but not in the way you expect. The thing that differs
between shells is whether or not they read input from stdin (because of
the -s option) after executing the `echo $0'. POSIX specifies them as
separate cases, so nobody should expect anything in particular when they
are combined. The ash-derived shells start reading from standard input,
bash and the ksh-like shells exit after executing the echo, and yash
rejects the option combination entirely.
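
One way to see the difference is to put a command on standard input and
check whether it runs after the -c string. This is a sketch under
assumptions: the binary names dash, ksh and yash are guesses at what might
be installed, probe_cs is a made-up name, and shells that are missing are
skipped.

```shell
# Feed one command on stdin and see whether each shell goes on to read it
# after running the -c string.
probe_cs() {
    for shell_name in bash dash ksh yash; do
        # Skip shells that are not installed on this machine.
        command -v "$shell_name" >/dev/null 2>&1 || continue
        printf '== %s ==\n' "$shell_name"
        printf 'echo read-from-stdin\n' |
            "$shell_name" -cs 'echo "$0"' 2>&1 ||
            printf '(exit status %s)\n' "$?"
    done
}

probe_cs
```

A shell that prints read-from-stdin kept reading after the -c string; one
that prints only its $0 exited first; a non-zero exit status suggests the
combination was rejected.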

> But again, you have to conform to the gnu style guidelines, which I thought
> means you'd have a texinfo page instead of a man page?

I have both.

> 
> Also, I dunno why -O blah
> is a separate namespace from "bash --pipefail",

I assume you mean `-o pipefail'. I abandoned the -o namespace to POSIX a
long time ago, and there is still an effort to standardize pipefail as
`-o pipefail', so I'm leaving it there. I originally made it a -o option so
we could try and standardize it, and that reasoning is still relevant.
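
For reference, a small sketch of the two namespaces on the command line,
assuming bash is on PATH. The function name pipefail_demo and the choice
of extglob as the shopt example are illustrative only.

```shell
pipefail_demo() {
    # Default: a pipeline's status is that of its last command (true).
    bash -c 'false | true; echo "default: $?"'

    # -o NAME on the command line corresponds to `set -o NAME'.
    bash -o pipefail -c 'false | true; echo "pipefail: $?"'

    # -O NAME sets a shopt option instead of a set -o option.
    bash -O extglob -c 'shopt -q extglob && echo "extglob: on"'
}

pipefail_demo
```

This should print a status of 0 for the default case and 1 with pipefail
enabled, followed by the extglob line.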

> ----------
> Usage: sh [--LONG] [-ilrsD] [-abefhkmnptuvxBCHP] [-c CMD] [-O OPT] [SCRIPT] ...
> 
> -c	Run CMD then exit (with -s continue reading from stdin)

You can, of course, do anything you want with this and remain POSIX
conformant.

> Do you really need to document --help in the --help text? 

Why not? It's one of the valid long options.

> The bash man page does
> not include the string "--debug" (it has --debugger but not --debug), 

It's just shorthand for the benefit of bashdb.

> --dump-strings is -D which again:
> 
> $ bash --dump-strings
> bash-4.4$ help
> bash-4.4$ echo hello
> bash-4.4$ exit
> bash-4.4$ break
> bash-4.4$ stop
> bash-4.4$ ^C
> bash-4.4$ ^C
> bash-4.4$ ^C
> bash-4.4$

What point are you trying to make here? There aren't any translatable
strings using the $"" notation to write to standard output, and the
documentation for -D clearly says it implies -n. Should it not print the
prompt? Should it scold the user for running it in interactive mode? I
will admit that it's never been used as widely as I thought it might be,
but it works as advertised.
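
To show what the option is meant for, here is a minimal sketch that gives
-D something to find. It assumes bash is on PATH; dump_strings_demo and
the two-line script are invented for the example.

```shell
dump_strings_demo() {
    tmp=$(mktemp) || return 1
    cat > "$tmp" <<'EOF'
echo $"Hello, world"
echo "not marked for translation"
EOF
    # --dump-strings is the long form of -D. Since -D implies -n, the
    # script is parsed but not executed.
    bash --dump-strings "$tmp"
    rm -f "$tmp"
}

dump_strings_demo
```

Only the string written with the $"" notation should show up in the
output; neither echo is run.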

> P.S. --posix isn't -p, that's "privileged" mode which is not the same as
> restricted mode and I'm walking away from the keyboard for a bit now.

Yeah, -p was already used when I implemented posix mode, so I went with
`-o posix'. `--posix' is just more notational shorthand.

> 
> P.P.S. the man page has --init-file but the --help output doesn't.

Incorrect.

$ ./bash --help | grep init
	--init-file

Chet

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet at case.edu    http://tiswww.cwru.edu/~chet/