[Toybox] bash continues to confuse me.
Rob Landley
rob at landley.net
Sat Jun 27 18:02:18 PDT 2020
Did I already respond to this message? I'm losing track...
On 6/21/20 12:37 PM, Chet Ramey wrote:
> On 6/18/20 7:48 PM, Rob Landley wrote:
>> On 6/18/20 1:46 PM, Chet Ramey wrote:
>>> On 6/17/20 1:22 PM, Rob Landley wrote:
>>>> Trying to figure out when spaces are and aren't allowed in ${blah} led to asking
>>>> why echo ${!a* } is an error but ${!a@ } isn't (when there are no variables
>>>> starting with a), and I eventually worked out that:
>>>>
>>>> $ X=PWD
>>>> $ echo ${!X@Q}
>>>> '/home/landley/toybox/clean'
>>>>
>>>> Is going on?
>>>
>>> It's variable transformation. It's introduced by `@' and uses single-letter
>>> operators. I cribbed the idea from mksh and extended it. The syntax is kind
>>> of loose because I'm still experimenting with it.
>>
>> I know what the @Q part does:
>
> OK, sorry.
>
>
>> It's the circumstances under which ${!abc***} falls back to processing the ***
>> part that I'm trying to work out.
>>
>> $ ABC=123; echo ${!ABC:1:2}
>> 21
>>
>> I _think_ what's happening is when you do ${!ABC@} it specifically checks for
>> "@}" after the variable name, and if so it lists prefixes. (And [@] does the
>> array thingy, haven't checked if that needs the } yet.) But if it doesn't end
>> with THAT SPECIFIC SEQUENCE, it does the !ABC substitution FIRST and takes
>> whatever's left after the original variable name (using normal "$WALRUS-x" logic
>> to figure out where it ended) and does further slicing on _that_ result.
>
> More or less. The ${!var@} expansion is a special case, no question. What
> it does is pretty much what I described in my previous message, which you
> end up quoting below:
Which makes ${!@} a special case within a special case? It ends with @} but
doesn't trigger the prefix logic. But that's just a length check. And ${!^@}
can search the list without ever finding a match. Which doesn't explain:
$ echo ${!1@}
bash: ${!1@}: bad substitution
> Start at the character following `{' and read to the end of the parameter.
> Look at the character you're on, and dispatch depending on that. If it's
> `@', do one more character of lookahead to see if you've got a prefix
> operator or a possible variable transformation.
>
> Once it figures out what the parameter part is, it does just what the man
> page says:
>
> "If the first character of parameter is an exclamation point (!), and
> parameter is not a nameref, it introduces a level of indirection. Bash> uses the value formed by expanding the rest of parameter as the new pa-
> rameter; this is then expanded and that value is used in the rest of
> the expansion, rather than the expansion of the original parameter."
I gotta test all this combining with nameref. Great.
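(First stab at a test, going by the man page wording above, so the expected
output here is an assumption rather than something I've verified: with a
nameref, ${!ref} should give the referenced _name_ instead of indirecting
through the value.)

$ b=hello; declare -n a=b
$ echo $a ${!a}
hello b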
>> (I understand the implementation reason. I don't understand the _design_ reason.
>> As a language... why?)
>
> I don't really care about the design reason. Whatever happened did so more
> than 40 years ago, and it's a waste of effort to worry about it.
Except I'm trying to figure out what behavior to implement, so I kind of have to
understand "why"...
>>>> $ echo ${PWD:1:3@Q}
>>>> bash: PWD: 3@Q: value too great for base (error token is "3@Q")
>>>> $ echo ${PWD:1@Q}
>>>> bash: PWD: 1@Q: value too great for base (error token is "1@Q")
>>>
>>> What's wrong with that? `@' and `Q' are valid characters in numeric
>>> constants when the base is large enough to need them. You can use
>>> bases up to 64 in base#constant.
>>
>> I.E. most expansions do not nest. The fact ${!x} does nest is an exception, and
>> ${!x@} is a special case within that exception.
>
> Yes, variable indirection is an exception.
One exception, yes. In bash, ${!!} doesn't indirect $!, ${#!} doesn't print the
length of $!, ${!@Q} doesn't quote the value of $!, and I still don't understand
why:
$ xx() { echo "${*@Q}";}; xx a b c d
'a' 'b' 'c' 'd'
$ xx() { echo "${@@Q}";}; xx a b c d
'a' 'b' 'c' 'd'
produce the same output instead of the first one saying 'a b c d'.
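(Current theory, with output I expect rather than have re-verified: @Q gets
applied to each positional parameter for both, and echo hides the word-count
difference. printf should show "${*@Q}" is one word and "${@@Q}" is four:)

$ xx() { printf '<%s>\n' "${*@Q}";}; xx a b c d
<'a' 'b' 'c' 'd'>
$ xx() { printf '<%s>\n' "${@@Q}";}; xx a b c d
<'a'>
<'b'>
<'c'>
<'d'>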
>>>> Hmmm...
>>>>
>>>> $ echo ${!potato@walrus}
>>>>
>>>> $ echo ${!P@walrus}
>>>
>>> Invalid transformations just expand to nothing rather than being errors.
>>> That's part of what I'm still experimenting with.
>>
>> What is and isn't an error is not consistent, yes. I STILL haven't found an
>> answer to my first question, which is what was the difference between:
>>
>> $ echo ${!potato* }
>> bash: ${!potato* }: bad substitution
>> $ echo ${!potato@ }
>>
>> $
>
> It really is the variable transformation. I'm not sure why you won't
> believe the answer.
It's not "won't believe", it's "don't understand". But I think I see what's
going on here: by "transformation" you mean the @Q stuff, so it sees the @
thinks it's going into @Q territory and hands off to code to parse that, which
then aborts because it isn't and _that's_ the error. But * never hands off to
anything because it's just unrecognized, so it works like ${!^} and ${!+} just
silently aborting and resolving to "".
How this applies to the difference between:
$ echo ${!^@}
$ echo ${!1@}
bash: ${!1@}: bad substitution
I do not yet understand.
I _think_ if bash says "bad substitution" and mine instead Does The Thing, that
can't introduce an incompatibility in existing scripts? I think?
> The debatable part is whether or not to short-circuit
> when the variable transformation code sees that the value it's being
> asked to transform is NULL, but that's what mksh does and one of the
> things I picked up when I experimented with the feature.
Does that produce a different result? Seems like it's just an optimization? (Do
you have a test case demonstrating the difference?)
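(If I'm reading "short-circuit on NULL" right, the observable difference
would be unset versus set-but-empty: skipping the transform gives nothing,
but quoting an empty value would give ''. Expected output, not verified:)

$ unset x; echo [${x@Q}]
[]
$ x=; echo [${x@Q}]
['']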
>>>> Ok, when X exists and points to another variable, then !X becomes the contents
>>>> of that other variable and THEN the @ cares about trailing garbage. But * is a
>>>> different codepath from @...?
>>>
>>> It's a different expansion.
>>
>> In my code they're the same codepath with different $IFS behavior.
>
> OK. That's probably going to turn out to be insufficient.
Quite possibly. (Hence all the test cases.)
But I do not yet understand _why_...
>> $ chicken() { echo ${!*};}; fruit=123; chicken fruit
>> 123
>> $ xx() { echo "${!@:2: -1}";}; yy=abcdef xx yy
>> cde
>
> The character that ends the parameter is `@' and the character after that,
> which determines the expansion, is `:'.
"Ends the parameter" is the tricky bit for me, because my code handling all the
$! and such is after this (because ${!} and $! both trigger it), so determining
whether $: is a thing isn't something _that_ bit of parsing is set up to do (it
happens later).
$ xx() { echo "${!+:2: -1}";}; yy=abcdef xx yy
$
I may have to break it out into a function so I can call it from multiple places.
>> The tricky part is being sure:
>>
>> $ xx() { echo "${!@}";}; yy=abcdef xx yy
>> abcdef
>>
>> is doing the "indirection through $@" thing not the "list all variables with an
>> empty prefix" thing. (I.E. getting the special case checking in the right order.)
>
> There is no such thing as a variable with an empty prefix.
I.E. there is no syntax to show _all_ defined variables using ${!}.
> The parameter
> is !@; the ! introduces indirection, and the shell is left to do what it
> can with the `@'. What it does is expand it in a context in which word
> splitting does not take place, like on the rhs of an assignment statement.
Yeah, I definitely need to split out that function. Sigh.
So far the cases I have are:
${#} ${#x} ${#@} ${#[@]}
${!} ${!@} ${!@Q} ${!x} ${!x@} ${!x@Q} ${!x#} ${!x[*]}
All of which can have a slice afterwards (like :1:2 or /a/b), with the possible
exception of ${!x@Q:1:2} not being a thing but it's possible I'd have to special
case NOT supporting it...
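(For the record, the slice-after-indirection behavior I'm testing against,
using made-up test variables y and x:)

$ y=abcdef x=y
$ echo ${!x:1:2} ${!x/cd/XY}
bc abXYef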
Which still leaves me with stuff like:
xx() { echo ${!@@};}; xx a
Which... I have no idea what it's doing? (The downside of the error behavior
being "resolve to empty string".)
$ xx() { echo ${!@@};}; xx a b c
bash: a b c: bad substitution
$ xx() { echo ${!@@};}; xx a
$ xx() { echo ${!@@Q};}; xx a
$ xx() { echo ${!@Q};}; xx a
bash: ${!@Q}: bad substitution
$ xx() { echo ${!*Q};}; xx a
bash: ${!*Q}: bad substitution
$ xx() { echo ${!*@Q};}; xx a
$ xx() { echo ${!*+};}; xx a
$ a=PATH; xx() { echo ${!@@Q};}; xx a
'PATH'
Sometimes it recognizes it enough to fail, and other times it silently fails
without saying why, and I have to rummage around quite a lot to see if there's
hidden behavior that _can_ be triggered...
>>> Invalid transformation operators just expand to nothing.
>>
>> Some things error, some things expand to nothing. If there's a pattern, I
>> don't understand it yet.
>
> In this case, it's how mksh treats it and whether or not I wanted that much
> compatibility when I implemented it. So far, I've come down on the side of
> compatibility.
Not having played with or read the code of 6 other shells, I am sadly at a loss
here. I'm just trying to figure out a consistent set of rules for how bash
behaves. (Or what the minimum number of rules to cover the maximum amount of
bash behavior would be.)
>> On the other hand, if my code handles cases yours errors on,
>> I'm not exactly being incompatible, am I?
>>
>> I mean I'm assuming:
>>
>> $ ABC=def:1
>> $ def=12345
>> $ echo ${!ABC}
>> bash: def:1: bad substitution
>> $ echo ${def:1}
>> 2345
>>
>> not working is maybe some sort of security thing,
>
> `def:1' is not a valid parameter name. In the second `echo', `def' is the
> parameter name.
I was trying to determine order of operations on evaluation: the indirection
results are not "live" for slice logic processing. Part of that whole:
$ a=a
$ echo ${!a}
a
$ declare -n a
$ echo $a
bash: warning: a: circular name reference
$ echo ${!a}
bash: warning: a: circular name reference
thing.
>> although it makes:
>>
>> $ x=?; echo ${!x/0/q}
>> q
>> $ x=^; echo ${!x/0/q}
>> bash: ^: bad substitution
>>
>> a bit harder to get right since as I said, my code for recognizing $? is in
>> expand_arg() and not in varlen() or getvar().
>
> There are valid parameters, and there are invalid variable parameters.
The man page calls them "special" parameters.
>> But expand_arg seems to be
>> where it belongs because there's a bunch of:
>>
>> $ declare -p 'x'
>> declare -- x="^"
>> $ declare -p '?'
>> bash: declare: ?: not found
>
> Do you think declare should do syntax checking on whether or not its
> arguments are valid identifiers? Even given that function names don't
> have to be valid identifiers?
"declare -p" without parameters does not list $! and $? $0 and friends. They're
a different category of variable. To honor that, I implemented that category of
variable not in the Big Array Of Variables but hardwired in the variable
resolution parsing logic. But then recognizing that ${?:2:7} and ${+:2:7} are
different becomes tricky for me. I think I need to split those hardwired
recognized names out into a function that reports how many bytes of input it
recognized. (Can be >1, ${12} is a thing but "$12" is "${1}2" and yes I need a
test for that...)
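(Something like this, with a dozen made-up arguments; the output is what I
expect, not yet a regression test:)

$ set -- a b c d e f g h i j k l
$ echo $12 ${12}
a2 l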
I think I can cheat with strchr("*@#?-$!_0123456789", *s) to detect _that_ I
have a special parameter. Still gotta break resolution out into a function, and
@* is its own processing for both # and !...
>> incompatible sets of plumbing for doing the same thing:
>>
>> $ walrus='?'
>> $ declare -n walrus
>> bash: declare: `?': invalid variable name for name reference
>
> Yeah, that's what ksh93 does there. It rejects the attempt to make walrus
a nameref, but it still exists as a variable.
$ echo ${*@notanerror}
$ echo ${*-yetthisislive}
yetthisislive
$ echo ${*:potato}
bash
$ echo ${*=abc}
bash: $*: cannot assign in this way
Somehow, I don't think I'm going to run OUT of corner cases. This hole has no
bottom.
>> $ echo ${!walrus}
>> 1
>
> So the variable indirection works.
My point was they're mostly doing the same thing, but aren't compatible.
Rob