[Toybox] bash continues to confuse me.

Rob Landley rob at landley.net
Wed Jul 1 04:28:45 PDT 2020


On 6/30/20 1:23 PM, Chet Ramey wrote:
>> Which doesn't explain:
>>
>>   $ echo ${!1@}
>>   bash: ${!1@}: bad substitution
> 
> What's that supposed to do?

I honestly forget. (Possibly I expected it to show 1 10 11 but they're different
types of variables even when they exist and resolve...)

> It's not prefix matching, since `1' is not a
> valid variable initial character. It's not indirection plus parameter
> transformation, since there's no transformation operator. It's not straight
> variable indirection, since `1@' isn't a valid parameter name.

Every time I step away from this for the evening and come back to it, I get
confused about the order of precedence again. But I have test/results lists that
mostly sort it out now.

>>>> (I understand the implementation reason. I don't understand the _design_ reason.
>>>> As a language... why?)
>>>
>>> I don't really care about the design reason. Whatever happened did so more
>>> than 40 years ago, and it's a waste of effort to worry about it.
>>
>> Except I'm trying to figure out what behavior to implement, so I kind of have to
>> understand "why"...
> 
> Then the `why' as `because that's a choice Bourne made in 1978' should
> suffice, right?

I'm not really trying to figure out why the behavior was chosen, that seems
arbitrary and historic with archaeological layers.

I'm trying to figure out the minimum set of rules to capture the necessary
behavior, and whether outliers from those rules are important or coincidental
and possibly ignorable.

(I have recently been reminded by another shell's maintainer that I'm not smart
enough to have a chance of ever actually doing this, but I've never let that
stop me before.)

> The stuff in the braces after the `#' is the entire
> parameter for the length expansion. It's an expansion unto itself. That's
> how it's always worked. I don't feel like `extending' it.

I'm not suggesting you should, I'm just trying to figure out what I need to
implement. (There is a bit of flailing. It's not quite all fitting in my head,
which is why I'm taking notes and reviewing them. Collating it's a nightmare,
there's no natural sequencing to some of this. Or at least finding the
sequencing and finding the underlying rules are the same problem...)

>>> Yes, variable indirection is an exception.
>>
>> One exception, yes. In bash, ${!!} doesn't indirect $!,
> 
> It's true. That's never going to be useful, so bash just doesn't implement
> `!' as one of the special parameters for which indirection is valid. But
> you're right, it's inconsistent to not just accept it and expand to nothing.
> 
>  ${#!} doesn't print the
>> length of $!,
> 
> Sure it does.

  $ echo ${#!}
  bash: !}: event not found
  $ echo "${#!}"
  bash: !}: event not found

> You just have to have $! defined to get something useful.

  $ true &
  [1] 30273
  $ echo "${#!}"
  bash: !}: event not found
  $ echo ${!#}
  bash

> Before you create an asynchronous process, it doesn't exist. If you haven't
> created any background processes, you just get 0 for the length.

Possibly there's a variable or switch that disables ! being special and then it
would work. The context sensitive parsing doesn't do it in this case, but does
for ${!} and ${!@} which is why I thought it would.
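
A minimal sketch of the switch guessed at above, assuming bash: history expansion is the histexpand option, which "set +H" turns off (it is only on by default in interactive shells). The variable names pid and len are just for illustration.

```shell
#!/bin/bash
# Turn off history expansion so that ! inside ${...} is never parsed as an
# event designator. (Scripts already run with it off; interactive shells don't.)
set +H

true &              # start a background job so that $! is set
pid=$!
wait "$pid"

len=${#!}           # length of $!, i.e. the number of digits in the PID
echo "pid=$pid len=$len"
```

With that set, ${#!} should behave like ${#pid} instead of producing "event not found".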

>> ${!@Q} doesn't quote the value of $!,
> 
> See above. As it turns out, `@' is, in fact, one of the special parameters
> you can indirect, so the parameter is `!@' and the `Q' makes it a bad
> substitution.

And alas I can't ${! @Q} to clarify what I mean like I can with ${x: -2}.

>  and I still don't understand
>> why:
>>
>>   $ xx() { echo "${*@Q}";}; xx a b c d
>>   'a' 'b' 'c' 'd'
>>   $ xx() { echo "${@@Q}";}; xx a b c d
>>   'a' 'b' 'c' 'd'
>>
>> produce the same output instead of the first one saying 'a b c d'.
> 
> Again, you can either believe the answer or not. I'm not going to keep
> repeating it.

s/believe/understand/ but I agree repeating it won't help me. :)

>> I _think_ if bash says "bad substitution" and mine instead Does The Thing, that
>> can't introduce an incompatibility in existing scripts? I think?
> 
> Correct. Unless some script is, for whatever bizarre reason, counting on
> the error.

Counting on the error is unlikely. Counting on a _lack_ of error for something
broken that never triggers: plausible.

>>> The debatable part is whether or not to short-circuit
>>> when the variable transformation code sees that the value it's being
>>> asked to transform is NULL, but that's what mksh does and one of the
>>> things I picked up when I experimented with the feature.
>>
>> Does that produce a different result? Seems like it's just an optimization? (Do
>> you have a test case demonstrating the difference?)
> 
> It's whether or not to flag an unrecognized transformation operator as an
> error or just short-circuit before checking that because the value to be
> transformed is NULL.

Yeah, I've been wondering about that, which definitely _can_ break scripts.

But implementing it seems tricky: ${x;%} reliably errors whether or not x is
set, ${x~#%} never does (I can't find what ~ is supposed to do here in the man
page, but it only expands to home directory at start of word, even ""~landley
doesn't expand), and ${x@z} only does so when x is set...

At a certain point I'm going to have to try scripts that break, and get bug
reports from people who expected something to work and are angry at me.
(Apparently it is vitally important that I care about a bash reimplementation of
readline, which somehow manages to be both implemented in bash and to have a
makefile. I've put it on the todo list.)

>> Which still leaves me with stuff like:
>>
>>   xx() { echo ${!@@};}; xx a
>>
>> Which.. I have no idea what it's doing? (The downside of the error behavior
>> being "resolve to empty string".)
>>
>>   $ xx() { echo ${!@@};}; xx a b c
>>   bash: a b c: bad substitution
> 
> Already explained in the message you quoted.

I'll figure it out. Sorry to bother you so much. (I need to review this whole
thread again and pull out tests I've missed...)

>> Not having played with or read the code of 6 other shells, I am sadly at a loss
>> here. I'm just trying to figure out a consistent set of rules for how bash
>> behaves. (Or what the minimum number of rules to cover the maximum amount of
>> bash behavior would be.)
> 
> Maybe we can make bash more consistent as a result, where that makes sense.

I'm crawling off to my hole to write code about it. Happy to share the results
if they make sense, but I've reached the "need to close mental tabs" stage of
the process...

>>> `def:1' is not a valid parameter name. In the second `echo', `def' is the
>>> parameter name.
>>
>> I was trying to determine order of operations on evaluation: the indirection
>> results are not "live" for slice logic processing.
> 
> Indirection results are always a parameter name, not an operator.

Yup.

>  Part of that whole:
>>
>>   $ a=a
>>   $ echo ${!a}
>>   a
>>   $ declare -n a
>>   $ echo $a
>>   bash: warning: a: circular name reference
>>   $ echo ${!a}
>>   bash: warning: a: circular name reference
>>
>> thing.
> 
> Yes, if you're using namerefs, the nameref semantics for ${!var} take
> precedence. Part of the nameref compatibility thing. It's documented
> that way. namerefs are uglier and less valuable than I anticipated when
> I implemented them.

I have not yet implemented namerefs. (Or integer variables. Or arrays. I've done
local, global, and read only so far. Namerefs and arrays have read-side logic
(declare -aAn), I think the rest (declare -ilrux) is all assignment-side?)

(I might skip declare -t: debug feature, possibly out of scope...)
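
A small sketch of what "assignment-side" means for a few of those attributes, assuming bash 4 or later (for -l and -u). The variable names are made up; the point is that the value is transformed when stored, and reading it back is a plain expansion.

```shell
#!/bin/bash
declare -i n        # integer: the right-hand side is evaluated arithmetically
n=1+2

declare -u up       # uppercase on assignment
up=abc

declare -l low      # lowercase on assignment
low=ABC

declare -r ro=fixed # read only: later assignments fail

echo "$n $up $low $ro"
```

This prints "3 ABC abc fixed".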

>>>  although it makes:
>>>>
>>>>   $ x=?; echo ${!x/0/q}
>>>>   q
>>>>   $ x=^; echo ${!x/0/q}
>>>>   bash: ^: bad substitution
>>>>
>>>> a bit harder to get right since as I said, my code for recognizing $? is in
>>>> expand_arg() and not in varlen() or getvar(). 
>>>
>>> There are valid parameters, and there are invalid variable parameters.
>>
>> The man page calls them "special" parameters.
> 
> There are some special parameters that are valid in indirections, but `^'
> is not a special parameter.

Sure, I was using it to cause an error.

My code was organized wrong: the getvar() and varend() logic don't recognize
stuff like $! and this needed to, so I factored out my $? handling if/else
staircase into a getvar_special() wrapper around getvar() so I could handle the
full range of variable names this part needs to accept.

Without factoring it out I had a sequencing issue that the slice parsing didn't
know $! could be a variable name, and couldn't resolve ${!*} twice because
getvar("*") wasn't in the list declare sees. Now I've got a function with the
extra values, which can tell me the length it encountered. (And I have to pass
_in_ the length it's allowed to consume so ${12} and $12 behave differently...)
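
For reference, the ${12} versus $12 difference looks like this (the function name show is hypothetical): without braces only a single digit is consumed, so $12 is $1 followed by a literal 2.

```shell
#!/bin/bash
# Unbraced: "$12" is "${1}2". Braced: "${12}" is the twelfth positional parameter.
show() { echo "$12" "${12}"; }

show a b c d e f g h i j k l
```

This prints "a2 l".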

I think that part's working now. The above was just me trying to figure out if I
could _avoid_ doing it, and the answer was no.

>>>> But it expand_arg seems to be
>>>> where it belongs because there's a bunch of:
>>>>
>>>>   $ declare -p 'x'
>>>>   declare -- x="^"
>>>>   $ declare -p '?'
>>>>   bash: declare: ?: not found
>>>
>>> Do you think declare should do syntax checking on whether or not its
>>> arguments are valid identifiers? Even given that function names don't
>>> have to be valid identifiers?
>>
>> "declare -p" without parameters does not list $! and $? $0 and friends. They're
>> a different category of variable.
> 
> Sure. That's why declare reports `?' as not found. It's not a variable.

Which is why I needed to factor out the second function, which DOES know about
it. (Sometimes it's a variable, sometimes it isn't...)

>>   $ echo ${*@notanerror}
> 
> Well, you can short-circuit if there are no positional parameters (in which
> case `*' ends up expanding to null), or you can error because neither `n'
> (bash-5.0) nor `notanerror' is a valid transformation operator. It's the
> same thing as above.
> 
>>   $ echo ${*-yetthisislive}
>>   yetthisislive
> 
> Defined by posix.

When I get to the end of the bash man page (well, loop and do a pass with no
hits), I intend to do a pass over posix-2008 to see what I missed. (I did read
the whole thing once upon a time, it's just been a while.)

Until then I wince at every mention of it because when the _only_ reason for
something is "posix"... ("Yes but why?" "Posix!")

Although in this case I've already gotten far enough to work out that there are
operators that trigger on NULL variables (${x-y} ${x=y} ${x?y} but NOT ${x+y})
and those never get short circuited out because they'd be shorted out by the
variable being _set_, which is a silly time to avoid syntax checking.

So this isn't "because posix", this one I understand the rule for, it's an
operator category, although the category is fuzzed a little by ${:+} which
shares the "maybe :" logic but triggers in the else case of that test.
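
A quick sketch of that operator category as POSIX defines it, with made-up variable names. The - = ? operators act when the variable is unset, + acts when it is set, and the optional colon extends "unset" to "unset or empty".

```shell
#!/bin/bash
unset x y
a=${x-default}       # x unset: uses the word
b=${x+alternate}     # x unset: expands to nothing
: "${y=assigned}"    # y unset: assigns the word to y

x=
c=${x-default}       # x set but empty: stays empty
d=${x:-default}      # colon form treats empty like unset
e=${x+alternate}     # x is set: uses the word
f=${x:+alternate}    # colon form: empty counts as unset, so nothing

echo "a=$a b=$b y=$y c=$c d=$d e=$e f=$f"
```

This prints "a=default b= y=assigned c= d=default e=alternate f=".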

>>   $ echo ${*:potato}
>>   bash
> 
> `potato' is an arithmetic expression, which evaluates to 0,

Oh, this is the $(( )) context where variable names get resolved (and can even
be assigned to).

  $ xx() { echo ${*:1+2};}; xx one two three four five six seven
  three four five six seven

Ha! That makes sense, and it's an easy test to add.
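
Two more candidate tests along the same lines, assuming bash (the function names are hypothetical): the offset is an arithmetic expression, so bare variable names get resolved, and an unset name counts as 0.

```shell
#!/bin/bash
# n is looked up by name inside the offset expression: n+1 is 3.
slice() { local n=2; echo "${@:n+1}"; }

# An unset name evaluates to 0 in arithmetic, so potato+2 is 2.
from_unset() { unset potato; echo "${@:potato+2}"; }

slice one two three four five
from_unset one two three
```

The first call prints "three four five" and the second prints "two three".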

I have a great big todo item to make a math parser. I did an elaborate one in
Java years ago that handled trigonometric functions and fractional
exponentiation and such. But that was a long enough time ago I still thought
<strike>digital watches</strike> Java was a pretty neat idea. I remember there were
two stacks and I learned why reverse polish notation exists. You'd compare
precedence to see whether you push the operation and argument, or perform it now
and possibly consume your way down the tree, which is why I had to bother the
posix guys to put the precedence BACK when they broke it in their html rendering
of the expr command years ago because when I sat down to try it there they'd
broken the spec...

https://web.archive.org/web/20170904075912/http://permalink.gmane.org/gmane.comp.standards.posix.austin.general/10141

Anyway, Wikipedia[citation needed] says what I implemented was probably the
precedence climbing variant of the shunting yard algorithm? Sounds vaguely
familiar...
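
A rough sketch of the two-stack idea described above, written in bash to match the rest of this thread. It is an illustration of the general shunting-yard style technique, not the Java code mentioned and not toybox's parser. The names prec, apply and calc are made up, and it only handles integer + - * / with no parentheses or unary operators. Tokens must be passed as separate arguments.

```shell
#!/bin/bash
prec() {                      # binding strength of a binary operator
  case $1 in
    '*'|/) echo 2 ;;
    +|-)   echo 1 ;;
  esac
}

apply() {                     # pop one operator and two values, push the result
  local n=${#vals[@]} m=${#ops[@]}
  local a=${vals[n-2]} b=${vals[n-1]} op=${ops[m-1]}
  vals=("${vals[@]:0:n-2}")
  ops=("${ops[@]:0:m-1}")
  case $op in
    +)   vals+=( $((a + b)) ) ;;
    -)   vals+=( $((a - b)) ) ;;
    '*') vals+=( $((a * b)) ) ;;
    /)   vals+=( $((a / b)) ) ;;
  esac
}

calc() {
  local vals=() ops=() tok    # value stack and operator stack
  for tok in "$@"; do
    case $tok in
      +|-|'*'|/)
        # Perform pending operators that bind at least as tightly as the
        # new one (left associative), then push the new operator.
        while ((${#ops[@]})) &&
              (( $(prec "${ops[${#ops[@]}-1]}") >= $(prec "$tok") )); do
          apply
        done
        ops+=("$tok") ;;
      *) vals+=("$tok") ;;    # anything else is treated as a number
    esac
  done
  while ((${#ops[@]})); do apply; done
  echo "${vals[0]}"
}

calc 1 + 2 '*' 3
```

This prints 7: the multiplication is still on the operator stack when the input ends, so it is performed before the addition.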

> so it's the
> same as echo ${*:0}, which expands to the positional parameters starting
> from 0.

Yup.

>>   $ echo ${*=abc}
>>   bash: $*: cannot assign in this way
> 
> Of course you can't. No shell lets you do this.

I was impressed it had its own error message, while digging into the "error vs
discarded" differences...

>>>>   $ echo ${!walrus}
>>>>   1
>>>
>>> So the variable indirection works.
>>
>> My point was they're mostly doing the same thing, but aren't compatible.
> 
> Variable indirection and namerefs? They're not. Indirection was my attempt
> to get most of the value out of namerefs without all of the internal
> plumbing.

Me, I wanted to factor out the plumbing and make them use common code. Not
looking particularly amenable to that so far...
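
For comparison, a minimal sketch of the two mechanisms side by side, assuming bash 4.3 or later for declare -n. The variable names are hypothetical.

```shell
#!/bin/bash
target=1

# Indirection: ptr holds a name, and ${!ptr} expands the variable so named.
ptr=target
via_indirect=${!ptr}

# Nameref: ref is an alias for target on both the read and the write side.
declare -n ref=target
via_nameref=$ref
ref=2                  # assigns to target
name_of_ref=${!ref}    # on a nameref, ${!ref} gives the referenced name

echo "$via_indirect $via_nameref $target $name_of_ref"
```

This prints "1 1 2 target". The last field is the incompatibility: with a nameref, ${!ref} yields the name rather than indirecting a second time.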

Rob

