[Toybox] bash continues to confuse me.
Rob Landley
rob at landley.net
Sat Jun 27 18:02:18 PDT 2020
Did I already respond to this message? I'm losing track...
On 6/21/20 12:37 PM, Chet Ramey wrote:
> On 6/18/20 7:48 PM, Rob Landley wrote:
>> On 6/18/20 1:46 PM, Chet Ramey wrote:
>>> On 6/17/20 1:22 PM, Rob Landley wrote:
>>>> Trying to figure out when spaces are and aren't allowed in ${blah} led to asking
>>>> why echo ${!a* } is an error but ${!a@ } isn't (when there are no variables
>>>> starting with a), and I eventually worked out that:
>>>>
>>>> $ X=PWD
>>>> $ echo ${!X@Q}
>>>> '/home/landley/toybox/clean'
>>>>
>>>> Is going on?
>>>
>>> It's variable transformation. It's introduced by `@' and uses single-letter
>>> operators. I cribbed the idea from mksh and extended it. The syntax is kind
>>> of loose because I'm still experimenting with it.
>>
>> I know what the @Q part does:
>
> OK, sorry.
>
>
>> It's the circumstances under which ${!abc***} falls back to processing the ***
>> part that I'm trying to work out.
>>
>> $ ABC=123; echo ${!ABC:1:2}
>> 21
>>
>> I _think_ what's happening is when you do ${!ABC@} it specifically checks for
>> "@}" after the variable name, and if so it lists prefixes. (And [@] does the
>> array thingy, haven't checked if that needs the } yet.) But if it doesn't end
>> with THAT SPECIFIC SEQUENCE, it does the !ABC substitution FIRST and takes
>> whatever's left after the original variable name (using normal "$WALRUS-x" logic
>> to figure out where it ended) and does further slicing on _that_ result.
>
> More or less. The ${!var@} expansion is a special case, no question. What
> it does is pretty much what I described in my previous message, which you
> end up quoting below:
Which makes ${!@} a special case within a special case? It ends with @} but
doesn't trigger the prefix logic. But that's just a length check. And ${!^@}
can search the list without ever finding a match. Which doesn't explain:
$ echo ${!1@}
bash: ${!1@}: bad substitution
> Start at the character following `{' and read to the end of the parameter.
> Look at the character you're on, and dispatch depending on that. If it's
> `@', do one more character of lookahead to see if you've got a prefix
> operator or a possible variable transformation.
>
> Once it figures out what the parameter part is, it does just what the man
> page says:
>
> "If the first character of parameter is an exclamation point (!), and
> parameter is not a nameref, it introduces a level of indirection. Bash> uses the value formed by expanding the rest of parameter as the new pa-
> rameter; this is then expanded and that value is used in the rest of
> the expansion, rather than the expansion of the original parameter."
I gotta test all this combining with nameref. Great.
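(First stab at a test, going by the man page wording above, so the expected
output here is an assumption rather than something I've verified: with a
nameref, ${!ref} should give the referenced _name_ instead of indirecting
through the value.)

$ b=hello; declare -n a=b
$ echo $a ${!a}
hello b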
>> (I understand the implementation reason. I don't understand the _design_ reason.
>> As a language... why?)
>
> I don't really care about the design reason. Whatever happened did so more
> than 40 years ago, and it's a waste of effort to worry about it.
Except I'm trying to figure out what behavior to implement, so I kind of have to
understand "why"...
>>>> $ echo ${PWD:1:3@Q}
>>>> bash: PWD: 3@Q: value too great for base (error token is "3@Q")
>>>> $ echo ${PWD:1@Q}
>>>> bash: PWD: 1@Q: value too great for base (error token is "1@Q")
>>>
>>> What's wrong with that? `@' and `Q' are valid characters in numeric
>>> constants when the base is large enough to need them. You can use
>>> bases up to 64 in base#constant.
>>
>> I.E. most expansions do not nest. The fact ${!x} does nest is an exception, and
>> ${!x@} is a special case within that exception.
>
> Yes, variable indirection is an exception.
One exception, yes. In bash, ${!!} doesn't indirect $!, ${#!} doesn't print the
length of $!, ${!@Q} doesn't quote the value of $!, and I still don't understand
why:
$ xx() { echo "${*@Q}";}; xx a b c d
'a' 'b' 'c' 'd'
$ xx() { echo "${@@Q}";}; xx a b c d
'a' 'b' 'c' 'd'
produce the same output instead of the first one saying 'a b c d'.
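(Current theory, with output I expect rather than have re-verified: @Q gets
applied to each positional parameter for both, and echo hides the word-count
difference. printf should show "${*@Q}" is one word and "${@@Q}" is four:)

$ xx() { printf '<%s>\n' "${*@Q}";}; xx a b c d
<'a' 'b' 'c' 'd'>
$ xx() { printf '<%s>\n' "${@@Q}";}; xx a b c d
<'a'>
<'b'>
<'c'>
<'d'>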
>>>> Hmmm...
>>>>
>>>> $ echo ${!potato@walrus}
>>>>
>>>> $ echo ${!P@walrus}
>>>
>>> Invalid transformations just expand to nothing rather than being errors.
>>> That's part of what I'm still experimenting with.
>>
>> What is and isn't an error is not consistent, yes. I STILL haven't found an
>> answer to my first question, which is what was the difference between:
>>
>> $ echo ${!potato* }
>> bash: ${!potato* }: bad substitution
>> $ echo ${!potato@ }
>>
>> $
>
> It really is the variable transformation. I'm not sure why you won't
> believe the answer.
It's not "won't believe", it's "don't understand". But I think I see what's
going on here: by "transformation" you mean the @Q stuff, so it sees the @
thinks it's going into @Q territory and hands off to code to parse that, which
then aborts because it isn't and _that's_ the error. But * never hands off to
anything because it's just unrecognized, so it works like ${!^} and ${!+} just
silently aborting and resolving to "".
How this applies to the difference between:
$ echo ${!^@}
$ echo ${!1@}
bash: ${!1@}: bad substitution
I do not yet understand.
I _think_ if bash says "bad substitution" and mine instead Does The Thing, that
can't introduce an incompatibility in existing scripts? I think?
> The debatable part is whether or not to short-circuit
> when the variable transformation code sees that the value it's being
> asked to transform is NULL, but that's what mksh does and one of the
> things I picked up when I experimented with the feature.
Does that produce a different result? Seems like it's just an optimization? (Do
you have a test case demonstrating the difference?)
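(If I'm reading "short-circuit on NULL" right, the observable difference
would be unset versus set-but-empty: skipping the transform gives nothing,
but quoting an empty value would give ''. Expected output, not verified:)

$ unset x; echo [${x@Q}]
[]
$ x=; echo [${x@Q}]
['']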
>>>> Ok, when X exists and points to another variable, then !X becomes the contents
>>>> of that other variable and THEN the @ cares about trailing garbage. But * is a
>>>> different codepath from @...?
>>>
>>> It's a different expansion.
>>
>> In my code they're the same codepath with different $IFS behavior.
>
> OK. That's probably going to turn out to be insufficient.
Quite possibly. (Hence all the test cases.)
But I do not yet understand _why_...
>> $ chicken() { echo ${!*};}; fruit=123; chicken fruit
>> 123
>> $ xx() { echo "${!@:2: -1}";}; yy=abcdef xx yy
>> cde
>
> The character that ends the parameter is `@' and the character after that,
> which determines the expansion, is `:'.
"Ends the parameter" is the tricky bit for me, because my code handling all the
$! and such is after this (because ${!} and $! both trigger it), so determining
whether $: is a thing isn't something _that_ bit of parsing is set up to do (it
happens later).
$ xx() { echo "${!+:2: -1}";}; yy=abcdef xx yy
$
I may have to break it out into a function so I can call it from multiple places.
>> The tricky part is being sure:
>>
>> $ xx() { echo "${!@}";}; yy=abcdef xx yy
>> abcdef
>>
>> is doing the "indirection through $@" thing not the "list all variables with an
>> empty prefix" thing. (I.E. getting the special case checking in the right order.)
>
> There is no such thing as a variable with an empty prefix.
I.E. there is no syntax to show _all_ defined variables using ${!}.
> The parameter
> is !@; the ! introduces indirection, and the shell is left to do what it
> can with the `@'. What it does is expand it in a context in which word
> splitting does not take place, like on the rhs of an assignment statement.
Yeah, I definitely need to split out that function. Sigh.
So far the cases I have are:
${#} ${#x} ${#@} ${#[@]}
${!} ${!@} ${!@Q} ${!x} ${!x@} ${!x@Q} ${!x#} ${!x[*]}
All of which can have a slice afterwards (like :1:2 or /a/b), with the possible
exception of ${!x@Q:1:2} not being a thing but it's possible I'd have to special
case NOT supporting it...
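(For the record, the slice-after-indirection behavior I'm testing against,
using made-up test variables y and x:)

$ y=abcdef x=y
$ echo ${!x:1:2} ${!x/cd/XY}
bc abXYef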
Which still leaves me with stuff like:
xx() { echo ${!@@};}; xx a
Which... I have no idea what it's doing? (The downside of the error behavior
being "resolve to empty string".)
$ xx() { echo ${!@@};}; xx a b c
bash: a b c: bad substitution
$ xx() { echo ${!@@};}; xx a
$ xx() { echo ${!@@Q};}; xx a
$ xx() { echo ${!@Q};}; xx a
bash: ${!@Q}: bad substitution
$ xx() { echo ${!*Q};}; xx a
bash: ${!*Q}: bad substitution
$ xx() { echo ${!*@Q};}; xx a
$ xx() { echo ${!*+};}; xx a
$ a=PATH; xx() { echo ${!@@Q};}; xx a
'PATH'
Sometimes it recognizes it enough to fail, and other times it silently fails
without saying why, and I have to rummage around quite a lot to see if there's
hidden behavior that _can_ be triggered...
>>> Invalid transformation operators just expand to nothing.
>>
>> Some things error, some things expand to nothing. If there's a pattern, I
>> don't understand it yet.
>
> In this case, it's how mksh treats it and whether or not I wanted that much
> compatibility when I implemented it. So far, I've come down on the side of
> compatibility.
Not having played with or read the code of 6 other shells, I am sadly at a loss
here. I'm just trying to figure out a consistent set of rules for how bash
behaves. (Or what the minimum number of rules to cover the maximum amount of
bash behavior would be.)
>> On the other hand, if my code handles cases yours errors on,
>> I'm not exactly being incompatible, am I?
>>
>> I mean I'm assuming:
>>
>> $ ABC=def:1
>> $ def=12345
>> $ echo ${!ABC}
>> bash: def:1: bad substitution
>> $ echo ${def:1}
>> 2345
>>
>> not working is maybe some sort of security thing,
>
> `def:1' is not a valid parameter name. In the second `echo', `def' is the
> parameter name.
I was trying to determine order of operations on evaluation: the indirection
results are not "live" for slice logic processing. Part of that whole:
$ a=a
$ echo ${!a}
a
$ declare -n a
$ echo $a
bash: warning: a: circular name reference
$ echo ${!a}
bash: warning: a: circular name reference
thing.
>> although it makes:
>>
>> $ x=?; echo ${!x/0/q}
>> q
>> $ x=^; echo ${!x/0/q}
>> bash: ^: bad substitution
>>
>> a bit harder to get right since as I said, my code for recognizing $? is in
>> expand_arg() and not in varlen() or getvar().
>
> There are valid parameters, and there are invalid variable parameters.
The man page calls them "special" parameters.
>> But expand_arg seems to be
>> where it belongs because there's a bunch of:
>>
>> $ declare -p 'x'
>> declare -- x="^"
>> $ declare -p '?'
>> bash: declare: ?: not found
>
> Do you think declare should do syntax checking on whether or not its
> arguments are valid identifiers? Even given that function names don't
> have to be valid identifiers?
"declare -p" without parameters does not list $! and $? $0 and friends. They're
a different category of variable. To honor that, I implemented that category of
variable not in the Big Array Of Variables but hardwired in the variable
resolution parsing logic. But then recognizing that ${?:2:7} and ${+:2:7} are
different becomes tricky for me. I think I need to split those hardwired
recognized names out into a function that reports how many bytes of input it
recognized. (Can be >1, ${12} is a thing but "$12" is "${1}2" and yes I need a
test for that...)
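(Something like this, with a dozen made-up arguments; the output is what I
expect, not yet a regression test:)

$ set -- a b c d e f g h i j k l
$ echo $12 ${12}
a2 l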
I think I can cheat with strchr("*@#?-$!_0123456789", *s) to detect _that_ I
have a special parameter. Still gotta break resolution out into a function, and
@* is its own processing for both # and !...
>> incompatible sets of plumbing for doing the same thing:
>>
>> $ walrus='?'
>> $ declare -n walrus
>> bash: declare: `?': invalid variable name for name reference
>
> Yeah, that's what ksh93 does there. It rejects the attempt to make walrus
a nameref, but it still exists as a variable.
$ echo ${*@notanerror}
$ echo ${*-yetthisislive}
yetthisislive
$ echo ${*:potato}
bash
$ echo ${*=abc}
bash: $*: cannot assign in this way
Somehow, I don't think I'm going to run OUT of corner cases. This hole has no
bottom.
>> $ echo ${!walrus}
>> 1
>
> So the variable indirection works.
My point was they're mostly doing the same thing, but aren't compatible.
Rob