[Toybox] bash continues to confuse me.

Rob Landley rob at landley.net
Thu Jun 18 16:48:53 PDT 2020


On 6/18/20 1:46 PM, Chet Ramey wrote:
> On 6/17/20 1:22 PM, Rob Landley wrote:
>> Trying to figure out when spaces are and aren't allowed in ${blah} led to asking
>> why echo ${!a* } is an error but ${!a@ } isn't (when there are no variables
>> starting with a), and I eventually worked out that:
>>
>>   $ X=PWD
>>   $ echo ${!X@Q}
>>   '/home/landley/toybox/clean'
>>
>> Is going on? 
> 
> It's variable transformation. It's introduced by `@' and uses single-letter
> operators. I cribbed the idea from mksh and extended it. The syntax is kind
> of loose because I'm still experimenting with it.

I know what the @Q part does:

  $ chicken() { echo "${@@Q}"; }; chicken one two three
  'one' 'two' 'three'

It's the circumstances under which ${!abc***} falls back to processing the ***
part that I'm trying to work out.

  $ ABC=123; echo ${!ABC:1:2}
  21

I _think_ what's happening is when you do ${!ABC@} it specifically checks for
"@}" after the variable name, and if so it lists prefixes. (And [@] does the
array thingy, haven't checked if that needs the } yet.) But if it doesn't end
with THAT SPECIFIC SEQUENCE, it does the !ABC substitution FIRST and takes
whatever's left after the original variable name (using normal "$WALRUS-x" logic
to figure out where it ended) and does further slicing on _that_ result.
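
A minimal sketch of the two paths described above, assuming bash 4.4 or later (for @Q). The names zq_ptr and zq_target are hypothetical, picked only so nothing in the environment shares the prefix:

```shell
# bash 4.4+ (needs the @Q transformation)
zq_ptr=zq_target       # hypothetical names, not from the original thread
zq_target=12345

names=${!zq_ptr@}      # ends in "@}": lists variable names starting with "zq_ptr"
slice=${!zq_ptr:1:2}   # indirection first, then the same as ${zq_target:1:2}
quoted=${!zq_ptr@Q}    # indirection first, then @Q applied to the result

echo "$names" "$slice" "$quoted"
```

Only the first form treats zq_ptr as a prefix; the other two follow it to zq_target and then apply whatever trails the name.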

The hard part is the sequencing so I can work out order of operations. My
varlen() function doesn't recognize $@ and friends, that's an else case within
the variable expansion function. But if I moved $? and such from expand_arg() to
getvar() then {?}<&- would work and that's an error.

I really really really want to use common plumbing for stuff based on simple
consistent rules, which sadly means figuring out _where_ to be incompatible with
bash. :(

>> Which is just weird, both because:
>>
>>   $ echo ${PATH@ }
>>   bash: ${PATH@ }: bad substitution
>>
>> And because:
>>
>>   $ ABC=defghi; echo ${#ABC@Q}
>>   bash: ${#ABC@Q}: bad substitution
> 
> The #variable is the entire expansion, not the `parameter' part of an
> expansion.

I.E. ${!x***} falls back but ${#x***} does not.

  $ echo ${#ABC:0:1}
  bash: ${#ABC:0:1}: bad substitution
  $ echo ${#ABC/3/x}
  bash: ${#ABC/3/x}: bad substitution

I do not understand the reason for the difference, especially since:

  $ echo ${#/0/z}
  z

(I understand the implementation reason. I don't understand the _design_ reason.
As a language... why?)

>>   $ echo ${PWD:1:3@Q}
>>   bash: PWD: 3@Q: value too great for base (error token is "3@Q")
>>   $ echo ${PWD:1@Q}
>>   bash: PWD: 1@Q: value too great for base (error token is "1@Q")
> 
> What's wrong with that? `@' and `Q' are valid characters in numeric
> constants when the base is large enough to need them. You can use
> bases up to 64 in base#constant.

I.E. most expansions do not nest. The fact ${!x} does nest is an exception, and
${!x@} is a special case within that exception.
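
To make the base#constant point concrete, here is a small sketch (bash arithmetic only, not POSIX sh). It relies on the documented digit order 0-9, a-z, A-Z, @, _ and shows that "@Q" is a perfectly good number once the base is large enough, which is why the substring offset parser tries to read it as one:

```shell
# bash arithmetic: base#digits, digits ordered 0-9 a-z A-Z @ _
at=$((64#@))        # '@' is digit 62
q=$((64#Q))         # 'Q' is digit 52 (A is 36)
both=$((64#@Q))     # 62*64 + 52

echo "$at $q $both"
```

In the default base 10 the same characters are out of range, hence "value too great for base".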

>> Hmmm...
>>
>>   $ echo ${!potato@walrus}
>>
>>   $ echo ${!P@walrus}
> 
> Invalid transformations just expand to nothing rather than being errors.
> That's part of what I'm still experimenting with.

What is and isn't an error is not consistent, yes. I STILL haven't found an
answer to my first question, which is what was the difference between:

  $ echo ${!potato* }
  bash: ${!potato* }: bad substitution
  $ echo ${!potato@ }

  $

>>   $ echo ${!P@}
>>   PATH PIPESTATUS PPID PS1 PS2 PS4 PWD
> 
> Yep, variable transformation isn't in effect when the expansion ends with
> the `@'.

That's one of the special cases.

>>
>>   $ X=PATH
>>   $ echo ${!X@ }
>>   bash: ${!X@ }: bad substitution
>>
>>   $ ABC=123
>>   $ echo ${!A@walrus}
>>
>>   $ echo ${!ABC@}
>>   ABC
>>   $ echo ${!ABC@walrus}
>>
>>   $ echo ${!A@ }
>>
>> Ok, when X exists and points to another variable, then !X becomes the contents
>> of that other variable and THEN the @ cares about trailing garbage. But * is a
>> different codepath from @...?
> 
> It's a different expansion.

In my code they're the same codepath with different $IFS behavior. There are 3
places in IFS expansion that check for '*' to distinguish it from '@' during
array-style IFS expansion:

https://github.com/landley/toybox/blob/master/toys/pending/sh.c#L998
https://github.com/landley/toybox/blob/master/toys/pending/sh.c#L1024
https://github.com/landley/toybox/blob/master/toys/pending/sh.c#L1041

And that third one should really just be checking *sep set by the first one.
(And possibly the second one should be too. If IFS starts with an invalid utf8
sequence it won't _set_ sep, which affects the second but not the third check,
but I should fix that in the first check's body...)
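
For reference, the '*' versus '@' distinction those checks implement can be seen with plain positional parameters. This is a sketch of the standard shell behavior, not of the sh.c code; the helper count() is made up for the example:

```shell
set -- one two three
old_ifs=$IFS
IFS=:

star="$*"                 # one word, joined with the first character of IFS
count() { echo "$#"; }    # hypothetical helper: reports how many words it received
n_star=$(count "$*")      # "$*" stays a single word
n_at=$(count "$@")        # "$@" keeps each parameter as its own word

IFS=$old_ifs
echo "$star $n_star $n_at"
```

The bash manual describes the same split for "${!prefix*}" and "${!prefix@}" inside double quotes: the '@' form yields each name as a separate word.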

Anyway, my code's in flux by trying hard to use common codepaths. Which is much
easier when there's consistent behavior.

>> And when the ${!var} doesn't have contents pointing to another existing
>> variable, it falls back to a codepath that tries the prefix stuff (and/or the
>> array stuff, in some order?) and that codepath tries to parse trailing @Q and
>> friends, but DOESN'T error out if it can't recognize the trailing stuff after the @?
> 
> Not really, no. It grabs the parameter, which may begin with `!', looks at
> the character following it, does one character of lookahead if that
> character is `@', and branches to the appropriate thing: variable prefix
> expansion or indirect expansion with the result being used as the parameter
> for the rest of the expansion.

  $ chicken() { echo ${!*};}; fruit=123; chicken fruit
  123
  $ xx() { echo "${!@:2: -1}";}; yy=abcdef xx yy
  cde

The tricky part is being sure:

  $ xx() { echo "${!@}";}; yy=abcdef xx yy
  abcdef

is doing the "indirection through $@" thing not the "list all variables with an
empty prefix" thing. (I.E. getting the special case checking in the right order.)
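
One way to picture the ordering Chet describes (grab the parameter, then look at what follows it) is a toy classifier. This is a hypothetical model written for illustration, not bash's parser and not the toybox code; classify() and its branch names are invented here:

```shell
# Hypothetical model of the dispatch order, NOT bash's actual parser.
# Input: the text between "${" and "}". Output: the branch it would take.
classify() {
  body=$1
  case $body in
    '!'*) body=${body#?} ;;          # strip the leading '!'
    *) echo plain; return ;;
  esac
  # Grab the parameter name first: one special character or an identifier.
  case $body in
    [@*?#0-9]*) name=${body%"${body#?}"} ;;
    *) name=${body%%[!A-Za-z0-9_]*} ;;
  esac
  rest=${body#"$name"}
  # Only now look at what follows the name.
  case $rest in
    '') echo indirect ;;
    '[@]'|'[*]') echo array-keys ;;
    '@'|'*')
      case $name in
        [A-Za-z_]*) echo prefix-list ;;
        *) echo indirect+rest ;;
      esac ;;
    *) echo indirect+rest ;;
  esac
}

classify '!P@'
classify '!@'
classify '!@:2: -1'
classify '!potato* '
```

Because the name is consumed before the trailing '@' is examined, '!@' comes out as indirection through $@ rather than a prefix listing with an empty prefix. In this model '!potato* ' and '!potato@ ' both land in the "indirect, then apply the rest" branch, which would fit one erroring and the other expanding to nothing depending on how the leftover text is handled, but that reading is an assumption.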

Rather a lot of this shell work measures progress in "tests added" rather than
"code written"...

>> I'm pretty sure the array mangling logic ties in here somehow, but there's some
>> missing error checking somewhere...
> 
> Invalid transformation operators just expand to nothing.

Some things error, some things expand to nothing. If there's a pattern I don't
understand it yet. On the other hand, if my code handles cases yours errors on,
I'm not exactly being incompatible, am I?

I mean I'm assuming:

  $ ABC=def:1
  $ def=12345
  $ echo ${!ABC}
  bash: def:1: bad substitution
  $ echo ${def:1}
  2345

not working is maybe some sort of security thing, although it makes:

  $ x=?; echo ${!x/0/q}
  q
  $ x=^; echo ${!x/0/q}
  bash: ^: bad substitution

a bit harder to get right since as I said, my code for recognizing $? is in
expand_arg() and not in varlen() or getvar(). But expand_arg() seems to be
where it belongs because there's a bunch of:

  $ declare -p 'x'
  declare -- x="^"
  $ declare -p '?'
  bash: declare: ?: not found

and it's not even a magic variable like $RANDOM, it's basically an escape sequence.
Except when it isn't (ala ~15 lines back in this email). And there's different
incompatible sets of plumbing for doing the same thing:

  $ walrus='?'
  $ declare -n walrus
  bash: declare: `?': invalid variable name for name reference
  $ echo ${!walrus}
  1

And alas:

  https://www.youtube.com/watch?v=2t-hyB8ibgk

Is not a HELPFUL reaction, so I add tests to tests/sh.test instead...

Rob