[Toybox] Would someone please explain what bash is doing here?
Rob Landley
rob at landley.net
Wed May 6 11:08:09 PDT 2020
On 5/5/20 5:33 PM, Chet Ramey wrote:
> On 5/5/20 1:47 PM, Rob Landley wrote:
>> On 5/4/20 1:16 PM, Chet Ramey wrote:
>>>> Still trying to work out what the "bash spec" would be, vs implementation details...
>>>
>>> I'll be interested when you get that spec done.
>>
>> I'd love to read it myself. Alas, it's corner cases all the way down:
>>
>> $ bash -c $'echo $LINENO\necho $LINENO'
>> 0
>> 1
>> $ bash <<< 'echo $LINENO'
>> 1
>
> Yes, line numbers should start at 1, even when the shell is reading a
> command from a string.
>
>>
>> I'm not currently writing a formal spec, I'm blogging a constant series of
>> complaints while trying to get the behavior right.
>
> You're blogging these bash corner cases, too?
I was. I was recently asked to stop.
> Anyway, give yourself some credit. Most of these things have been around
> for many years, and nobody ran across them until you did.
Which means they don't really matter. I'm just wondering what's gonna break
compatibility with existing scripts, and generally what the correct behavior
should be.
If you change them, that makes bash a moving target. I'm not trying to keep them
from you, but I'm only trying to understand bash, not change it. Bash has been
the Linux shell for 30 years (as in Linus added system call support to his
boot-from-floppy terminal program in 1991 so it could run bash specifically, as
documented in his semi-autobiography "Just for Fun"), so those are the shell
semantics I should implement in my simple self-contained Linux command line
utility set.
I'd use the man page as the spec, but when the man page documents $_ it says
"absolute path":
    _  At shell startup, set to the absolute pathname used to invoke
       the shell or shell script being executed as passed in the
       environment or argument list.
And that's not what bash does:
$ ln -s $(which bash) .
$ ./bash -c 'echo $_'
./bash
It's just set to the _pathname_ used to invoke the shell. Absolute or relative
doesn't matter. Of course if you follow that rathole too far:
$ cat > eggsalad.c << EOF
#include <unistd.h>
extern char **environ;
int main(int argc, char *argv[]) { execve(argv[1], argv+1, environ); }
EOF
$ gcc eggsalad.c
$ ./a.out bash -c 'echo hello $_'
hello ./a.out
How does it know... it's using the inherited environment variable _ isn't it?
$ cat > eggsalad2.c << EOF
#include <unistd.h>
char *env[]={0};
int main(int argc, char *argv[]) { execve(argv[1], argv+1, env); }
EOF
$ gcc eggsalad2.c
$ ./a.out bash -c 'echo hello $_'
hello bash
THERE we go. No path at all. (I'm happy to see updates to the man page; they
make it a better spec.)
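For what it's worth, the inherited versus non-inherited cases can also be poked
at without a C compiler. This is only a sketch: it assumes env and bash are on
$PATH, show_underscore is a made-up helper name, and PATH is passed through so
env can still find bash (so the environment isn't entirely empty, it just lacks
_ unless you hand one in).

```shell
# show_underscore is a hypothetical helper, not part of bash or toybox.
# It starts bash with a scrubbed environment (only PATH survives) plus any
# NAME=value pairs given as arguments, and prints the startup value of $_.
show_underscore() {
  env -i PATH="$PATH" "$@" bash -c 'echo "$_"'
}

show_underscore             # no inherited _, so bash has to pick a value itself
show_underscore _=sentinel  # an inherited _ is passed straight through
```

If the behavior shown above holds, the first call prints whatever argv[0] env
handed to bash and the second prints the inherited value, and neither has to be
an absolute path.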
(P.S. If you think I'm being meticulous, you should see what the actual security
professionals I used to follow on twitter do to this stuff: @aloria and
@hacks4pancakes and @fox0x01 and @0xabad1dea and @malwareunicorn and @evacide
and so on. Oddly enough, conventionally attractive cis women who work in tech
tend to be unusually motivated to eternal vigilance against ALL the exploits,
presumably because https://twitter.com/gabsmashh/status/1257434085879877632 and
https://tisiphone.net/2019/01/28/security-things-to-consider-when-your-apartment-goes-smart/
and https://twitter.com/aloria/status/1023271732344496128 and so on is their
daily experience in the tech industry.)
>> I currently have no IDEA what "sh --help" should look like when I'm done,
>
> I'm pretty sure bash --help complies with whatever GNU coding standards
> cover that option.
Currently 2/3 of bash --help lists the longopts, one per line, without saying
what they do. So yeah, that sounds like the GNU coding standards.
I was referring to toybox's built-in help text for each command, basically our
version of the man pages. You get the same text from "sh --help", "toybox help
sh", "toybox --help sh", and the "help" command at the sh prompt. They're also
on the web here:
https://landley.net/toybox/help.html#sed
In the source code, it comes from the menuconfig help text in the header block
at the start of each command file, in this case:
https://github.com/landley/toybox/blob/master/toys/pending/sh.c#L66
(The format of the header block is explained in
http://landley.net/toybox/code.html although not NEARLY as clearly as I'd like.
I should do youtube videos.)
Currently toybox "help sh" just says:
$ ./sh --help
usage: sh [-c command] [script]
Command shell. Runs a shell script, or reads input interactively
and responds to it.
-c command line to execute
-i interactive mode (default when STDIN is a tty)
Meaning I still haven't written most of the help text. I try to keep most of
them to fit on an 80x25 terminal screen, although a dozen or so go over. The
largest three are date (36 lines, and yes that's WITH three column output), find
(46 lines, again 3 columns), and sed (150 lines; I was exhausted by the time I
finished that and need to do another pass to shorten it).
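A quick way to check the 80x25 target mechanically might look like the sketch
below. The function name fits_screen and its defaults are invented here for
illustration, and it counts a tab as a single column, so it can undercount the
width of tab-indented help text.

```shell
# fits_screen is a hypothetical helper: it reads text on stdin and succeeds
# only if that text has at most ROWS lines, each at most COLS characters.
# Usage: some_command | fits_screen [ROWS [COLS]]   (defaults: 25 and 80)
fits_screen() {
  awk -v rows="${1:-25}" -v cols="${2:-80}" '
    length($0) > cols { wide = 1 }              # remember any over-long line
    END { exit (NR > rows || wide) ? 1 : 0 }    # nonzero exit means too big
  '
}
```

Something like "./sh --help | fits_screen && echo fits" would then report
whether a given command's help text stays within one screen.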
I hope sh --help is NOT longer than sed's, but... we'll see? There's a lot to cover.
"If I had more time, I'd have written a shorter letter." - Blaise Pascal
> Chet
Rob