[Toybox] [PATCH] sh: pass "\" to the later app
Rob Landley
rob at landley.net
Wed Jul 5 00:29:39 PDT 2023
I have a window open with a half-finished reply in it, and if I've already
replied to this email I apologize...
On 6/19/23 18:32, Chet Ramey wrote:
> On 6/17/23 7:23 PM, Rob Landley wrote:
>> On 6/12/23 19:40, Chet Ramey wrote:
>>>> and they have a list of "special built-in utilities" that does NOT include cd
>>>> (that's listed in normal utilities: how would one go about implementing that
>>>> outside of the shell, do you think?)
>>>
>>> That's not what a special builtin means. alias, fg/bg/jobs, getopts, read,
>>> and wait are all regular builtins, and they can't be implemented outside
>>> the shell either.
>>>
>>> Special builtins are defined that way because of their effect:
>>>
>>> https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_14
>>>
>>> It's really a useless concept, by the way.
>>
>> It's not that simple: kill has to be built-in or it can't interface with job
>> control...
>
> That's not what a special builtin is. `kill' is a `regular builtin' anyway.
I started down the "rereading that mess" path and it's turning into "reading all
the posix shell stuff" which is not getting bugs fixed. And once again, this is
a BAD STANDARD. Or at least badly organized. There's three groups here:
1) flow control commands: break, continue, dot, eval, exec, exit, trap, return.
2) variable manipulation commands: export, readonly, set, shift, unset.
3) random crap: colon, times.
Why group 1 doesn't include "wait" I dunno. Why group 2 has set but not read or
alias/unalias in it I couldn't tell you, and for that matter cd is defined to
set $PWD. Distinguishing : from true seems deeply silly (especially when [ and
test aren't) and "times" is job control (it smells like a jobs flag, but
they're not including bg/fg here either which are basically flow control group 1
above).
And having "command" _not_ be special is just silly:
$ command() { echo hello; }
$ command ls -l
hello
There's only a few more commands like hash that CAN'T be implemented as child
processes, but they don't bother to distinguish them. I know there's the "this
may syntax error and exit the shell" distinction but don't ask me how set or
true are supposed to do that. I _think_ they added set here because set -u can
cause a shell error later? Maybe? But then why unset? It doesn't seem to affect
flow control:
$ readonly potato=x; for i in one two three; do echo $i; unset potato; done
one
bash: unset: potato: cannot unset: readonly variable
two
bash: unset: potato: cannot unset: readonly variable
three
bash: unset: potato: cannot unset: readonly variable
I guess it's just the sh -c 'a=b set d=e; echo $a' nonsense which only dash
seems to bother with, which is a good reason _not_ to do it if you ask me...
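A rough way to compare that prefix-assignment behavior, assuming a bash binary is on the PATH. The --posix flag is bash's own; as far as I know bash only keeps the assignment in posix mode, and other shells may answer differently:

```shell
# Does a prefix assignment on a special builtin (set) outlive the command?
# Assumes bash is installed; this is a sketch, not a conformance test.
default_mode=$(bash -c 'a=b set d=e; echo "a=$a"')
posix_mode=$(bash --posix -c 'a=b set d=e; echo "a=$a"')

printf 'bash default: %s\n' "$default_mode"
printf 'bash --posix: %s\n' "$posix_mode"
```

Swapping in another shell name for bash is an easy way to see who else bothers.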
In general, this whole "can exit on error" thing doesn't seem hugely honored
even when posix says (implies) you can:
$ declare -i potato=1/0
bash: declare: 1/0: division by 0 (error token is "0")
$ declare -i potato
$ set potato=1/0
$ echo $potato
$
$ (set -x; echo hello ) 2>/dev/full
hello
$
Oh, by the way, I remember setting LINENO read only made the shell quite chatty,
but when I tested it just now it was ignored instead?
$ readonly LINENO
$ echo $LINENO
2
$ echo $LINENO
3
$ declare -p LINENO
declare -ir LINENO="4"
$ echo $LINENO
5
Hmmm, maybe it was...
$ source <(<<<$'readonly LINENO\necho$LINENO\necho $LINENO')
$ source <(echo $'readonly LINENO\necho $LINENO\necho $LINENO')
2
3
Nope, either there's version skew or I need to dig into my notes again. (Sigh, I
need to build current bash and test against that. If I'm going to experience
version skew from distro version upgrades _anyway_, I might as well treat it
like the kernel and try to notice changes early. Alright, bump up the Linux From
Scratch test environment todo list item...)
> (A prefix assignment... on continue? I can't
>> even do a prefix assignment on "if", and I have _use_cases_ for that. I had that
>> implemented and then backed it out again because it's an error in bash.
>
> `if' is not a builtin.
Sigh. I know:
$ abc=123 { echo $abc; }
bash: syntax error near unexpected token `}'
I keep writing scripts like that and having to fix it...
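For reference, two workarounds that get a similar scoped effect. This is a minimal sketch, and show_abc is a made-up helper name:

```shell
# A prefix assignment can't be attached to a { ...; } group, but a
# subshell or a function call gets a similar scoped effect.
abc=outer

# Subshell: the assignment dies with the subshell.
from_subshell=$( (abc=123; echo "$abc") )

# Function call: prefix assignments are allowed on simple commands.
# (Run inside $( ) here so nothing can leak back either way.)
show_abc() { echo "$abc"; }
from_function=$(abc=123 show_abc)

echo "$from_subshell $from_function $abc"
```

Both forms leave the outer value of abc untouched.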
>> I
>> remember I did make "continue&" work, but don't remember why...)
>
> Why would that not work? It's just a no-op; no semantic meaning.
Not _quite_ a NOP:
$ for i in one two three; do echo a=$i; continue& b=$i; done
a=one
[1] 30698
a=two
[2] 30699
a=three
[3] 30700
[1] Done continue
[2]- Done continue
Notice the child processes and lack of b= lines.
No, if you want a NOP, put a flow control statement in a pipe:
$ for i in one two three; do echo a=$i; continue | echo b=$i; echo c=$i; done
a=one
b=one
c=one
a=two
b=two
c=two
a=three
b=three
c=three
>>> https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_02_03
>>
>> And I need parsing that eats \$ and \\n but leaves \x alone, great. (Which is a
>> new status flag for expand_arg(), can't be handled in a preparsing pass nor is
>> NO_QUOTE gonna do it right...)
>
> More characters than those two.
I've had this window open long enough at this point as a todo-item reminder to
myself that I no longer remember what the todo item WAS. Lemme see if I can
reverse engineer it. (I do this SO OFTEN with my notes-to-self...)
$ echo \x "\x" "\\" "\$"
x \x \ $
Backslash in double quote context leaves most characters alone but eats \ $ and
newline, and unquoted HERE documents are in double quote context.
$ cat<<EOF
> \a \b \c \$ \\ \d \
> EOF
> EOF
\a \b \c $ \ \d EOF
As far as I can tell, it's NOT more than \$ \\ and \<newline> that get special
treatment in this context? Plus $ expansions.
$ cat<<EOF
> <(echo hello)
> EOF
<(echo hello)
$ cat<<EOF
> <(echo $(echo potato))
> EOF
<(echo potato)
Yup, just the $ ones and those three \ ones?
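A small side-by-side of the three contexts, as a sketch. The function names are made up, and it also tries \" and the backquote, which POSIX lists for double quotes but which the transcripts above didn't test:

```shell
# Compare backslash handling in three contexts. printf '%s\n' is used
# instead of echo so the shell's echo can't add its own escape processing.

in_double_quotes() {
  printf '%s\n' "\a \$ \\ \" \`"
}

in_unquoted_heredoc() {
cat <<EOF
\a \$ \\ \" \`
EOF
}

in_quoted_heredoc() {
cat <<'EOF'
\a \$ \\ \" \`
EOF
}

in_double_quotes       # \a $ \ " `
in_unquoted_heredoc    # \a $ \ \" `
in_quoted_heredoc      # \a \$ \\ \" \`
```

The only difference between the first two lines of output is the \" case: the backslash is removed inside double quotes but kept in the HERE document.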
>> Why is only the second of these an error in bash?
>>
>> $ unset ABC; echo ${ABC::'1+2'}
>> $ ABC=abcdef; echo ${ABC::'1+2'}
>> bash: ABC: '1+2': syntax error: operand expected (error token is "'1+2'")
>
> Because if there's nothing to operate on, bash doesn't try to process the
> rest of the word expansion (and if your first command is real, echo will
> output a single newline).
>
> This is consistent with POSIX:
>
> "If word is not needed, it shall not be expanded."
>
> even though the substring word expansion isn't POSIX.
And it's the short-circuit logic again:
$ echo $((1?2:(1/0)))
2
$ echo $((1&&(1/0)))
bash: 1&&(1/0): division by 0 (error token is "0)")
$ echo $((1||(1/0)))
1
I need to make sure I've got tests for all of this...
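Something like this could seed those tests. It's a sketch with made-up variable names, checking that the side of an operator that isn't needed never gets evaluated:

```shell
# The untaken side of ?:, && and || is parsed but not evaluated, so a
# division by zero hiding there is never reported.
ternary=$((1 ? 2 : (1/0)))   # condition true: else-branch skipped
or_skip=$((1 || (1/0)))      # left side true: right side skipped
and_skip=$((0 && (1/0)))     # left side false: right side skipped

echo "$ternary $or_skip $and_skip"

# When the right side IS needed the error shows up; run it in a subshell
# so the failure doesn't take this script down with it.
if ( : $((1 && (1/0))) ) 2>/dev/null; then
  and_needed=ok
else
  and_needed=error
fi
echo "$and_needed"
```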
>> I think when the EOF is quoted the HERE body has no processing, and when it's
>> not quoted then $VARS \$ and \<newline> are the only special... Nope, \\ is too.
>
> Yes, since the body is treated like it's in double quotes, and, as quoted
> earlier, \ is one of the characters for which backslash retains its
> behavior as a special character. The double quote is the only exception;
> look at what these do:
>
> cat <<EOF
> echo "
> EOF
>
> cat <<EOF
> echo \"
> EOF
I hadn't put an "echo" in there, but I'd noticed that \" is already not removed
in HERE context. I'd _forgotten_ that it is in "abc" context.
Oh well, yay test cases. I think I need to reread this whole thread and make
sure I've copied all the test cases out of it into my regression test thingy. (I
need to reread the mailing list posts, my blog entries, my sh.tests file...
Lemme get through the major missing features people are actually asking for
first. :)
>> https://github.com/landley/toybox/commit/32b3587af261
>
> Ugh.
The difference between theory and practice is that in theory there's no
difference but in practice there is.
My favorite paper was somebody presenting a bunch of bugs they'd found in a
compiler that had been "mathematically proven correct". Circa 2010, no idea
where it's gone now, but my worldview includes bad ram and overheating power
supplies and rowhammer and so on, and as a consultant I've been called in to
debug things that turned out to literally be due to processor errata.
I remember once a company was doing an in flight entertainment system in java
that would work great for about a minute then died, and the problem was their
capitalist intellectual property department added a daemon that checksummed the
partition it was on (so if an airline they sold it to modified it in any way
without giving them more money it wouldn't work), and what was basically "cat
/dev/blockdevice" faulted a bunch of pages into the page cache which evicted the
java application (to make room for more disk cache) which caused enough memory
pressure to trigger Linux OOM killer. I had to give a presentation to a room
full of executives on how memory works in Linux. (I'd also told the engineers
they could use mmap() and madvise(MADV_DONTNEED) but they told me to keep that
out of their presentation because they wanted to use this to pressure management
to take out the partition checksum. Not a consultant's job to get involved in
the politics part, I told them what the technology was doing and left them to it...)
I later recycled some of that presentation material (for the engineers, not the
managers) into https://landley.net/writing/memory-faq.txt which is probably
hideously out of date by now... Anyway, the point was there was nothing
specifically wrong with the Java app, it just hit the software equivalent of
"turns out we live in a society, who knew?" that libertarians keep rediscovering.
>> still matching the behavior in my devuan install. (Still devuan bronchitis,
>> haven't updated to devuan cholera yet. Um, the web page says devuan B matches
>> debian "buster" and devuan C matches "bullseye", if that helps.)
>
> Not at all. But charming version names.
Devuan is "debian without systemd" so 99% just a wrapper around the existing
debian repositories intercepting the small number of packages that need to be
diddled, and once they decided they were stable they started doing alphabetical
release names like xubuntu did, and I never remember what the names are but
remember the letters. (The debian ones I look up because they don't have letters.)
I'm still using the B (ahem, "Beowulf") release. The C ("Chimaera") release came
out October 2021 but I haven't upgraded yet because B (from June 2020) is still
supported.
>> I was naive enough to write the variable resolution logic with the design
>> assumption that unbalanced quoting contexts had already been caught before the
>> data was passed to us. Kinda biting me now, although I think I'm most of the way
>> through it.
>
> It was a pain to get that stuff right.
It's a pain to reproduce.
>> It doesn't handle nested logical contexts, and "case" logic has unmatched
>> ending parentheses that can end the $() span prematurely...)
>
> Ha. I had ad-hoc parsing that parsed $(...) for years, and it got more and
> more complex. I finally gave up on it for bash-5.2 and call the parser
> recursively to find the closing right paren (and replaced hundreds of lines
> of code with dozens). That's really the only way to do it correctly, but I
> was stuck with some compatibility issues because of how bash had not done
> it correctly in the past.
I have a vague todo item for that, but the problem is my data structures don't
recurse like that so I don't have a good place to stick the parsed blockstack
for $() and <() and so on, but it just seems wasteful to re-parse it multiple
times and discard it?
Yeah yeah, premature optimization. I'm fiddling with this stuff a bit anyway for
function definitions, but when you define a function inside a function my code
has a pass that goes back and chops the inner functions out and puts them in a
reference counted list and replaces them with a reference:
$ x() { y() { echo why; }; echo ecks; unset -f x; y; }; x; y; x
ecks
why
why
bash: x: command not found
I don't THINK I can do a local function, it's a global function namespace, they
outlive the text block that defined them, and you can still be running a
function that is no longer defined, so... reference counting. :P
But still, the pipeline list itself isn't what's nesting there. I think. And
given that arguments can be "abc$(potato)xyz" with the nested thingy in the
middle of arbitrary other nonsense, deferring dealing with that until variable
resolution time and then just feeding the string between the two () to
do_source() made sense at the time...
>> Happy to help. At the same time, trying not to spam you too badly...
>
> It hasn't been a problem so far.
He says, receiving a response to an email from weeks ago...
>>>>> The current edition is from 2018.
>>>>
>>>> Except they said 2008 was the last feature release and everything since is
>>>> bugfix-only, and nothing is supposed to introduce, deprecate, or significantly
>>>> change anything's semantics.
>
> When, by the way?
When did they say this? Sometime after the 2013 update went up, before the 2018
update went up. It was on the mailing list, but...
>>>> That's why it's still "Issue 7". The new stuff is
>>>> all queued up for Issue 8, which has been coming soon now since the early Obama
>>>> administration.
>>>
>>> Oh, I was there.
>>
>> I was lurking on the posix list since... 2006 I think?
>
> So you know that test now has `<' and `>' binary string operators that use
> the current locale, right? That's an example of what I'm talking about.
I was going to say "and when Issue 8 comes out, I may start to care" but I
already added them in commit 2407a5f51b58 last year because bash has them,
except most of toybox is not paying attention to locale. (Everything tries to
handle UTF-8 and unicode, but otherwise in the C locale.)
>> The project isn't dead, but those are defined as bugfix releases. Adding new
>> libc functions or command line options, or "can you please put cpio and tar back
>> in the command list", are out of scope for them.
>
> So wait for issue 8, I guess? It's going to start balloting this year.
It's been Real Soon Now since... I think 2018?
It was nice when posix noticed that glibc'd had dprintf() for years, it was nice
when they noticed linux had openat() and friends, but it was never a leading
indicator. When they removed "tar" and "cpio", Linux didn't. (Initramfs is cpio.
RPM is cpio.) Nobody installs "pax" by default.
Document, not legislate...
>> Ken or Dennis having a reason means a
>> lot to me because those guys were really smart. The Programmers Workbench guys,
>> not so much. "Bill Joy decided" is a coin flip at best...
>
> They all had different, even competing, requirements and goals. Mashey and
> the PWB guys were developing for a completely different user base than the
> original room 127 group, and Joy and the BSD guys had different hardware
> *and* users, and then the ARPA community for 4.2 BSD.
>
> Maybe things would be slightly different if Reiser's VM system (the one Rob
> Pike raves about) had been in 32/V and then eventually made it back to
> Research in time for 8th edition, but that's not the way it worked out.
The Apple vs Franklin decision extended copyright to cover binaries in 1983,
clearing the way for AT&T to try to commercialize the hell out of System III/V
and individually convince/threaten everybody with a BSD based system (after the
1980 arpa switchover from BBN IMPs to Vaxen) to rebase on System V ala
sunos->solaris and aos->aix and so on, and the resulting faceplant gave
microsoft an extra decade to get entrenched and really screw stuff up. (Plus
Paul Allen getting Hodgkin's lymphoma and Gates and Ballmer scheming to get his
stock back when he died WHERE HE COULD HEAR THEM, and his drive to release Xenix
as DOS 4 left the company with him. And of course Bill declared unix anathema
and unloaded it on SCO when he found AT&T had filed the serial numbers off Xenix
code in System III the same way they got convicted of doing to BSD in BSDi's
countersuit...)
But I still think the main stake to the heart was the Bell Labs guys getting put
back in their bottle by AT&T legal, meaning nobody ever saw the Labs' Unix
Release 8-10, or got to look at Plan 9 before Y2K.
>> Working on it. (Well in busybox somebody else had already written an awk, I just
>> sent them bug reports and made puppy eyes. This time I have to learn how to use
>> "awk". And I have to write a "make". And a shell, which is in progress... :)
>
> Seems daunting.
I am regularly accused of trying to "boil the ocean".
Working on it...
>>> I wish you were not so reluctant. Look at how many things you've discovered
>>> that I decided were bugs based on our discussions.
>>
>> But I'm taking up your valuable time.
>
> I get to make that decision, don't I? I'm not shy -- I'll tell you if you
> send something dumb. Don't gatekeep yourself.
>
>> But since you asked, today's new question I wrestled with was "what's the error
>> logic for slice fields"?
I think this was more of the short-circuit logic stuff.
>> It's doing math, but only _sometimes_ even reporting division by zero as an error?
>
> See above.
>
>>>>> Single quotes: preserved. Double quotes: removed when special. For
>>>>> instance, the double quotes around a command substitution don't make the
>>>>> characters in the command substitution quoted.
>>>>
>>>> Quotes around $() retain whitespace that would otherwise get IFS'd.
>>>
>>> Correct, but that's behavior that affects how the output of the command
>>> substitution is treated, not how the substitution itself is parsed or
>>> executed.
>>
>> They're the same thing for me: my parsing produces a result.
>
> All parsing produces a result:
This is also the short-circuit logic stuff again.
> The question is whether $VAR is quoted in
>
> echo "$( for f in $VAR; do echo $f; done )"
>
> If you treat this like $( for f in "$VAR"; do echo $f; done ), you're going
> to have problems.
The result of variable expansion undergoes $IFS nonsense when unquoted, which
includes the weird liveness stuff ala:
$ ABC="a "; for i in $ABC""; do echo -$i-; done
-a-
--
The result of $(blah) and $BLAH are handled the same there? Quotes _inside_ the
subshell are in their own context.
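A quick sketch of that distinction, assuming the default $IFS. count_words is a made-up helper that just prints how many arguments it got:

```shell
VAR="one two"

count_words() {            # hypothetical helper: prints its argument count
  echo "$#"
}

# Outer double quotes do not quote $VAR inside the substitution:
# the inner $VAR is still split into two words.
inner_unquoted="$(count_words $VAR)"

# Quoting has to happen inside the substitution to suppress splitting.
inner_quoted="$(count_words "$VAR")"

# The quotes around $( ) only protect the *output* from splitting.
set -- $(echo "a  b"); unquoted_output=$#
set -- "$(echo "a  b")"; quoted_output=$#

echo "$inner_unquoted $inner_quoted $unquoted_output $quoted_output"
```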
>>>> (And "$@" is kind of array variable-ish already...)
>>>
>>> Kind of, but it's not sparse. Support for very large sparse arrays is one
>>> thing that informs your implementation.
>>
>> Oh goddess. (Adds note to sh.tests, which is my text file of cut and paste
>> snippets to look at later. Yes, my todo lists nest.) Is sparse array a new type
>> or are all arrays sparse?
>
> All indexed arrays are sparse (the question is meaningless for associative
> arrays). Indices that are set are set; indices that are not are unset.
>
> declare -a intarray
> intarray[12]=twelve
>
> doesn't automatically set intarray[0..11] to anything.
Hmmm... Smells a bit like indexed arrays are just associative arrays with an
integer key type, but I guess common usage leans towards a span-based
representation?
(The tricksy parts of implementing commands like this are the bits that I'm not
an experienced _user_ of. This is much easier when I know what success looks like.)
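For my own reference, what "sparse" looks like from the user side. This is a bash-only sketch (it won't parse in dash), extending the intarray example from the quoted text with made-up variable names:

```shell
# bash-specific: indexed arrays are sparse, so assigning elements 12 and 3
# creates exactly two elements and nothing in between.
declare -a intarray
intarray[12]=twelve
intarray[3]=three

count=${#intarray[@]}          # number of set elements, not highest index
indices=${!intarray[@]}        # set indices, in ascending order
low=${intarray[0]-unset}       # index 0 was never assigned

echo "count=$count indices=$indices low=$low"
```

The indices come back sorted regardless of assignment order, which is the part an integer-keyed hash table would have to do extra work for.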
>> The variable types I've currently got are:
>>
>> // Assign one variable from malloced key=val string, returns var struct
>> // TODO implement remaining types
>> #define VAR_NOFREE (1<<10)
>> #define VAR_WHITEOUT (1<<9)
>> #define VAR_DICT (1<<8)
>> #define VAR_ARRAY (1<<7)
>> #define VAR_INT (1<<6)
>> #define VAR_TOLOWER (1<<5)
>> #define VAR_TOUPPER (1<<4)
>> #define VAR_NAMEREF (1<<3)
>> #define VAR_EXPORT (1<<2)
>> #define VAR_READONLY (1<<1)
>> #define VAR_MAGIC (1<<0)
>
>> WHITEOUT is when you unset a local variable so the
>> enclosing scope may have an unchanged definition but variable resolution needs
>> to stop there and get the ${x:=} vs ${x=} part right),
>
> You don't need that one, really. You can use the same value and logic you
> do when you have something like
>
> declare -i foo
> or
> export foo
>
> (unless you use WHITEOUT for this case as well).
I have a linked list of environment string arrays (actually a circular doubly
linked list), which variable lookup iterates through, checking at each level
until it hits the root context.
> `foo' exists as an unset variable, but when you assign a value to foo it
> gets exported since the attribute was already there.
Yeah, the distinction between ${x?} and ${x:?}, I remember some sort of:
$ x() { local WALRUS; export WALRUS; echo ${WALRUS?blah}; }; x
bash: WALRUS: blah
I have tests for this somewhere...
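The portable half of that distinction fits in a few lines. It's a sketch with made-up variable names, and it doesn't cover the declared-but-unset state that local and export create in bash:

```shell
# ${x-word} only cares whether x is set; ${x:-word} also treats an
# empty value as missing. The same split applies to ? = and +.
unset walrus
unset_plain=${walrus-fallback}      # unset: fallback used
unset_colon=${walrus:-fallback}     # unset: fallback used

walrus=
empty_plain=${walrus-fallback}      # set but empty: stays empty
empty_colon=${walrus:-fallback}     # empty counts as missing here

echo "[$unset_plain] [$unset_colon] [$empty_plain] [$empty_colon]"
```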
> You just have to be
> really disciplined about how you treat this `exists but unset' state.
$ export WALRUS=42; x() { local WALRUS=potato; unset WALRUS; WALRUS=abc;
> echo $WALRUS; env | grep WALRUS;}; x
abc
WALRUS=42
Ok, now I'm even more confused. It's exporting inaccessible values? (I know that
you can export a local, which goes away when the function returns...)
$ x() { local WALRUS=potato; export WALRUS; env | grep WALRUS;
> }; x; echo $WALRUS; env | grep WALRUS
WALRUS=potato
$
The exists-but-unset state you were referring to is whiteout, and it nests.
>> Anyway, that leaves VAR_ARRAY, and VAR_DICT (for associative arrays). I take it
>> a sparse array is NOT a dict? (Are all VAR_ARRAY sparse...?)
>
> The implementation doesn't matter. You have indexed arrays, where the
> subscript is an arithmetic expression, and associative arrays, where the
> subscript is an arbitrary string. You can make them all hash tables, if
> you want, or linked lists, or whatever. You can even make them C arrays,
> but that will really kill your associative array lookup time.
Eh, sorted with binary search, but that has its own costs...
> Asking whether an associative array is sparse doesn't make much sense;
> what would the definition of `sparseness' be? For indexed arrays, where the
> integer subscript imposes a bounded ordering, it makes sense.
Again, sounds like an indexed array is just an associative array with an integer
lookup key...
>> Glancing at my notes for any obvious array todo bits, it's just things like "WHY
>> does unsetting elements of BASH_ALIASES not remove the corresponding alias, does
>> this require two representations of the same information?
>
> There's no good reason, I just haven't ever made that work.
>
>
>> Spite: it keeps you going.)
>
> Misanthropy works.
I try to restrict myself to misandry. (We've collectively earned it.)
>>>> I remember being deeply confused by ${X@Q} when I was first trying to implement
>>>> it, but it seems to have switched to a much cleaner $'' syntax since?
>>>
>>> The @Q transformation has preferred $'...' since I introduced the
>>> parameter transformations in bash-4.4. I'm not sure when you were looking
>>> at it?
>>
>> I stuck with the last GPLv2 release for longer than Apple did:
>>
>> https://news.ycombinator.com/item?id=18852887
>
> But that version doesn't have parameter transformations, so that part is
> moot.
Which is why I was confused. :)
>>>>> They're not options, per se, according to POSIX. It handles -n as an
>>>>> initial operand that results in implementation-defined behavior. The next
>>>>> edition extends that treatment to -e/-E.
>>>>
>>>> An "initial operand", not an argument.
>>>
>>> That's the same thing. There are no options to POSIX echo. Everything is
>>> an operand. If a command has options, POSIX specifies them as options, and
>>> it doesn't do that for echo.
>>
>> Hence the side-eye. In general use, echo has arguments. But posix insists it
>> does not have arguments. To save face, they've created an "argument that
>> isn't an argument", and they want us to pretend that's not what they did.
>
> Because the historical echo implementations were all incompatible -- and
> worse, irreconcilable. The POSIX folks did the least worst thing. They all
> exist just to make the behavior implementation-defined anyway.
I'm not sure continuing to make sure Dec Ultrix is included within posix helps
quite so much anymore? But I'm hugely biased: I've watched AIX 5l wander by (the
L stood for "linux compatible") and FreeBSD's Linuxulator and Solaris "Linux
Zones" and even microsoft doing WSL2...
If you pull up http://telly.org/86open out of archive.org, Linux binaries were
already considered a cross-platform binary standard in 1999 to the point where a
standards effort trying to come up with cross-platform intel binaries threw in
the towel because there already was one. These days we've got macosx,
free/open/net/dragonfly BSD, iOS, Android, a dozen flavors of linux... and
various embedded and legacy systems.
>>>> Right. So they're going from "wrong" to "wrong" then:
>>>>
>>>> $ echo -n -e 'hey\nthere'
>>>> hey
>>>> there$
>>>
>>> Yeah, echo is a lost cause. Too many incompatible implementations, too much
>>> existing code. That's why everything non-trivial (like the above) is
>>> implementation-defined. POSIX recommends that everyone use printf.
>>
>> $ printf abc\n
>> abcn$
>>
>> Oh yeah, that'll happen.
>
> What did you think would happen to the unquoted backslash?
I meant asking newbies to learn to use printf from the command line before echo
means they have to quote the argument and add \n on the end as part of "simple"
usage, which seems a fairly heavy lift.
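To make the lift concrete, here's a sketch of the three forms a newbie has to tell apart. The variable names are made up, and the trailing "echo ." is only there so a final newline stays visible:

```shell
# Unquoted, the shell strips the backslash before printf ever sees it.
unquoted=$(printf abc\n; echo .)

# Quoted, printf gets the two characters \ and n and prints a newline.
quoted=$(printf 'abc\n'; echo .)

# For arbitrary data, keep it out of the format string entirely.
# This also prints strings that echo might take for options.
dash_n=$(printf '%s\n' '-n')

printf '%s\n' "$unquoted" "$quoted" "$dash_n"
```

The '%s\n' form is the one that actually replaces echo safely, and it's also the most typing.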
>>>> Maybe posix should eventually break down and admit this is a thing? "ls . -l"
>>>> has to work, but "ssh user@server -t ls -l" really really REALLY needs that
>>>> second -l going to ls not ssh.
>>>
>>> Why do you think they don't acknowledge this today?
>>
>> https://landley.net/notes-2016.html#11-03-2016
>
> I don't understand how the two connect? Jorg was truly abrasive, and
> didn't endear himself to many people, but I don't see the connection to
> argument ordering here.
You asked why do I think posix doesn't acknowledge $THING today. My experience
with raising issues where posix and common usage seemed to have significant
daylight between them involved abrasive gatekeeping, resulting in me wandering
away again and leaving the historical memorial to HP-UX and A/UX and so on to
its own devices.
It's possible my experience was unusual?
>> (Yes, I'm aware of recent changes. That's why I re-engaged with Posix, felt I
>> owed it to them since the condition under which I said I'd come back
>> unexpectedly happened. But having already written them off, my heart really
>> wasn't in it. I _should_, but I'm juggling too many other balls...)
>>
>>> Options only exist as
>>> such if they come before the first non-option argument.
>>
>> $ cat <(echo hello) -E
>> hello$
>
> Yeah, looks like a bug in cat to me:
>
> $ cat <(echo hello) -E
> hello
> cat: -E: No such file or directory
>
> The GNU utilities do all sorts of argument reordering, but that doesn't
> mean you're going to get that in POSIX.
See "daylight between what posix says and what reality's been doing for
decades", above.
When I see reality does not match posix, I do not automatically conclude that
reality is wrong. There was SVID and COSE and OSF/1 and so on. I watched the LSB
die fairly recently.
When I want to ask questions, I have QEMU images with FreeBSD-13 and 2008-era
ubuntu and a knoppix from 2004 and old pre-fedora installs of Red Hat 9 and red
hat 6.2. (Plus some archlinux and gentoo images. The occasional install of
Centos or SuSE to reproduce somebody's bug tends to get deleted again
afterwards. Last year there was a mac I could ssh into to ask questions there,
helped me find a really _stupid_ bug I introduced...)
I also cross compile the stuff I'm building to a dozen architectures (each
tarball in https://landley.net/toybox/downloads/binaries/mkroot/latest/ is a
system image with a run-qemu.sh script that boots it to a shell prompt) to check
endianness and alignment and word size and so on, and test it under glibc and
musl and bionic C libraries. But I'm aware that's a "we have both kinds, country
_and_ western" kind of thing.
That's why I'm running all these tests against bash. It's real.
>>> Options have to
>>> begin with `-'.
>>
>> tar tvzf blah.tgz
>> ps ax
>> ar t /usr/lib/libsupp.a
>
> POSIX doesn't have `tar'.
Linux doesn't have pax. (Still not installed by default on any distro that I am
aware of.) And Linux _is_ tar, https://kernel.org has links to tarballs and "git
archive" produces two options: tar and zip.
I am aware that posix got rid of its definition of tar (and was roundly
ignored), which is why my tar.c links to
http://pubs.opengroup.org/onlinepubs/007908799/xcu/tar.html at the top.
>> You can chain "ssh xargs strace nice unshare setsid timeout prlimit chroot"
>> arbitrarily deep, and each command has its own arguments and then a command line
>> it execs, which can itself contain arguments. That's usually WHY a command cares
>> about argument position.
>
> That's not inconsistent with the requirement that ssh options appear before
> other arguments.
My point was those are basically the only cases where that requirement exists.
The rest of them can "rm dir -r" and what posix says about it doesn't matter.
(And yes I have a TODO item to have wildcards expand to "./-file" as necessary...)
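What that TODO is guarding against, sketched from the user side with a throwaway directory (the names are made up, and this is what a script author can do today, not what toysh does):

```shell
# Hypothetical demo directory holding a file whose name looks like an option.
workdir=$(mktemp -d) || exit 1
: > "$workdir/-l"

# A bare * would hand "-l" to ls as an option; these two forms don't.
dot_slash=$(cd "$workdir" && ls ./*)     # operand is "./-l"
double_dash=$(cd "$workdir" && ls -- *)  # "--" ends option parsing

rm -r "$workdir"
echo "$dot_slash $double_dash"
```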
>>> If you really want to go
>>> hardcore, require that the application (user) supply a `--' before the
>>> remote command and its arguments if you want to use it in this way.
>>
>> But what's already there works, and has for decades.
>>
>> A good standards body should document, not legislate.
>
> Where do you think the utility syntax guidelines came from?
There are instances where they've been good, yes. Removing tar was "legislate,
not document" and they explicitly refused to acknowledge that it was a mistake
over a decade later.
>> This sort of thing consumes my "engaging with bureaucracy" meter.
>
> You can't force volunteers to do anything. They're volunteers!
I'm aware how volunteers work, yes. The above example was "they said they would,
then didn't, repeatedly". I'm not saying it's malicious, or even dysfunctional.
Just tiring.
> It's not bureaucracy,
The FSF required signed paper copyright assignments to be filed with the boston
office for decades. The "cathedral" in "The Cathedral And the Bazaar" was the
GNU project, as mentioned in the paper's abstract on the 1998 Usenix website
https://www.usenix.org/conference/1998-usenix-annual-technical-conference/software-development-models-cathedral-and-bazaar
(later versions switched to a more open vs proprietary framework, but the author
wrote it comparing his different experiences maintaining the emacs lisp library
and participating in Linux development).
Linux's "signed-off-by" has evolved into an 871 line procedure at
Documentation/process/submitting-patches.rst in the kernel source, supplemented
by a 24 step "patch submission checklist" at
Documentation/process/submit-checklist.rst and since about 2003 there's been a
multi-step approvals process where a developer gets a sign-off from a maintainer
who gets a sign-off from a subsystem maintainer ("lieutenant") who forwards the
submission to Linus.
It's kinda bureaucracy-ish.
Those Linux developments aren't exactly _bad_, and I was involved in both of
them happening. I was hip deep in helping navigate the SCO mess for a while
(http://www.catb.org/~esr/hackerlore/sco-vs-ibm.html was more Eric than me, but
http://www.catb.org/~esr/halloween/halloween9.html was more me than Eric). And I
catalyzed the scalability change with a "don't ask questions, post errors" rant
in 2002 that got noticed
(https://www.zdnet.com/article/row-brewing-over-linux-patches/) and there was a
long discussion on the list (and off-list,
https://firstmonday.org/ojs/index.php/fm/article/view/1105/1025 and
http://es.tldp.org/Presentaciones/200211hispalinux/rusty/seminar.pdf for
example), and the result was A) Linus started using source control (Bitkeeper
for a while, then he wrote git), B) they formalized the multi-layer editorial
review process for patch submissions.
But the point is, that culture has changed to the point I no longer have the
bandwidth to usefully participate in it. Which is normal, and I wrote about how
this happens to communities back when I wrote stock market investment columns in
1999, which also got noticed (I'm told my original series was read by 15 million
people its first week):
https://firstmonday.org/ojs/index.php/fm/article/view/839/748
You can still sort of fish them out of fool.com if you really try, ala
https://www.fool.com/specials/2001/02/21/how-companies-evolve-sp010221c.aspx but
their old archive is swiss cheese, so there's also my mirror at
https://landley.net/writing#3waves or my attempt to update it in 2011 in my blog
(https://landley.net/notes-2011.html#01-12-2011) on Dec 1, 2, 4, 5, and 6. (Dec
3 was unrelated.)
Sigh. It's really hard to talk about this stuff because there is SO MUCH
BACKSTORY just to provide context. A simple statement like "The Linux Foundation
drove the hobbyists out of Linux development" requires like 5 supporting links
each to a longish essay (good grief,
https://landley.net/notes-2010.html#18-07-2010 is 13 years old now, and by no
means the last word. The squashfs author did a great one in a linux weekly news
comment https://lwn.net/Articles/563578/ even though that's not the topic he
was TRYING to write on...)
>> http://www.opengroup.org/testing/downloads.html says there's a no-fee license.
>> Maybe closer to the 1.0 release I'll jump through the hoops to help me document
>> my deviations?
>
> Think carefully about doing that. It takes a lot of time, and I only did
> the shell and builtins tests.
I have a whole bunch of blue sky todo items, but my _focus_ is A) getting
Android self-hosting, B) keeping them happy enough they keep taking my code so I
can eventually trail-of-breadcrumbs them all the way to self-hosting. (Android
needs to build under Android. Plug the phone's USB port into a USB hub with a
keyboard and mouse and HDMI adapter or chromecast to put the display on a big
HDTV, maybe add a USB hard drive and ethernet adapter for your cable modem if
you're feeling posh. If the phone can't replace the PC then people who DON'T
have PCs are second class citizens in the development community.)
https://landley.net/toybox/about.html
Back when I did busybox, I was replacing the Linux From Scratch packages with
busybox, and I eventually got a system built from just 7 packages (linux,
busybox, uclibc, gcc, binutils, make, bash) to rebuild itself under itself from
source code, and build Linux From Scratch and chunks of Beyond Linux From
Scratch under the result. (And it cross compiled for a dozen architectures and
ran the build under QEMU, and could call out to the cross compiler running on
the host via distcc through the emulated network to move the heavy lifting of
compilation outside of the emulator to speed things up...)
https://landley.net/aboriginal/about.html#design
That worked, meaning I know what success _used_ to look like. Pity it's a moving
target...
My previous goal was to bootstrap distros like debian/gentoo/fedora under the
resulting Linux From Scratch system in a host-agnostic way (so the build doesn't
_care_ that it's on sh4 or mips, it's just a native configure/make/install with
the provided toolchain), ideally leaving busybox providing as much of the
command line as possible. In practice, distro bootstrapping turned out to be
insanely brittle and undocumented, to the point I spoke to the fedora build
manager at a conference (Flourish 2016) who explained that Fedora 24 wouldn't
build under Fedora 22, it was a self-referential continuous integration thing
("perpetual beta" got renamed "rolling release" at some point) where what it
actually built under was a semi-undocumented snapshot du jour:
https://landley.net/notes-2016.html#02-04-2016
And that was still better than the horrors of gentoo! (I met with Daniel Robbins
in person a couple times and we tried to get stuff to work, but that was after
he left gentoo and started funtoo).
Eventually the Alpine Linux guys came along and built a distro around the work
I'd done (after I'd already left it behind, but hey). And the _least_ crazy
distro bootstrapping turned out to be debian, strangely enough. (Still pretty
crazy, but not AS crazy. You can debootstrap on a non-debian system, and the
repo format is more or less a web server with a defined directory structure and
some index files, although it's still hard enough to reverse engineer that
nobody's made a musl-based debian system yet so _I_ don't have to...)
Once I got the basic system working, my further goals were:
https://landley.net/aboriginal/about.html#next
And in 2011 I retargeted based on Android:
https://landley.net/notes-2011.html#13-11-2011
Which caused people with no skin in the game to get really angry:
https://lwn.net/Articles/478308/
("In the beginning, the universe was created. This has made a lot of people very
angry and been widely regarded as a bad move." - The Hitchhiker's Guide to the
Galaxy.)
So now I'm trying to turn Android into a self-hosting development environment
(bootstrapping AOSP under my minimal system instead of something like debian).
Android has a "no GPL in userspace" policy (thank you EVER so much FSF for
shoving GPLv3 down everybody's throats in 2007 and having the GPLv2 baby thrown
out with the GPLv3 bathwater). So the ONLY package of the aboriginal 7 I can
still use is the Linux kernel. Luckily I talked the musl-libc maintainer into
switching to a BSD license shortly after he published what had been his personal
project (he read my blog, and asked about
https://landley.net/notes-2012.html#18-01-2012 and we had a long talk on IRC
about it), although bionic has also filled in 2/3 of the gap between it and musl
since 2015. And LLVM started seriously trying to build the linux kernel in 2011,
and that's reasonably load bearing now (with Android deprecating gcc in 2016 and
removing it as a build option in 2020, they're all LLVM all the time now). I
still need to implement all the commands I was using out of busybox:
https://landley.net/toybox/roadmap.html#dev_env
Plus make and bash, which can't be external gpl packages _and_ ship in the
android base image.
So the current path to self-hosting AOSP starts with me reproducing a modern
Linux From Scratch build under mkroot (toybox's new built-in system builder, a
327 line bash script) to make sure I've got the tools fleshed out, because
LFS is much smaller than AOSP, but in theory AOSP (or at least the prerequisite
packages AOSP needs) should build under it.
Also, the AOSP build is kinda incestuously tied to git, so to build AOSP toybox
probably needs to implement git. (There's a talk about build environment vs
development environment I've given repeatedly before and won't reproduce here.)
I hope it DOESN'T need ninja (if I can still build ninja from source with make...)
And of course there's the "countering trusting trust" part:
http://lists.landley.net/pipermail/toybox-landley.net/2020-July/011898.html
Posix is a sideline to all this: the important thing is does it WORK?
>>>>> This is completely unspecified behavior.
>>>>
>>>> The standard is not complete, yes.
>>>
>>> A different interpretation. There's plenty of unspecified and
>>> implementation-defined behavior.
>>
>> Bash is an implementation, defining behavior. There may be version skew, but it
>> does something specific. I just have to think of what questions to ask.
>
> That's not the same thing. More useful for your purposes, maybe, but still
> different.
I wish for a better standard. Posix isn't it.
>> Currently. Posix didn't always exist, the Linux Standard Base was making good
>> progress until the Linux Foundation's accretion disk swallowed it, man7.org was
>> decent until Michael Kerrisk retired and handed off to a guy who doesn't
>> maintain a current web version...
>
> If all you're interested in is Linux, then sure.
The majority of the toybox commands also run on MacOS and FreeBSD, and people
have ported it to QNX and such (from which I get bug reports, but nobody's sent
me a build/test environment or a defconfig indicating which commands work for
them). If the AIX guys wanted to engage with me, I'm not hard to find.
But I'm not trying to get those self-hosting. Nor am I trying to become their
standard command line utility set (although if somebody like QNX wanted to talk
about that I'm game; FreeBSD is committed to its history and Apple is its own
ecosystem behind a paywall topped with spikes and broken glass).
>> For years the de-facto spreadsheet standard was Microsoft Excel and the word
>> processing file format standard was Microsoft Word. They SUCKED, but had vastly
>> dominant market share. And every weird corner case of their behavior was part
>> of that standard.
>>
>> Then Star Division cloned compatible versions that could read and write those
>> files in Star Office,
>
> Yes, I used Star Office when I ran FreeBSD on my desktop for a while.
"Ah yes, Team OS/2. Those pink things running about."
C64->Amiga->DOS->Desqview->OS/2->Red Hat->Knoppix->Ubuntu->Devuan. In between
Desqview and OS/2 there were a couple years of Solaris workstations at Rutgers,
but they cost more than my car and I never particularly desired to own five
figures of hardware depreciating at 50% every 18 months. (Sheesh, with inflation
probably six figures in today's "The US Dollar is a trademark of Visa and
Mastercard".)
My response to Windows was to wait for it to end. My response to AOL was to wait
for it to end. My response to Faceboot was to wait for it to end. With a
whitebox PC I could switch vendors, with a whitebox Linux I could switch
distros. I'm not one of those people who found a lungful of air they liked in
their 20's and refused to change it ever since. You can't start life on an 8-bit
system and NOT constantly have an exit strategy.
Although giving up the GPL was hard. I went through the Kübler-Ross stages of
grief circa 2009 (https://landley.net/notes-2009.html#09-05-2009 and such)
before doing 0BSD and getting it into SPDX and Github's choose-a-license and so
on...
>> The point is, once you have two independent implementations, the subset both
>> support becomes a lot more standard-shaped. This was the IETF way back in the
>> day, "rough consensus and running code". The bake-offs were to get multiple
>> interoperable implementations. You NEED two, or this doesn't work. :)
>
> Sure. But when you get beyond two, that intersecting subset becomes a lot
> smaller, and the number of parties with skin in the game gets a lot larger.
> That's why you have so much implementation defined behavior in the
> standard. If you want to walk the road you did, and say "this
> implementation is the standard one for me," then that's fine, but you're
> not going to be successful getting other implementations to walk that same
> road a lot of the time.
The people who run Linux binaries on those other operating systems could build a
support layer because they knew what success looked like and were sufficiently
motivated to make it work. Admittedly, given the timeframe most of them started,
"the flash plugin" was a big driver, but windows running ubuntu binaries was
almost certainly because they wanted Azure to run Linux containers...
>>> What are you using now?
>>
>> $ bash --version
>> GNU bash, version 5.0.3(1)-release (x86_64-pc-linux-gnu)
>
> Jesus, your distro can't even be bothered to apply all the patches for a
> single version?
Devuan is a thin skin over Debian: when I ask about this sort of thing on the
#devuan libera.chat channel they point me at
https://packages.debian.org/search?keywords=bash and similar.
(I added the backports repo to my /etc/apt/sources.list but I have to point at a
specific package and tell it to grab the new version of that. Otherwise they
expect me to get non-bugfix upgrades via dist-upgrade, and I always do a full
backup and reinstall when moving major versions because I despise unreproducible
magic dependencies. Which also means I put them off until forced, so... six of one?)
> $ ../bash-5.0-patched/bash --version
> GNU bash, version 5.0.18(10)-release (x86_64-apple-darwin18.2.0)
>
> This is what makes getting bug reports difficult.
Elliott shields me from most of this for Android. :)
(I've tried copying a toybox binary onto my phone with adb a couple times, but
when I do, it hasn't got permissions to _do_ anything. Yeah yeah, the evil maid
problem, I know. I keep meaning to root my phone, but I use it as a phone and
every time I get a new phone and try to root the old phone it never goes well. I
break everything, and AOSP inevitably drops support for the old hardware I'm
using LONG before I migrate off of it. I tend to stick with phones for years
after the updates stop. The one before this got rained on. The one before that
the screen smashed and replacement parts for it were no longer manufactured.
"The cobbler's children have no shoes"...)
> Chet
Rob