[Toybox] hexdump tests.

Fri Mar 22 16:22:51 PDT 2024

On 23/3/24 07:02, toybox-request at lists.landley.net wrote:
> Date: Fri, 22 Mar 2024 13:02:18 -0700
> From: enh <enh at google.com>
> To: Rob Landley <rob at landley.net>
> Cc: toybox at lists.landley.net
> Subject: Re: [Toybox] hexdump tests.
> Message-ID:
> 	<CAJgzZopopatRv6Kx8HnHc0x5SL=kpFs7eSRGf9Mz7LX7NjaDgA at mail.gmail.com>
> Content-Type: text/plain; charset="UTF-8"
>
> On Mon, Mar 18, 2024 at 6:04 AM Rob Landley <rob at landley.net> wrote:
>> On 3/15/24 16:24, enh wrote:
>>>      Sure, but that said some tests _DO_ care about the exact amount of whitespace
>>>      (are columns aligned), or tabs vs spaces.
>>>
>>> i know what you mean, but at the same time, i'm struggling to think of a single
>>> case i've been involved with where the "upstream" tool hasn't screwed me over by
>>> doing something stupid sooner or later...
>> Yup. And yet...
>>
>> I'm thinking maybe strip _trailing_ whitespace? It's not user-visible and I
>> can't think of an instance where it's semantically relevant. (LEADING whitespace
>> is semantically relevant all the time, interstitial a lot too. But trailing
>> generally shouldn't BE there...)
>>
>>> CANONICALIZE_SPACE_IF_RUNNING_HOST_VERSION=1? so we trust ourselves but no-one
>>> else? :-)
>> I _don't_ trust myself, and I'm not special. (That's policy.)
> yeah, but that's why i suggested
> CANONICALIZE_SPACE_IF_RUNNING_HOST_VERSION --- that way we can say "we
> can't make hard assertions about the _host's_ whitespace, but we can
> still make hard assertions about _ours_". if we just canonicalize all
> the whitespace all the time, we can't (say) ensure that columns line
> up or whatever.
>
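A rough sketch of the trade-off being discussed here, in plain POSIX shell. The helper names strip_trailing and canonicalize_all are made up for illustration and are not part of the toybox test suite: stripping only trailing whitespace still lets a test notice a misaligned column, while squashing every run of whitespace hides it.

```shell
# Hypothetical helpers (not from the toybox test suite).
# strip_trailing: drop whitespace at end of line only.
strip_trailing() { sed 's/[[:space:]]*$//'; }
# canonicalize_all: collapse every whitespace run to one space, trim both ends.
canonicalize_all() { sed 's/[[:space:]][[:space:]]*/ /g; s/^ //; s/ $//'; }

aligned='0000  41 42'        # expected output, two spaces after the offset
trailing='0000  41 42   '    # same, plus invisible trailing spaces
misaligned='0000 41 42'      # column gap is wrong

a=$(printf '%s\n' "$aligned" | strip_trailing)
t=$(printf '%s\n' "$trailing" | strip_trailing)
m=$(printf '%s\n' "$misaligned" | strip_trailing)
[ "$a" = "$t" ] && echo "strip_trailing: trailing spaces ignored"
[ "$a" != "$m" ] && echo "strip_trailing: misaligned column still detected"

ca=$(printf '%s\n' "$aligned" | canonicalize_all)
cm=$(printf '%s\n' "$misaligned" | canonicalize_all)
[ "$ca" = "$cm" ] && echo "canonicalize_all: misaligned column no longer detected"
```
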
>>>      The problem is "dump hex" isn't a big enough job that pulling it out into a
>>>      library function that can be shared is really a win. It's another one of those
>>>      "the fighting is so vicious because the stakes are so small" things. Maybe if I
>>>      could genericize the "show hex in 4 digit groups, now do octal!" variants into
>>>      some sort of engine... but I worry that the glue to call the engine would be
>>>      bigger than any savings.
>>>
>>> od and hexdump are weird there in that the former lets you express quite a large
>>> variety of different dumps, and the latter (i think) pretty much anything. i
>>> have wondered whether the others can't mostly be written in terms of hexdump.
>>> (xxd still has all the reverse stuff, but as long as no-one else does, that's
>>> not duplication.)
>> Yeah, it _seems_ like there's something I can do there, but I'm tired of being
>> distracted by it.
>>
>>>      *shrug* Punt that for a potential post-1.0 cleanup pass, and lump it in the
>>>      meantime...
>>>
>>> yeah, like you say, these are some of the simplest commands anyway. i'd be a lot
>>> more worried if we had four seds or four shells :-)
>> At the end of my tenure, busybox had FIVE shells, although that last one was my
>> fault and two of them were the "xkcd standards" problem.
>>
>> Erik did lash (lame-ass shell) to be tiny, Ash was the bigass lump of complexity
>> copied out of debian or some such and nailed to the side of the project by that
>> insane Russian developer who never did learn English and communicated entirely
>> through a terrible translator program (so any conversation longer than 2
>> sentences turned into TL;DR in EITHER direction, he was also hugely territorial
>> about anybody else touching "his" code), and msh was the minix shell mostly used
>> on nommu systems.
> did lash _stay_ tiny? i feel like the trouble with projects like that
> is usually that no-one can agree on what's necessary versus bloat, so
> you trend towards just being a bad implementation of whatever. iirc
> inferno had _two_ different "tiny" shells.
>
>> Somebody then started hush as the "one shell to rule them all" replacement but
>> work on it petered out. Not sure whose baby that was because the entire busybox
>> community collapsed at about the same time: Erik Andersen ran a startup and got
>> so overworked his marriage nearly collapsed, Manuel Nova's girlfriend died,
>> Glenn McGrath tried a GPL enforcement action down in australia/new zealand and
>> it left such a bad taste in his mouth he quit open source development entirely,
>> Mike Frysinger started maintaining separate for-profit forks of every project he
>> touched and never pushing anything upstream which eventually resulted in the
>> blackfin architecture (his dayjob) being declared dead and yanked from
>> linux/arch and never even making it into qemu... And that's ignoring the whole
>> uclibc->buildroot saga...
>>
>> *shrug* Hush dying was pretty minor in context: the busybox community imploded
>> and I stepped in to prop up what I could until Bruce went "you, volunteer who is
>> mopping the floors, you're doing it wrong, do it MY WAY, I have _seniority_ and
>> you've been doing everything in my name all along anyway whether you know it or
>> not"...
>>
>> Anyway, before all that happened I printed out the bash man page into a 3 ring
>> binder to read on the bus and started my own "one shell to rule them all",
>> bbsh.c, and work ended on that when bruce chased me off busybox. Denys removed
>> it pretty early on in his tenure, but as far as I'd gotten was what was checked
>> in to pending until the current round of shell work started...
>>
>>>      Yes I saw your email in the other thread about pending not being granular
>>>      enough, but didn't really have anything coherent to say in response? I see
>>>      pending as an unfinished todo heap I need to drain, and I feel bad for not
>>>      cleaning it up fast enough. Doing non-cleaning work there is like organizing
>>>      trash piles. Attempting to categorize the bulk wasn't an unambiguous win even
>>>      for toys/ which is _intended_ to keep growing rather than shrink, so adding it
>>>      to pending doesn't appeal. I don't really want to spend architectural design cycles
>>>      on scaffolding that gets torn down again.
>>>
>>> indeed.
>>>
>>> i think the only half-way practical idea i had was "keep pending but just switch
>>> to a much scarier name".
>> I need to clean it all up. I just haven't quite gotten my groove back
>> post-pandemic and people keep submitting distracting bug reports for the
>> existing code. (The downside of having users: they find stuff.)
>>
>>> because, to be fair to the confused, in english
>>> "pending" _can_ legitimately mean "almost there". whereas your whole point with
>>> pending is "i actually have _no_ idea how close this is yet".
>> Linux has drivers/staging but I didn't like that.
> yeah, "staging" also sounds very much like "nearly there!".

What about using the very old IBM standard "workingonit" directory? It lets
people know where you're at while still accepting 'useful' submissions.

They weren't always useful, but it did cut the dross down quite a bit when
first implemented.

jon

>
>>> if i _had_ to do
>>> something today, renaming "pending" to "experimental" is probably where i'd
>>> land.
>> Which has the same "git log/annotate doesn't follow renames by default" problem
>> that moving everything BACK out of toys/*/blah.c into toys/blah.c has. (And
>> there's no short option to do it either.)
>>
>>> but then this would look rather suspect over in the aosp build system:
>>>
>>>      "toys/pending/diff.c",
>>>      "toys/pending/expr.c",
>>>      "toys/pending/tr.c",
>>> ...
>>>      "toys/pending/brctl.c",
>>>      "toys/pending/getfattr.c",
>>>      "toys/pending/lsof.c",
>>>      "toys/pending/modprobe.c",
>>>      "toys/pending/more.c",
>>>      "toys/pending/stty.c",
>>>      "toys/pending/traceroute.c",
>>>      "toys/pending/vi.c",
>>>
>>> :-)
>> It _is_ somewhat suspect. But then so is mkroot enabling ifconfig and sh.
>>
>>> oh, btw, i realized next time i tried why i've struggled to make myself do
>>> `toys/*/foo.c` ... it's because the shell won't tab-expand through the `/*/`
>>> even if only one directory matches, and i tab complete all the time :-(
>> That is a downside, yes.
>>
>> However, if everything was in one big directory, none of it would be in pending
>> or example...
> yeah, to be clear: i just meant all the default 'y' stuff.
>
>> Rob
>>
>> P.S. I need to do more work on the shell conditional execution stuff, because:
>>
>> $ export ZAP=42
>> $ echo ${ZAP=$(echo potato >&2)then}
>> 42
>> $ echo $ZAP
>> 42
>> $ unset ZAP
>> $ echo ${ZAP=$(echo potato >&2)then}
>> potato
>> then
>> $ echo $ZAP
>> then
>>
>> Means I can theoretically go ${VERSION:=$(git describe)} without the command
>> getting run when it isn't needed, but right now bash gets that right and toysh
>> doesn't yet...
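
For anyone wanting to check the behaviour in the P.S. without reading stderr, here is a minimal sketch, assuming a POSIX shell that behaves the way the bash transcript above does. The function name expensive, the v1.2.3 string and the temp file are all made up so the side effect can be counted; a real use would put something like git describe there instead.

```shell
# Hypothetical stand-in for a costly command: each call appends a line to
# a temp file, so we can count how many times it actually ran.
calls=$(mktemp)
expensive() {
  echo ran >> "$calls"
  echo v1.2.3
}

VERSION=preset
: "${VERSION:=$(expensive)}"   # VERSION is set: the default is not expanded
first_calls=$(wc -l < "$calls")

unset VERSION
: "${VERSION:=$(expensive)}"   # VERSION is unset: expensive runs, result is assigned
second_calls=$(wc -l < "$calls")

echo "calls while set: $((first_calls)), calls after unset: $((second_calls)), VERSION=$VERSION"
rm -f "$calls"
```

The ":" builtin is only there to give the expansion somewhere harmless to happen; the assignment is a side effect of the := form.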