[Toybox] [PATCH] hexedit: various improvements.

Rob Landley rob at landley.net
Tue Apr 20 02:27:03 PDT 2021


On 4/19/21 4:22 PM, enh via Toybox wrote:
> I've been using hexedit quite a lot, mainly for _corrupting_ files, and
> have been meaning to send this collection of changes for far too long
> now. I saw a bug requesting editing in the ASCII pane (which this patch
> _doesn't_ add), and wanted to get this sent in before it has to undergo
> the third massive merge conflict of its existence...

You can tell I never did the "remove sleep deprivation artifacts" polishing pass
on this command, it still had "char broiled;" in it...

This is more changes than I have time to go over with a fine tooth comb right
now, but A) I trust you, B) you're the one driving new feature additions. (I was
happy with the old one, you're not, so I should get out of the way.)

Applied.

> The main "TODO" in this is that I never got round to implementing
> searching for an arbitrary byte sequence. It seems like we ought to have
> that feature, but personally I'm far more likely to jump to an offset or
> to search for some ASCII. I haven't needed to search for arbitrary byte
> sequences in all this time, so I'll fix this if/when I actually need
> it...

I might take a stab at it, but don't let that "maybe" stop you if you feel inspired.

> * Enter (new) read-only mode rather than refusing to open read-only
>   files.
> 
> * More keys: page up/page down, home/end, and ctrl-home/ctrl-end for
>   beginning/end of file.
> 
> * Jump with ^J (or vi-like :). Enter absolute address or +12 or -40 for
>   relative jumps.
> 
> * Find with ^F (or vi-like /). No support for bytes, but useful for
>   finding text. (^G or n for next match, ^D or p for previous match.)
> 
> * Support all the usual suspects for "quit": vi-like q, desktop-like ^Q,
>   panic ^C, or even plain old Esc.

Sure.

> * The ASCII pane is made more readable by (hopefully) reasonable use of
>   color.

When editing a primarily text file (which comes up a lot when I'm trying to
figure out if something is a space or a tab, for example), the red J for
newlines are REALLY ugly to me.

The commodore 64 (yes I am that old) used to naturally display nonprintable
characters in reverse video when they were quoted, which is what I implemented.
It's quite likely I'm biased by that, but it makes each displayed character unique.

(The red you've chosen is also _dark_, which makes it harder for me to see...)

> Regular control characters are shown in red using the
>   appropriate letter (so a red A is 0x01, etc),

Sigh. Not an aesthetic decision that appeals to me, but bikeshedding over color
is not a clear win for anybody.

> printable characters are
>   shown normally, and top-bit set characters are just shown as a purple
>   question mark (since I couldn't come up with a better representation
>   that had any obvious value

The previous representation had a unique display for each character.

> --- in my experience top-bit set characters
>   are either meaningless in ASCII, part of a UTF-8 sequence in modern
>   files, or in some random code page in ancient files).

It wasn't "correct", it was "unique".

Those of us who've memorized the ascii table (toybox has an "ascii" command for
a reason) could actually work out at a glance what each value is. (Well, once
we've been staring at hex dumps for a bit, anyway. You can recognize the same
3-character sequence easily enough when finding multiple instances of it...)

Zero through 31 was 64 through 95 shifted down and recolored, and 128 through
255 were the bottom 128 entries shifted _up_ and recolored. The awkward ones
were 127 and 255 (went with reversed space), and the low ascii part of the high
address space (128-159) had been remapped _twice_ and thus had a collison
needing a third color. I was never happy with those _not_ being reversed,
because reversed meant "nonprintable", but I hadn't introduced colors yet and
was out of shades of grey. :P

I understand you want to use foreground color instead and reserve "reversed" for
the cursor. I was reluctant to _require_ color because I dunno what variants of
"colorblind" actually come up these days. (Printing things out on paper is still
usually black and white but I dunno how relevant that is. Actual non-color
displays aren't as big a deal as they used to be...)

Still not entirely happy with it though. :(

> The choice of
>   red and purple was to deliberately make these not-actually-ASCII
>   characters slide into the background; before this patch they have so
>   many bright pixels (especially with the use of reverse video) that I
>   couldn't clearly see the *actual* ASCII content in the ASCII pane.

I wanted them to stand out when looking at a mostly ascii file. When dealing
with a significantly binary file I was mostly off looking at the hex side of the
force.

(I grew up with petascii representation so it's a native tongue to me. I admit
to being biased here. But your chosen representation... has its own issues.)

> * Addresses are now shown in yellow. No real justification other than "it
>   looks nice".

Define "yellow". (It's brown here. Missing a "bright" escape maybe?)

> * NUL bytes in the hex pane are shown dimmed. I find this helpful
>   especially when there's a lot of padding, and it can actually be a
>   useful clue when reverse engineering (you can "see" repeated patterns
>   more easily), but I can understand if this one's controversial.

I don't think we're going to get a representation that satisfies everybody.
Possibly there should be a command line flag or something?

> * Errors are shown "vim style" in bold white text on a red background,
>   waiting briefly to ensure they're seen.

A bit _too_ briefly for my tastes. Can we wait until they hit a key, maybe?

> * The status bar shows the filename, whether the file is opened
>   read-only, the current offset into the file, and the total
>   length of the file.

Way back when, I used the hexedit filename display line to test the utf8 string
measuring plumbing I was creating in lib/tty.c (before doing tar), I wonder if
it's still getting that right...

  $ mkdir sub5
  $ echo hello world > sub5/"$(cat tests/files/utf8/arabic.txt)"
  $ ./hexedit sub5/*

Not... exactly. Could be worse, though.

> * SIGWINCH handling has been added.

A distinct improvement, yes. Thanks.

Rob


More information about the Toybox mailing list