[Toybox] Does anyone here understand how unicode combining characters work?
Rob Landley
rob at landley.net
Thu Sep 27 07:10:04 PDT 2018
On 09/27/2018 08:53 AM, Rob Landley wrote:
> The low-ascii stuff is not related to unicode, yes. But it got swept up in the
> unicode changes and behavior changed when unicode support went in. And
> unfortunately, terminal programs differ and the Linux ctrl-alt-f1 text mode
> terminals differ from the xterms. (Haven't tried a frame buffer yet...)
>
> For example, when I do echo -e '\x02\x02\x03\x04x' on xfce xterm, I get 4 square
> boxes with digits in (I.E. uni-codepoint has no glyph, doo dah, doo dah)
> followed by x. But ctrl-alt-f1 text mode prints nothing and does not advance the
> cursor either, I just get the x on the first column. (I even tried "export
> TERM=linux" in both and it didn't change the behavior, that's orthogonal.)
>
> Hence filtering some of them out and not printing them if I dunno whether
> they'll advance the cursor or not.
P.S. I've got this commented-out note to self in my local tests/ls.test:
echo -e "$(X=0;while [ $X -lt 255 ];do X=$(($X+1));[ $X -eq 47 ]&&
continue;printf '\\x%02x' $X; done)"
Which I think was meant to create a torture test for ls -b display mode? A la
touch "$(that)" in an empty directory and ls -b it.
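
A rough sketch of how that torture test might be wired up, for anyone who wants
to reproduce it. The torture_name helper and the mktemp scratch directory are
made up for illustration and are not what tests/ls.test does. It uses octal
printf escapes instead of echo -e so it doesn't depend on a bash-style echo, and
it assumes the filesystem accepts arbitrary bytes in file names (the usual
Linux ones do):

```shell
# Hypothetical helper: emit every byte from 1 to 255 except '/' (47),
# which can't appear in a file name.
torture_name() {
  X=0
  while [ $X -lt 255 ]; do
    X=$((X+1))
    [ $X -eq 47 ] && continue
    # The inner printf produces a 3-digit octal number, the outer one
    # turns the resulting \NNN escape into the raw byte.
    printf "\\$(printf '%03o' $X)"
  done
}

# Scratch directory (an assumption, so nothing else shows up in the listing).
DIR=$(mktemp -d) || exit 1
NAME=$(torture_name)   # 254 bytes, ends in 0xff so $(...) strips nothing
( cd "$DIR" && touch "$NAME" && ls -b )
rm -rf "$DIR"
```

What ls -b prints for the high bytes depends on the locale and on which ls is
being run, so treat the output as something to compare between
implementations rather than as a fixed expected result.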
That says on this xterm, outputting ascii 0 doesn't display, 1-4 are boxes, 5 is
ignored, 6 is a box, 7-f aren't boxes but there are a couple of line breaks in
there (\b, \t, \r, and \n live in that range), then 0x10 through 0x1f are boxes
again.
Meanwhile, in Linux text mode the first non-space character printed is ! and if
I add an 'x' after the character printed each time it's:
xxxxxxx x
x
x
x|xxxxxxxxxxxxxxx x!x[and so on]
(Which is confused by \b and \r taking effect, but why is there a pipe after
ascii 16???)
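
Since the terminal eats or acts on some of these bytes, it can help to look at
the same stream as a byte dump and count positions from there. A minimal
sketch, limited to 1 through 31 for brevity (low_ascii_x is a made-up name, not
something in the test suite):

```shell
# Hypothetical helper: emit bytes 1 through 31, each followed by a literal 'x'.
low_ascii_x() {
  X=0
  while [ $X -lt 31 ]; do
    X=$((X+1))
    # \NNN octal escape for the control byte, then the marker character.
    printf "\\$(printf '%03o' $X)x"
  done
}

# Dump as octal bytes so each control byte's position is unambiguous
# regardless of what the terminal does with it.
low_ascii_x | od -An -t o1
```

Running low_ascii_x without the od pipe sends the raw bytes to the terminal,
which is the part that differs between xterm and the text console.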
> Going down ratholes most people never noticed the existence of, as usual.
Continuing down said rathole...
(I'm pretty sure "faking the linux VGA text mode behavior for low ascii
characters" is as close to "a standard" as we're likely to get here.)
Rob