[Toybox] vi 'b' command broken
enh
enh at google.com
Thu Jan 4 16:10:58 PST 2024
On Tue, Jan 2, 2024 at 4:28 PM Rob Landley <rob at landley.net> wrote:
>
> On 1/2/24 17:21, enh wrote:
> >> > if you really care, not even icu4c (my usual answer to such
> >> > questions, and something bionic regularly forwards such questions to),
> >> > you want to talk to something like
> >> > https://en.wikipedia.org/wiki/HarfBuzz instead --- this shit gets
> >> > weird, fast.
> >>
> >> Yes, but that's not really the question I'm asking.
> >
> > no, but it's the question you actually _need_ to ask if you're worried
> > about doing something _useful_
>
> I'm worried about implementing unicode-aware interactive line editing for toysh,
> which may someday get retrofitted onto the vi implementation but for now that's
> not my problem.
>
> The way I _thought_ fold worked is how line editing has to work: backspace
> undoes the previous character, including jumping back to the start of variable
> width tabs, so I've got to checkpoint the previous position for backspace to
> return to.
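
A minimal sketch of that checkpoint idea, assuming a fixed-size line buffer and 8-column tab stops. The names line_state, line_put and line_backspace are made up for illustration and are not toybox code. Each typed character saves the column it started from, so backspace can return there even across a tab:

```c
#include <stddef.h>

#define MAX_LINE 256  /* hypothetical fixed limit for this sketch */
#define TAB_STOP 8    /* assumed tab stop spacing */

struct line_state {
  int col;             /* current cursor column, 0-based */
  int prev[MAX_LINE];  /* column the cursor was at before each character */
  size_t len;          /* number of characters currently on the line */
};

/* Record a typed character that is `width` columns wide. Pass a negative
 * width for a tab, which advances to the next tab stop instead.
 * Returns 0 on success, -1 if the line is full. */
int line_put(struct line_state *ls, int width)
{
  if (ls->len >= MAX_LINE) return -1;
  ls->prev[ls->len++] = ls->col;  /* checkpoint for backspace */
  if (width < 0) ls->col += TAB_STOP - (ls->col % TAB_STOP);
  else ls->col += width;
  return 0;
}

/* Undo the last character. Returns how many columns to move left
 * (0 if the line is empty or the character was zero width). */
int line_backspace(struct line_state *ls)
{
  int back;

  if (!ls->len) return 0;
  back = ls->col - ls->prev[--ls->len];
  ls->col = ls->prev[ls->len];
  return back;
}
```

Typing two single-width characters, a tab, and a double-width character moves the column 0, 1, 2, 8, 10; the first backspace then moves left 2 and the second moves left 6, back to the column before the tab.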
>
> There are various horrible alternatives, including sending the ANSI position
> query after every keystroke, or jumping to the left edge and rewriting the
> entire line each time with a "clear to end of line" sequence at the end, but
> I'd rather use a solution that ISN'T crazy.
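
For reference, the position query mentioned above is ESC [ 6 n, and the terminal answers with ESC [ row ; col R. A rough sketch of parsing that reply, assuming it has already been read from the terminal into a NUL-terminated buffer (parse_cursor_report is a made-up name, and reading the reply in raw mode is left out):

```c
#include <stdio.h>

/* Cursor position query; write this to the terminal to request a report. */
const char cursor_query[] = "\033[6n";

/* Parse a reply of the form ESC [ row ; col R.
 * Returns 0 and fills in the 1-based row and col on success, -1 otherwise. */
int parse_cursor_report(const char *reply, int *row, int *col)
{
  char end = 0;

  if (sscanf(reply, "\033[%d;%d%c", row, col, &end) != 3 || end != 'R')
    return -1;
  if (*row < 1 || *col < 1) return -1;
  return 0;
}
```

The cost is a round trip to the terminal for every keystroke, which is the objection raised above.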
>
> > --- it's probably better to think of
> > some scripts as "nothing but combining characters".
>
> Then what do they combine _with_?
https://github.com/n8willis/opentype-shaping-documents/blob/master/opentype-shaping-arabic.md
> I tried putting an umlaut on low ascii characters. It didn't even work with "tab"...
>
> >> How often do new unicode
> >> tables come out and do they ever really make big changes?
> >
> > "about one/year" [citation needed?
> > https://en.wikipedia.org/wiki/Unicode#Versions]
> >
> >> There are only 1.1
> >> million possible values, this is not a big table of numbers in a modern
> >> computing context, and there presumably ARE answers?
> >
> > my point is that it's the _combinations_ that are interesting. that's
> > why i mentioned harfbuzz.
> > https://harfbuzz.github.io/why-do-i-need-a-shaping-engine.html is a
> > good high-level intro (the paragraph containing the word "arabic" in
> > particular).
>
> Um... if combining characters change the width of the base character, I think
> I'm just plain gonna get the fontmetrics wrong there. I don't see how I can
> avoid it.
>
> >> Anyway, why is this NOT a couple bitmaps for 0 and 1 and an if/else staircase
> >> for oddballs, else size 2. I'm aware the xfce terminal isn't exactly canonical,
> >> and maybe it's printing something when it shouldn't, but this is the question
> >> I'm trying to ask with wcwidth(). When I print this, how many columns does that
> >> consume on the terminal? It's giving a width to these characters.
> >
> > (see the harfbuzz documentation for why "character width" isn't a
> > meaningful concept for all the world's scripts :-) )
>
> Then I can't support all the world's scripts.
>
> The perfect is the enemy of the good. I want to figure out the subset I _can_
> support. And right now, it's not handling japanese.
>
> If I have to make simplifying assumptions, then "low ascii is weird", and every
> other unicode codepoint is either 0, 1, or 2 columns wide, and maybe I need to
> handle the right to left direction switching codepoints but I'm not entirely
> sure how.
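
A rough sketch of what those simplifying assumptions could look like, using small range tables instead of bitmaps. The ranges listed are a handful of illustrative examples, not a complete or authoritative table, cp_width is a made-up name, and anything not listed defaults to 1 here (that default is an assumption of this sketch):

```c
#include <stddef.h>

struct range { unsigned lo, hi; };

/* Illustrative zero-width ranges only; a real table would be generated
 * from the Unicode data files. */
static const struct range zero_width[] = {
  {0x0300, 0x036F},  /* Combining Diacritical Marks */
  {0x200B, 0x200F},  /* zero width space/joiners, direction marks */
};

/* Illustrative double-width ranges only. */
static const struct range double_width[] = {
  {0x1100, 0x115F},  /* Hangul Jamo leading consonants */
  {0x3041, 0x3096},  /* Hiragana */
  {0x30A1, 0x30FA},  /* Katakana */
  {0x4E00, 0x9FFF},  /* CJK Unified Ideographs */
  {0xAC00, 0xD7A3},  /* Hangul Syllables */
  {0xFF01, 0xFF60},  /* Fullwidth forms */
};

static int in_ranges(unsigned cp, const struct range *r, size_t n)
{
  size_t i;

  for (i = 0; i < n; i++) if (cp >= r[i].lo && cp <= r[i].hi) return 1;
  return 0;
}

/* Columns used by one codepoint: -1 for control characters ("low ascii
 * is weird", the caller handles tab and friends), else 0, 1, or 2. */
int cp_width(unsigned cp)
{
  if (cp < 0x20 || (cp >= 0x7F && cp < 0xA0)) return -1;
  if (in_ranges(cp, zero_width, sizeof(zero_width)/sizeof(*zero_width)))
    return 0;
  if (in_ranges(cp, double_width, sizeof(double_width)/sizeof(*double_width)))
    return 2;
  return 1;
}
```

With these tables a combining diaeresis (U+0308) is 0 columns, 'a' is 1, and hiragana, CJK ideographs, and Hangul syllables are 2. Right-to-left handling is not addressed at all.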
>
> It sounds like getting this perfect is a full-time job for a dedicated domain
> expert, and even they can't package it up in a useful fashion so people who
> AREN'T domain experts can ask simple questions that get answers. (If the unicode
> consortium produced a mess that goes non-Euclidean in places, I only have so
> much brain to try to understand the results with.)
right, but then i'm back to "why don't you just trust wcwidth() and
move on with your life?" :-)
isn't that all the competition is doing? (i actually have no idea ---
i don't speak any rtl languages, so korean is the most exotic thing
i've ever done at the prompt, and that's not really any more
complicated than german in this sense.)
> Rob