[Toybox] utf8 display question.

Rob Landley rob at landley.net
Sat Oct 28 20:58:36 PDT 2017


On 10/26/2017 11:31 PM, scsijon wrote:
> Not just Japanese, most kanji is usually double-width, some abjad (think
> Arabic for simplicity) and a few odd others also use a mix of single and
> double width characters. There are also a few that mix half-width and
> single-width, and some even have triple-width to contend with.

That part I knew. My question is about an aesthetic design issue: what
should "cut by columns" do when asked to cut inside a double-width
character? Left-aligning to whole characters is consistent, but doesn't
guarantee your output fits in a given space.

There are conflicting guarantees here, and you can't satisfy all of them
simultaneously. If you really care you can cut out a field and then
cut it again starting at 1, because the second cut will round down
within the field. A la:

  cut -C 11-20 | cut -C 1-10

(The 1 skips nothing so can't expand, and the 10 rounds down...)

So that's what I went with. Fixing the rest of it now. (POSIX says "cut
-f" should display the line if the delimiter isn't encountered, -F needs
tests, and -F shouldn't abort on a zero-length match but should probably
treat it as "no delimiter"...)

Rob


