[Toybox] wc -L

Rob Landley rob at landley.net
Mon Feb 5 11:02:04 PST 2024


It was on my todo list because of
http://lists.landley.net/pipermail/toybox-landley.net/2023-November/029887.html
but that patch just did a basic "length++" count of characters, which I thought
was all that was necessary until I started in on the test cases...

Rob

On 2/5/24 11:33, enh wrote:
> huh, i'd never heard of this (and it's not used in any of the code i
> have access to), but debian shows it's used a bit:
> https://codesearch.debian.net/search?q=wc%5C+-L&literal=0
> 
> On Mon, Feb 5, 2024 at 9:09 AM Rob Landley <rob at landley.net> wrote:
>>
>> Who was it who asked for wc -L again? Because here's what the debian version is
>> doing:
>>
>> $ echo a | wc -L
>> 1
>> $ echo -n a | wc -L
>> 1
>> $ echo -e '\ta' | wc -L
>> 9
>> $ echo -e '\t\b' | wc -L
>> 8
>> $ echo -e '\t\b\bx' | wc -L
>> 9
>> $ echo -e '\t\b\b\b' | wc -L
>> 8
>> $ echo -e 'abc\td' | wc -L
>> 9
>> $ echo -e 'abc\bd'
>> abd
>> $ echo -e 'abc\bd' | wc -L
>> 4
>> $ echo -e '\x01' | wc -L
>> 0
>> $ echo -e 'w\xcc\x88'
>> ẅ
>> $ echo -e 'w\xcc\x88' | wc -L
>> 1
>> $ wc -m tests/files/utf8/japan.txt
>> 25 tests/files/utf8/japan.txt
>> $ wc -L tests/files/utf8/japan.txt
>> 50 tests/files/utf8/japan.txt
>> $ wc -c tests/files/utf8/japan.txt
>> 75 tests/files/utf8/japan.txt
>>
>> So wc -L isn't QUITE the fold.c logic, because it treats backspace like any
>> other low ASCII control character (i.e. width zero). But otherwise it
>> expands tabs and measures unicode wide and combining characters.
>>
>> Alas, my first naive implementation... didn't do all that yet.
>>
>> Rob
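The behavior in the transcript above can be approximated with a short sketch. This is Python rather than toybox's C, and it is not the GNU or toybox implementation: the names max_line_width, char_width, and TAB_STOP are made up for illustration, the 8-column tab stop is inferred from the '\ta' result, and Python's unicodedata tables stand in for wcwidth(). It has only been checked against the cases shown in the transcript.

```python
import unicodedata

TAB_STOP = 8  # assumption: tab stops every 8 columns, inferred from '\ta' -> 9


def char_width(ch):
    """Approximate display width of one character (a stand-in for wcwidth())."""
    category = unicodedata.category(ch)
    # Control characters (including backspace), format characters, and
    # combining/enclosing marks occupy no columns.
    if category in ("Cc", "Cf", "Mn", "Me"):
        return 0
    # East Asian Wide and Fullwidth characters occupy two columns.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1


def max_line_width(text):
    """Return the widest line in text, measured in display columns."""
    widest = 0
    column = 0
    for ch in text:
        if ch == "\n":
            widest = max(widest, column)
            column = 0
        elif ch == "\t":
            # Advance to the next tab stop.
            column += TAB_STOP - column % TAB_STOP
        else:
            column += char_width(ch)
    # A final line without a trailing newline still counts.
    return max(widest, column)
```

Note that backspace never moves the column back here, which is the difference from the fold.c logic: max_line_width("abc\bd\n") comes out as 4, not 3.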

