<div dir="ltr">are you sure? remember that if a test is currently checked in [and Android uses the toy; my test runner reports but ignores failures for tests where readlink says it's not actually toybox], that means that the tests pass on Android.<div><br></div><div>there's even a CTS test:</div><div>```</div><div>TEST(wchar, wcwidth_non_spacing_and_enclosing_marks_and_format) {<br> if (!have_dl()) return;<br><br> EXPECT_EQ(0, wcwidth(0x0300)); // Combining grave.<br> EXPECT_EQ(0, wcwidth(0x20dd)); // Combining enclosing circle.<br> EXPECT_EQ(0, wcwidth(0x00ad)); // Soft hyphen (SHY).<br> EXPECT_EQ(0, wcwidth(0x200b)); // Zero width space.<br>}</div><div>```</div><div><br></div><div>my guess is that you're using a statically-linked binary? bionic doesn't have a "static libdl", so when it tries to dlopen() icu4c to handle an i18n question, that'll fail and in most cases bionic will fall back to "what do i know about ASCII?" but otherwise report failure. (that's what the first line of the test is checking too --- "if we're the static version of the tests, skip this test because this isn't available".)</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Oct 18, 2021 at 11:39 PM Rob Landley &lt;<a href="mailto:rob@landley.net">rob@landley.net</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Bionic's wcwidth() returns -1 (error) for combining characters, where glibc and<br>
musl return 0 (does not increase the collective width of the displayed<br>
characters). This means crunch_str() can't measure the length of the output, so<br>
cut -C behaves like cut -c.<br>
<br>
I admit the man page is written a bit confusingly, but combining characters are<br>
technically printable, and therefore should have a length of at least zero.<br>
<br>
Rob<br>
<br>
P.S. This whole area is funky because the single dumbest thing about unicode is<br>
that combining characters go _after_ the character they combine with, meaning<br>
you can never tell when you've finished parsing a character until you've gone<br>
PAST it and parsed a character that does NOT attach to this one. Plus whenever<br>
you get short input (typing, serial input, etc) your terminal keeps rewriting<br>
the same character over and over every time it gets a new combining character<br>
that changes how the last character should render. If the combining characters<br>
came BEFORE the non-combining character, the non-zero length character would<br>
flush all the pending combining characters and you'd draw the resulting glyph<br>
ONCE. But alas, Microsoft was on the unicode committee.<br>
</blockquote></div>
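<div><br></div><div>for anyone following along, here's a rough sketch of why a -1 from wcwidth() breaks width measurement while a 0 does not. display_width() is a hypothetical helper, not toybox's actual crunch_str(), and the two stub functions only model the behaviours described in this thread (0 for a combining grave on glibc/musl, -1 from a statically linked bionic binary), so the result doesn't depend on the local libc or locale:</div>

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper (NOT toybox's crunch_str()): sum the column widths of
 * a wide string, giving up with -1 as soon as the width function reports an
 * error for any character. */
int display_width(const wchar_t *s, int (*width_fn)(wchar_t))
{
  int total = 0;

  assert(s != NULL && width_fn != NULL);

  for (; *s; s++) {
    int w = width_fn(*s);

    if (w < 0) return -1;  /* one unmeasurable character poisons the total */
    total += w;
  }

  return total;
}

/* Stub modelling the glibc/musl result described above: U+0300 is width 0. */
int width_zero_for_combining(wchar_t wc)
{
  return wc == 0x0300 ? 0 : 1;
}

/* Stub modelling the reported static-bionic result: U+0300 is an error. */
int width_error_for_combining(wchar_t wc)
{
  return wc == 0x0300 ? -1 : 1;
}
```

<div>with the string 'e' followed by U+0300, the first stub gives a total of 1 column and the second gives -1, which is the "can't measure the length of the output" case. passing the real wcwidth() instead of a stub should also work, but that needs setlocale() with a UTF-8 locale first, and what it returns depends on the libc and how the binary was linked.</div>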