<html><head></head><body>   <div dir="auto"><span style="color: var(--text-color); background: var(--bg-color);">On Thu, Apr 11, 2024 at 03:37, Jarno Mäkipää &lt;jmakip87@gmail.com</span><span style="color: var(--text-color); background: var(--bg-color);">&gt; wrote:</span><br></div><blockquote type="cite" class="protonmail_quote" dir="auto">  there is a slight difference between wctoutf8 and wcrtomb: wcrtomb<br>returns -1 if it's presented with an invalid char, or if the char is not<br>representable in the current locale. I think wctoutf8 only returns positive<br>integers.</blockquote><div dir="auto">wctoutf8 cannot fail because it writes even invalid Unicode code points out as utf8.</div><div dir="auto"><br></div><div dir="auto">This is another reason I asked if we could delegate the job of "Is this a valid Unicode code point" to the other Unicode code. With utf8towc we are not reading Unicode, we are reading utf8. If Unicode ever gets replaced, it’s not hard to imagine a new/different encoding system representing itself with utf8 (a very elegant, efficient way to represent this type of stuff). As long as there isn’t a security problem with it, checking validity there only makes the code less agnostic where it doesn’t really need to be.</div><div dir="auto"><span style="color: var(--text-color); background: var(--bg-color);" dir="auto"><br></span></div><div dir="auto"><span style="color: var(--text-color); background: var(--bg-color);" dir="auto">I remember from testing that if you pass the max unsigned int to wctoutf8, it writes a single 0xff byte, which is actually invalid utf8 (the theoretical max code point in utf8 is 2^31-1). 
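</span></div><div dir="auto">To spell out where that limit comes from, here is a minimal sketch. utf8_len is a hypothetical helper, not something in toybox, and it assumes a 32-bit unsigned. In the original 1-6 byte utf8 scheme the six-byte form has a 1111110x lead byte carrying 1 bit plus five continuation bytes of 6 bits each, 31 bits total, so a 32-bit value has nowhere to go (0xfe and 0xff are never valid utf8 bytes):</div>

```c
/* Hypothetical helper, not part of toybox: how many bytes the original
 * 1-6 byte utf8 scheme needs for wc, or -1 if wc cannot be encoded.
 * Assumes unsigned is 32 bits wide. */
int utf8_len(unsigned wc)
{
  if (wc > 0x7fffffff) return -1; /* needs 32 bits: no lead byte exists */
  if (wc > 0x3ffffff) return 6;   /* 1111110x + 5 continuation bytes: 31 bits */
  if (wc > 0x1fffff) return 5;    /* 111110xx + 4 continuation bytes: 26 bits */
  if (wc > 0xffff) return 4;      /* 11110xxx + 3 continuation bytes: 21 bits */
  if (wc > 0x7ff) return 3;       /* 1110xxxx + 2 continuation bytes: 16 bits */
  if (wc > 0x7f) return 2;        /* 110xxxxx + 1 continuation byte: 11 bits */

  return 1;                       /* 0xxxxxxx: plain 7-bit ASCII */
}
```

<div dir="auto">Under those assumptions 0x7fffffff still fits in six bytes, while the max unsigned int gets -1 instead of a stray 0xff.</div><div dir="auto"><span style="color: var(--text-color); background: var(--bg-color);" dir="auto">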
This is a situation where bounds checking seems sane; maybe an "if (wc > 0x7fffffff) return -1;" at the start of wctoutf8 would fix it?</span><br></div><div dir="auto"><span style="color: var(--text-color); background: var(--bg-color);" dir="auto"><br></span></div><div dir="auto"><span style="color: var(--text-color); background: var(--bg-color);" dir="auto">- Oliver Webb &lt;aquahobbyist@proton.me&gt;</span></div></body></html>