[Toybox] locale support question

Ray Gardner raygard at gmail.com
Sat Nov 30 09:28:14 PST 2024


Toybox main.c has this code to support UTF-8:

    // Try user's locale, but if that isn't UTF-8 merge in a UTF-8 locale's
    // character type data. (Fall back to en_US for MacOS.)
    setlocale(LC_CTYPE, "");
    if (strcmp("UTF-8", nl_langinfo(CODESET)))
      uselocale(newlocale(LC_CTYPE_MASK, "C.UTF-8", 0) ? :
        newlocale(LC_CTYPE_MASK, "en_US.UTF-8", 0));

For a standalone version of awk, I intend to use this instead:

  char *p = setlocale(LC_CTYPE, "");
  if (!p || !strstr(p, "UTF-8")) p = setlocale(LC_CTYPE, "C.UTF-8");
  if (!p || !strstr(p, "UTF-8")) p = setlocale(LC_CTYPE, "en_US.UTF-8");

The rationale is that this compiles on older systems that lack up-to-date
locale support, such as newlocale() and uselocale().

What will be the effective difference between these? I am not familiar
with the details of locale support in C and POSIX.
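
One difference that can be checked directly: main.c tests the codeset
reported by nl_langinfo(CODESET), while my version tests the locale name
returned by setlocale(). That name can be spelled differently (for example
"en_US.utf8"), in which case strstr(p, "UTF-8") would miss it. Below is a
minimal sketch of a setlocale-only variant that keeps the codeset test. The
helper names is_utf8_codeset() and force_utf8_ctype() are hypothetical and
not part of Toybox, and it assumes nl_langinfo() is available on the target
system, which may not hold for the oldest ones.

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper (not from Toybox): true if the current LC_CTYPE
// locale reports UTF-8 as its codeset.
static int is_utf8_codeset(void)
{
  return !strcmp("UTF-8", nl_langinfo(CODESET));
}

// Hypothetical helper (not from Toybox): try the user's locale first, then
// the same fallbacks main.c uses, stopping at the first one whose codeset
// is UTF-8. Returns 1 if a UTF-8 LC_CTYPE is in effect, 0 otherwise.
// Uses only setlocale(), so it changes the process-wide locale.
int force_utf8_ctype(void)
{
  static const char *const candidates[] = {"", "C.UTF-8", "en_US.UTF-8"};
  size_t i;

  for (i = 0; i < sizeof(candidates)/sizeof(*candidates); i++)
    if (setlocale(LC_CTYPE, candidates[i]) && is_utf8_codeset()) return 1;

  return 0;
}
```

Calling force_utf8_ctype() at the start of main() and printing
nl_langinfo(CODESET) afterwards shows which codeset a given system ends up
with. If none of the candidates yields UTF-8, the locale stays at whichever
setlocale() call last succeeded.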

