[Toybox] [NEW TOY] iconv (was [PATCH] roadmap: describe glibc commands)

Rob Landley rob at landley.net
Sun Apr 13 13:53:46 PDT 2014


On 04/13/14 04:37, Felix Janda wrote:
> Isaac Dunham wrote:
> [..]
>> locale and iconv were already triaged. 
> [..]
> 
> iconv is actually something I'm missing on my current musl based system.
> Attached is a simple version using the libc's iconv.

Small, simple command: lemme do the cleanup now so it doesn't build up.

A global outside GLOBALS. I was about to do a "void *" pointing to an
instance declared in main's local variables, but... iconv_t pretty much
has to be a "long", because A) it's typecastable to -1 (so it ain't a
structure), B) it's returned by a function that allocates it. So either
the pointer (which would be the same size as long according to LP64), or
an index into an object table (which has no reason to be _larger_ than a
long).

(Ok, it could technically be a "long long" on a 32 bit platform, or it
could be floating point, but that's _insane_ and to quote Granny
Weatherwax, "I can't be having with this." They do that, they get to
keep the pieces.)

Let's see: in glibc it's a void *. In uClibc it's a void *. In musl it's
a void *. In freebsd it's a void *. Yeah, I'm calling it: we know what
the type actually is.
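
(A minimal sketch of what that assumption buys, not the code that got checked in: if iconv_t really is pointer-sized, it can live in a plain "void *" slot and be cast back on use. The struct and both helpers below are hypothetical names made up for illustration, and the _Static_assert is there so a libc that breaks the assumption fails at build time.)

```c
#include <assert.h>
#include <iconv.h>

// Compile-time check of the assumption: iconv_t fits in a pointer-sized slot.
_Static_assert(sizeof(iconv_t) <= sizeof(void *),
  "iconv_t is wider than a pointer");

// Hypothetical stand-in for a command's shared state block.
struct iconv_slot {
  void *ic;
};

// Store a conversion descriptor (or the (iconv_t)-1 error value) in the slot.
static void stash_cd(struct iconv_slot *s, iconv_t cd)
{
  assert(s);
  s->ic = (void *)cd;
}

// Read it back as the type iconv() expects.
static iconv_t fetch_cd(struct iconv_slot *s)
{
  assert(s);
  return (iconv_t)s->ic;
}
```

The error value survives the round trip too, so a check against (iconv_t)-1 still works after fetch_cd().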

(I could also just add #include <iconv.h> to toybox.h...)

Ok, let's see... do_iconv():

We can't guarantee errno != EINVAL on the way in... but the first time
through we're guaranteed inleft is 0 so it doesn't matter. (Subtle, but
correct.)

out is always toybuf+2048, so we don't need to calculate toybuf+2048
again, or set out a second time... No wait, it's reset by iconv. Huh...

Really the only interesting errno case from iconv is illegal sequence.
The rest just say "ran out of input" or "ran out of output" which is
what you expect from a conversion that's not at the end of the file yet.
(Ok, truncated sequence is a synonym for illegal sequence if we're not
at the end of the buffer, which we can special case as at the _start_ of
the buffer with the memmove logic.)

Hmmm... we should probably pass illegal sequence bytes through. (Pass
'em through.) Except check if output buffer is full before doing that.
(Don't have to check inleft nonzero because if iconv() returns illegal
sequence but used up all the input buffer, that's a libc bug.)
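
A rough sketch of that pass-through idea, under stated assumptions: convert_passthrough() and its argument order are invented here for illustration, it works on one in-memory buffer rather than the toybuf halves, and it simply gives up on anything other than an illegal sequence (including a truncated tail at the very end of the input).

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

// Convert inlen bytes at in into out (capacity outsize), copying any byte
// iconv() rejects as an illegal sequence straight through. Returns the
// number of bytes written, or (size_t)-1 if the charset pair is unknown,
// the output fills up, or the input ends in a truncated sequence.
static size_t convert_passthrough(const char *to, const char *from,
  char *in, size_t inlen, char *out, size_t outsize)
{
  iconv_t cd = iconv_open(to, from);
  size_t outleft = outsize;

  if (cd == (iconv_t)-1) return (size_t)-1;

  while (inlen) {
    // Success means the whole input was consumed.
    if (iconv(cd, &in, &inlen, &out, &outleft) != (size_t)-1) break;

    // Only an illegal sequence gets passed through, and only if the
    // output buffer still has room for the byte.
    if (errno != EILSEQ || !outleft) {
      iconv_close(cd);
      return (size_t)-1;
    }

    // EILSEQ with no input left would be a libc bug.
    assert(inlen);
    *out++ = *in++;
    outleft--;
    inlen--;
  }

  iconv_close(cd);
  return outsize - outleft;
}
```

Feeding it UTF-8 to UTF-8 with a stray 0xff byte in the middle should hand the same bytes back, 0xff included.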

memmove() with length 0 isn't an error, is it? Ok.
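
(For the record: a zero-length memmove() is fine as long as both pointers are valid, which is what lets the carry-over logic run unconditionally. The sketch below is an illustration under assumptions, not the checked-in code: convert_chunked() is a made-up name, it reads from memory instead of a file, and CHUNK is deliberately tiny so a multibyte sequence straddles a refill.)

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <string.h>

#define CHUNK 4  // tiny on purpose, so sequences get split across refills

// Feed src through cd CHUNK bytes at a time. Returns bytes written to out,
// or (size_t)-1 on illegal sequence, full output, or input that ends in
// the middle of a sequence. The caller owns cd.
static size_t convert_chunked(iconv_t cd, const char *src, size_t srclen,
  char *out, size_t outsize)
{
  char buf[CHUNK], *in;
  size_t inleft = 0, outleft = outsize, n;

  for (;;) {
    // Refill behind whatever was carried over from the last pass.
    n = CHUNK - inleft;
    if (n > srclen) n = srclen;
    memcpy(buf + inleft, src, n);
    src += n;
    srclen -= n;
    inleft += n;
    if (!inleft) break;  // nothing carried and nothing new: done

    in = buf;
    if (iconv(cd, &in, &inleft, &out, &outleft) == (size_t)-1) {
      // EINVAL is a truncated sequence at the end of the buffer. That is
      // only recoverable when more input is coming and there is room to
      // append it; everything else is treated as a failure here.
      if (errno != EINVAL || !srclen || inleft == CHUNK) return (size_t)-1;
    }

    // Slide the unconsumed tail to the start. On success inleft is 0,
    // which makes this a harmless zero-length move.
    assert(inleft < CHUNK);
    memmove(buf, in, inleft);
  }

  return outsize - outleft;
}
```

With a two-byte UTF-8 character split across the fourth and fifth input bytes, the first pass stops with EINVAL, the lead byte gets moved to the front, and the next refill completes it.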

Where would I get a test file to convert? I just ran a text file through
it and confirmed it's not making any changes to it, but that doesn't
mean much. :)
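
One way to get test data that a conversion actually has to change, sketched under the assumption that ISO-8859-1 to UTF-8 is the pair being tested: generate the high half of Latin-1 along with the UTF-8 it should turn into. make_latin1_sample() is a hypothetical helper, not something in toybox. It relies only on the fact that code points U+00A0 through U+00FF are single bytes in ISO-8859-1 and two-byte sequences in UTF-8.

```c
#include <assert.h>
#include <stddef.h>

// Fill latin1[] with the bytes 0xA0..0xFF and utf8[] with the same
// characters encoded as UTF-8 (two bytes each). Returns the number of
// Latin-1 bytes produced.
static size_t make_latin1_sample(unsigned char latin1[96],
  unsigned char utf8[192])
{
  size_t n = 0;
  unsigned c;

  for (c = 0xA0; c <= 0xFF; c++, n++) {
    latin1[n] = c;
    utf8[2*n] = 0xC0 | (c >> 6);      // lead byte: 110000xx
    utf8[2*n+1] = 0x80 | (c & 0x3F);  // continuation byte: 10xxxxxx
  }
  assert(n == 96);

  return n;
}
```

Write latin1[] to a file, run it through iconv -f ISO-8859-1 -t UTF-8, and compare the result against utf8[]. An ASCII-only text file comes out identical for most charset pairs, which is why it proves so little.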

(Sorry, rewrote it a bit more than I expected to. Checking in now...)

Rob

P.S. POSIX iconv has several more command line options. -c is easy and
-s is a NOP for us, but I dunno how to do -l.
