[Toybox] [CLEANUP] uuencode.c, pass 1, base64

Thu Apr 11 20:07:01 PDT 2013

Recently I did uuencode cleanup. First let's read through the  
unmodified file:

   http://landley.net/hg/toybox/file/829/toys/pending/uuencode.c

This shows us the following functions:

   static void uuencode_b64_3bytes(char *out, const char *in, int bytes)
   static void uuencode_b64_line(char *out, const char *in, int len)
   static void uuencode_b64(int fd, const char *name)
   static void uuencode_uu_3bytes(char *out, const char *in)
   static void uuencode_uu_line(char *out, const char *in, int len)
   static void uuencode_uu(int fd, const char *name)
   void uuencode_main(void)

The main() function calls either uuencode_uu or uuencode_b64 (depending  
on whether or not it got the -m option). The encode function reads  
chunks of data and calls the corresponding encode_line() function to  
output a line of encoded text in the right format, and the line  
function calls the corresponding encode_3bytes() to turn 3 bytes of  
8-bit input into 4 characters of appropriately encoded 6-bit output.

The first round of cleanup was commit 830:

   http://landley.net/hg/toybox/rev/830

The first hunk tightens up the help text. I have a fairly standard  
format for help text: usage line, text description of what it does,  
options one per line with a tab between the option and the description.  
Someday I hope to write a help text parser that can collate subfeatures
(like "cp" has), and a regular format makes the text easier to parse so
I can combine sections.

Next is the uuencode_b64_3bytes() function. This takes up to 3 bytes of
input and outputs 4 bytes of base64. (Given 2 bytes of input, it
outputs 3 bytes and an equals. Given 1 byte, it outputs 2 bytes and two
equals.) This is completely loop unrolled, which used to be an
optimization strategy back before processors started running a dozen
times the speed of their own memory. Since then, tight loops that fit
in a single cache line trump quick-to-execute code that spans multiple
L1 cache lines. (Ballpark cache line size is in the 32-128 byte range.
According to /proc/cpuinfo, "clflush size" on my netbook is 64 bytes.
That's the granularity with which most memory transactions actually
take place in this processor. If you'd like to learn this stuff go to
http://kernel.org/doc and look for the links to 'ars technica ram
guide'.)

This b64_3bytes() function took an output buffer as one of its
arguments, but the output always goes to stdout, so I just wrote to
stdout in the function itself. (I'm trusting the FILE * to have an
internal buffer to collate output if it matters, but uuencode isn't
hugely performance critical anyway. Actually I think xputc() might have
a fflush in it, but as I said, it's not performance critical.)

The function had a loop to read/shift the data into an integer, and  
then 4 lines to store each byte into the output buffer, and two tests  
to overwrite the last two bytes if the length is short.

I replaced this with one for loop that iterated 4 times (for the 4  
output bytes). Each time through, if there's still input data it's  
read/shifted into the input integer, and then it writes either a byte  
of data or an = depending on how many bytes of input we had and where  
we are in the loop. The loop iterates 4 times because we always produce  
4 bytes of output, even for short input (which only happens at the end  
of a file).
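
As a rough illustration of that single loop (a sketch, not the code
from the commit): the name b64_3bytes_sketch, the static alphabet
string, and the out[] parameter are all assumptions made so the result
is easy to inspect. The version described above writes each character
to stdout instead and looks its table up in toybuf.

```c
/* Standard base64 alphabet, stored statically here only to keep the
 * sketch self-contained. */
static const char b64_alphabet[] =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Hypothetical helper: encode 1 to 3 input bytes as exactly 4 output
 * characters. Writes to out[] rather than stdout so it can be checked. */
static void b64_3bytes_sketch(char *out, const char *in, int bytes)
{
  unsigned int x = 0;
  int i;

  for (i = 0; i < 4; i++) {
    /* While input remains, place the next byte in its slot of the
     * 24-bit group (byte 0 on top, byte 2 at the bottom). */
    if (i < bytes) x |= (unsigned int)(unsigned char)in[i] << (8 * (2 - i));

    /* Output character i only depends on input bytes before position i,
     * so once i is past the input length we emit padding instead. */
    out[i] = (i > bytes) ? '=' : b64_alphabet[(x >> (6 * (3 - i))) & 0x3f];
  }
}
```

With 3 bytes of input no padding is written, with 2 bytes the last
character becomes '=', and with 1 byte the last two do.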

There was a static table[] of value to character mappings, kept in a  
constant string. I instead had uuencode_main() generate that in toybuf.  
(I lean towards generating things instead of storing them statically so
you can see where they came from. Sometimes I can't: for example,
toys/lsb/md5sum has the static md5table but starts with a comment that
says if I were willing to pull in floating point and libm I'd calculate
it via for(i=0; i<64; i++) md5table[i] = abs(sin(i+1))*(1<<32);
Similarly, crc_init generates the crc32 table, for both endiannesses. I  
should probably have a comment that the magic constant for  
little_endian is the same as the big_endian one just bit-reversed.)
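
One way such a table could be generated at runtime, as a sketch only:
make_b64_table is a hypothetical name, and the caller-supplied 64-byte
buffer stands in for toybuf. The loop in the actual commit may be
written differently.

```c
/* Hypothetical helper: fill table[0..63] with the base64 alphabet
 * (A-Z, a-z, 0-9, '+', '/') instead of storing it as a constant. */
static void make_b64_table(char *table)
{
  int i;

  for (i = 0; i < 26; i++) {
    table[i] = 'A' + i;        /* values 0-25 */
    table[26 + i] = 'a' + i;   /* values 26-51 */
  }
  for (i = 0; i < 10; i++) table[52 + i] = '0' + i;  /* values 52-61 */
  table[62] = '+';
  table[63] = '/';
}
```

This assumes a character set where the letters and digits are
contiguous, which holds for ASCII.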

Next function: uuencode_b64_line() is a wrapper function that produces  
a line of output, by calling uuencode_b64_3bytes an appropriate number  
of times. Just minor cleanups here for now: no need to pass along an  
output buffer the b64_3bytes() function no longer uses, and no need to  
print the contents when b64_3bytes() does it itself.

So now we get to uuencode_b64(), which is using toybuf. Let's stop and  
trace that now-removed output buffer. Only 4 bytes of out[] were ever  
filled in by b64_3bytes(), only 4 bytes were printed by b64_line(), but  
in uuencode_b64() outbuf was given 64 bytes. (This is why you rework  
things until they're right next to each other and you can spot this  
sort of thing. By spreading it across 3 functions, the mismatch wasn't  
easily noticeable.)

So that can go, but we still have a loop reading lines of data, and
thus we still need a buffer to read a line of data into. (Reading a
line is faster than reading 3 bytes at a time, and there's
per-output-line processing, namely writing a newline, so we care when
lines end anyway. So line at a time is a logical input block size.)

The old line size was hardwired at 48 bytes of input data. (You can't
tell from where inbuf is declared, but the read is 48 bytes.) The
uuencode spec actually says lines can be longer than that, specifically:

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/uuencode.html

> The output stream (encoded bytes) shall be represented in lines of no
> more than 76 characters each.

This is a tiny enough buffer that declaring it on the stack is trivial
(saving toybuf for other uses), and it doesn't persist past this
function call so there's no advantage to it being global. I encoded
what the standard says into my buffer size declaration as char
buf[(76/4)*3]; (The compiler will resolve the constant math at compile
time, and meanwhile it explains where it came from: enough 4-byte
chunks of output to total 76, which is 19 of them, and then 3 bytes of
input read in each chunk, for 57 bytes. That's our read buffer size.)
I can then sizeof() that in the actual read, and because it's an array
of char I don't have to say *sizeof(char) because I know that's 1.
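
To spell out the arithmetic, here is a tiny sketch. The function name
b64_read_size is made up for illustration; only the buf declaration
comes from the description above.

```c
#include <stddef.h>

/* Hypothetical helper: report how many input bytes one read asks for. */
static size_t b64_read_size(void)
{
  /* 76 output chars / 4 chars per chunk = 19 chunks,
   * 19 chunks * 3 input bytes per chunk = 57 bytes. */
  char buf[(76/4)*3];

  /* sizeof on a char array is its length in bytes, so this is the
   * value you would hand to read() as the byte count. */
  return sizeof(buf);
}
```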

Note that the if (len > 0) dropped out here, because b64_line() has  
while (len > 0) so that test is already performed in the function we  
call.
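
A sketch of why the outer test is redundant, under stated assumptions:
b64_line_sketch and b64_chunk_sketch are hypothetical names, and both
write into a caller-supplied buffer (where the real functions print to
stdout) so the behavior can be checked. With while (len > 0) inside the
line function, a zero-length call simply produces nothing.

```c
static const char b64_chars[] =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Hypothetical helper: encode 1 to 3 bytes into 4 chars, returning a
 * pointer just past what was written. */
static char *b64_chunk_sketch(char *out, const char *in, int bytes)
{
  unsigned int x = 0;
  int i;

  for (i = 0; i < 4; i++) {
    if (i < bytes) x |= (unsigned int)(unsigned char)in[i] << (8 * (2 - i));
    *out++ = (i > bytes) ? '=' : b64_chars[(x >> (6 * (3 - i))) & 0x3f];
  }
  return out;
}

/* Hypothetical line encoder: the loop condition already handles
 * len <= 0, so callers need no separate if (len > 0) guard. */
static char *b64_line_sketch(char *out, const char *in, int len)
{
  while (len > 0) {
    int n = len < 3 ? len : 3;   /* the last chunk may be short */

    out = b64_chunk_sketch(out, in, n);
    in += n;
    len -= n;
  }
  return out;
}
```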

So that's base64 encoding, and probably enough for one message. :)

Rob