[Toybox] Fun with sed.

Rob Landley rob at landley.net
Tue Nov 4 19:01:03 PST 2014


Checked in the next round of sed updates. Sorry that took so long: the number
of different ways realloc() can cause subtle memory corruption is actually
kind of impressive. There were two particularly annoying ones here.

Backstory: the sed script parsing creates a doubly linked list of struct step *
which I've implemented as a single block of memory per sed command, even
though it contains various optional allocations: multiple regmatch_t
instances that may or may not be there, arbitrary length text strings, and
so on.

So what I did was keep not just a pointer to the start of the structure
(mapped over toybuf), but also a second pointer to the _end_ of the structure.
And when I was done I could xmalloc() a new instance using end-start bytes
(pointer subtraction produces a long offset in sizeof(*pointer) units, so
typecast them both to char * to get bytes), memcpy into it, add to the
list, rinse, repeat.
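
A minimal sketch of that start/end pointer trick, not the actual sed.c code.
The struct layout, the scratch buffer (standing in for toybuf), and the
make_step() name are all assumptions made up for illustration, and plain
malloc() stands in for xmalloc():

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical fixed header; optional data is packed right after it.
struct step {
  struct step *next, *prev;
  char cmd;
};

static long scratch[512];   // plays the role of toybuf (long for alignment)

// Build one command in the scratch buffer, then copy only the bytes used.
struct step *make_step(char cmd, const char *text, size_t *size)
{
  struct step *start = (struct step *)scratch;
  char *end = (char *)(start + 1);   // first byte past the fixed header
  size_t len = strlen(text) + 1;
  struct step *copy;

  assert(sizeof(*start) + len <= sizeof(scratch));
  memset(start, 0, sizeof(*start));
  start->cmd = cmd;
  memcpy(end, text, len);            // append the optional string
  end += len;

  // Cast both to char * so the subtraction counts bytes, not structs.
  *size = end - (char *)start;
  copy = malloc(*size);
  if (!copy) exit(1);                // xmalloc() would exit on failure too
  memcpy(copy, start, *size);

  return copy;
}
```

The string lives at (char *)(copy + 1), so one free() releases the header
and the text together.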

Problem: the strings for things like s//target/ are arbitrary length, I
don't want to limit the size.

Solution: do the copy early, and then realloc() as I append extra data
(which not all commands need).

Problem 1: remember that end pointer? When the start * moves due to realloc,
I need to recalculate the end pointer. (Determine its distance from the old
start, then add that distance to the new start.)
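
Roughly what that recalculation looks like, as a sketch. The append_bytes()
helper and its signature are hypothetical, not something in toybox. The part
that matters is saving the offset before calling realloc(), because the old
start pointer is no longer valid afterwards:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Append len bytes at *end, growing the block at *start.
// Both pointers get updated, since the block may move.
void append_bytes(char **start, char **end, const void *data, size_t len)
{
  size_t used = *end - *start;       // distance from the OLD start
  char *moved = realloc(*start, used + len);

  if (!moved) exit(1);
  memcpy(moved + used, data, len);
  *start = moved;
  *end = moved + used + len;         // same distance from the NEW start
}
```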

Problem 2: when I did the copy early, I added the entry to the list. But
if you realloc an entry already in the list, the prev and next pointers
in the existing list entries point to the OLD entry.

Solution: defer adding the entry to the list, and if I need to do a multiline
continuation, remove the old entry from the list, modify it locally, and then
add it back to the list.
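
The remove/modify/re-add dance can be sketched like this. It assumes a
circular doubly linked list, and the node type and all three function names
are invented for the example rather than taken from toybox's list code:

```c
#include <assert.h>
#include <stdlib.h>

struct node {
  struct node *next, *prev;
  size_t len;                 // bytes of payload after the header
};

// Append n to a circular doubly linked list; *head is the first entry.
void list_add(struct node **head, struct node *n)
{
  if (!*head) {
    n->next = n->prev = n;
    *head = n;
  } else {
    n->next = *head;
    n->prev = (*head)->prev;
    n->prev->next = n;
    (*head)->prev = n;
  }
}

// Unlink and return the last entry, or NULL if the list is empty.
struct node *list_pop_last(struct node **head)
{
  struct node *n;

  if (!*head) return NULL;
  n = (*head)->prev;
  if (n == *head) *head = NULL;
  else {
    n->prev->next = *head;
    (*head)->prev = n->prev;
  }
  n->next = n->prev = NULL;

  return n;
}

// Grow the last entry. Nothing in the list points at it while it may move.
struct node *grow_last(struct node **head, size_t extra)
{
  struct node *n = list_pop_last(head);

  if (!n) return NULL;
  n = realloc(n, sizeof(*n) + n->len + extra);
  if (!n) exit(1);
  n->len += extra;
  list_add(head, n);          // relink using the (possibly new) address

  return n;
}
```

Calling realloc() on the entry while it was still linked would leave its
neighbors' next and prev pointing at freed memory whenever the block moved.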

All of the above sounds simple. It took me FIVE DAYS to diagnose the segfaults
(which happened _much_ later in the program, on things like fclose() that
had nothing to do with the problem), debug it, and fix it.

Now that I've got it WORKING I can probably simplify the design some more.
But in the meantime, I've checked in what I've got. Might try to finish
the feature set first and then go back and optimize the option parsing...

Rob
